neuralmagic/deepsparse v1.4.0 on GitHub

New Features:

OpenPifPaf deployment pipelines support (#788)
VITPose example deployment pipeline (#794)
DeepSparse Server logging with support for metrics, timings, and input/output values through Prometheus (#821, #791)

Changes:

Inference speed improved by up to 20% on dense FP32 BERT models.
Inference speed improved by up to 50% on quantized EfficientNetV1 and by up to 10% on quantized EfficientNetV2.
YOLOv5 integration upgraded to the latest upstream.

Resolved Issues:

DeepSparse no longer improperly detects each core as belonging to its own socket on some virtual machines, including those on OVHcloud.
Running networks that contain a Quantized Depthwise Convolution with a nontrivial w_zero_point parameter no longer produces an assertion failure. Trivial in this case means that the zero point is equal to 128 for uint8 data, or 0 for int8 data.
An assertion failure in executable_buffer.cpp no longer occurs (see #899).
In quantized transformer models, a rare assertion failure no longer occurs.
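
The "trivial" zero-point condition mentioned in the Quantized Depthwise Convolution fix above can be illustrated with a minimal sketch. The helper names (`is_trivial_zero_point`, `dequantize`) are hypothetical and are not part of the DeepSparse API. The sketch assumes the standard affine quantization formula, real = scale * (q - zero_point), under which a uint8 zero point of 128 and an int8 zero point of 0 describe the same symmetric weight range.

```python
# Hypothetical helpers for illustration only; not part of the DeepSparse API.

# Zero-point values the release note calls "trivial", keyed by weight dtype.
TRIVIAL_ZERO_POINTS = {"uint8": 128, "int8": 0}


def is_trivial_zero_point(zero_point: int, dtype: str) -> bool:
    """Return True if the weight zero point is trivial for the given dtype."""
    if dtype not in TRIVIAL_ZERO_POINTS:
        raise ValueError(f"unsupported dtype: {dtype!r}")
    return zero_point == TRIVIAL_ZERO_POINTS[dtype]


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Standard affine dequantization (an assumption, not taken from the notes)."""
    return scale * (q - zero_point)


# Trivial cases named in the release note.
print(is_trivial_zero_point(128, "uint8"))  # True
print(is_trivial_zero_point(0, "int8"))     # True

# A nontrivial case: the kind of w_zero_point that previously hit the assertion.
print(is_trivial_zero_point(120, "uint8"))  # False

# uint8 with zero point 128 and int8 with zero point 0 encode the same value
# when the stored integers differ by 128.
print(dequantize(130, 0.5, 128) == dequantize(2, 0.5, 0))  # True
```

A check like this could be used to find which depthwise convolution weights in a quantized model carry a nontrivial zero point, which are the cases this fix affects.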