🌟 Summary
Ultralytics v8.4.74 focuses on more reliable model export and quantization 🔧, in particular fixing INT8 export crashes on affected GPU setups and preventing intermittent OpenVINO export failures on NMS-enabled models.
📊 Key Changes
- 🚨 INT8 calibration now always runs on CPU during ModelOpt export
  - In the most important change, PR #24884 by @glenn-jocher makes INT8 calibration for ONNX export run unconditionally on the CPU execution provider.
  - This replaces the earlier GPU/RTX detection logic, which was found to be unreliable in real-world use.
  - The previous approach could still trigger core dumps or uncatchable crashes on some systems due to TensorRT execution provider behavior and cuDNN ABI mismatches.
- ✅ Safer INT8 export behavior across hardware environments
  - The update no longer uses the CUDA or TensorRT execution providers during the calibration step.
  - Since calibration scales are execution-provider independent, the final INT8 engine remains effectively the same.
  - The tradeoff is simple: slightly slower one-time calibration, but much better export stability.
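To illustrate why calibration scales do not depend on where calibration runs, here is a minimal sketch of symmetric max-abs INT8 calibration in plain Python. It is an assumption for illustration only: the function names are hypothetical, and the calibrator actually used during ModelOpt export may use a different method. The point is that the scale is derived purely from the observed activation values, so any execution provider that produces the same activations yields the same scale.

```python
def symmetric_int8_scale(values):
    """Hypothetical helper: per-tensor scale mapping max |value| to 127."""
    amax = max(abs(v) for v in values)
    # Fall back to 1.0 for an all-zero tensor to avoid division by zero.
    return amax / 127.0 if amax > 0 else 1.0


def quantize(values, scale):
    """Round to the nearest INT8 level and clamp to [-128, 127]."""
    return [max(-128, min(127, round(v / scale))) for v in values]


# Toy "activations" standing in for what a calibration pass would observe.
activations = [-2.54, 0.0, 0.5, 1.0]
scale = symmetric_int8_scale(activations)
quantized = quantize(activations, scale)
```

In practice, different execution providers can produce slightly different floating-point activations, which is consistent with the notes describing the final engine as "effectively" the same rather than bit-identical.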
- 🛠️ Fixed intermittent OpenVINO export failures for NMS models
  - PR #24883 by @glenn-jocher fixes a non-deterministic OpenVINO export issue affecting models exported with `nms=True`.
  - Previously, OpenVINO could fail during conversion with trace-check errors like “Graphs differed across invocations!”
  - The exporter now pre-traces the model before passing it to OpenVINO, avoiding OpenVINO’s internal retracing behavior that caused random failures.
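The pre-tracing idea can be sketched with `torch.jit.trace`: the model is traced once up front, and the resulting traced module, rather than the eager model, is what a downstream converter would receive. This is a minimal sketch under stated assumptions, not the exporter's actual code: `TinyHead` and `pretrace` are hypothetical names, the toy module stands in for a real detection model, and the OpenVINO handoff is only mentioned in a comment.

```python
import torch
import torch.nn as nn


class TinyHead(nn.Module):
    """Hypothetical stand-in for a model being exported."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x).flatten(1)


def pretrace(model, example):
    """Trace the model once so later stages reuse a single fixed graph."""
    model.eval()
    with torch.no_grad():
        return torch.jit.trace(model, example)


torch.manual_seed(0)
model = TinyHead()
example = torch.zeros(1, 3, 8, 8)
traced = pretrace(model, example)

# Assumption: a converter that accepts TorchScript modules would be given
# `traced` here instead of `model`, so it does not need to trace again.
```

The design choice is that tracing happens exactly once under the exporter's control, so a converter that would otherwise retrace the eager model internally has no opportunity to hit a trace-check mismatch.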
🎯 Purpose & Impact
- More dependable INT8 exports on RTX and mixed-library environments 💪
  - Users exporting INT8 models should see fewer crashes and failed exports, especially on systems where CUDA, TensorRT, and cuDNN versions interact badly.
  - This is particularly valuable for production pipelines and automated export workflows.
- Better stability is prioritized over calibration speed ⚖️
  - Calibration may take a bit longer because it now runs on CPU only.
  - This slowdown affects only the export/calibration step, not normal model inference.
  - For most users, that is a worthwhile tradeoff for a much more reliable export process.
- OpenVINO exports become more consistent 📦
  - Users working with OpenVINO, especially with NMS-enabled models, should experience fewer random export failures.
  - This helps both local users and teams using the Ultralytics Platform for export and deployment workflows.
- Overall release theme: reliability and smoother deployment 🚀
  - This release does not introduce a major new model, but it meaningfully improves the path from training to deployment.
  - If you export models to INT8 or OpenVINO, `v8.4.74` should feel safer, more predictable, and easier to trust.
What's Changed
- Fix non-deterministic OpenVINO export failure on NMS models by @glenn-jocher in #24883
- Calibrate INT8 on CPU unconditionally (cuDNN-ABI-safe) by @glenn-jocher in #24884
Full Changelog: v8.4.73...v8.4.74