ultralytics 8.4.72 on Python PyPI

🌟 Summary

v8.4.72 is a small but important stability release that mainly fixes a TensorRT INT8 export crash on some RTX GPUs 🚀, while also improving export environment reliability and cleaning up docs/CI.

📊 Key Changes

Fixed TensorRT INT8 export crashes on certain RTX cards 🔧
- The most important change in this release is from PR #24876 by @glenn-jocher.
- Exporting models with format="engine", int8=True could fail on some RTX GPUs, including cases where ONNX Runtime exposed both TensorRT execution providers (the standard TensorRT EP and the NvTensorRTRTX EP).
- The INT8 calibration step now uses CUDA with CPU fallback instead of triggering conflicting TensorRT provider combinations.
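
The provider choice described above can be sketched as a small selection function. This is an illustrative assumption about the idea, not the code from PR #24876: the function name `select_calibration_providers` is hypothetical, and the provider strings are the identifiers ONNX Runtime is understood to use.

```python
# Provider names are ONNX Runtime identifiers; the selection logic is an
# illustrative assumption, not the implementation from PR #24876.
PREFERRED = ("CUDAExecutionProvider", "CPUExecutionProvider")


def select_calibration_providers(available):
    """Return CUDA then CPU, keeping only the providers that are available."""
    chosen = [p for p in PREFERRED if p in available]
    # CPU is the last-resort fallback even if nothing matched.
    return chosen or ["CPUExecutionProvider"]


# A hypothetical RTX machine exposing both TensorRT providers plus CUDA and CPU.
rtx_available = [
    "NvTensorRTRTXExecutionProvider",
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]
print(select_calibration_providers(rtx_available))
```

In a real session the `available` list would typically come from `onnxruntime.get_available_providers()`, and the result would be passed as the `providers` argument of `onnxruntime.InferenceSession`.
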
Improved TensorRT/ONNX export image compatibility for CUDA 12 🐳
- PR #24877 pins onnxruntime-gpu to below 1.27.0 in the export Docker image.
- This avoids breakage caused by newer ONNX Runtime GPU builds that now expect CUDA 13, while the export image still uses CUDA 12.8.
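
For environments built outside the official export image, a quick check against the same upper bound may help. The sketch below is a simplified assumption: the helper names are hypothetical, and it compares only the leading numeric part of the version rather than applying full PEP 440 rules.

```python
import re
from importlib import metadata

PIN_UPPER_BOUND = (1, 27, 0)  # from the release notes: onnxruntime-gpu<1.27.0


def release_tuple(version):
    """Parse the leading numeric 'X.Y.Z' part of a version string."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    # Missing components (e.g. '1.26') are treated as zero.
    return tuple(int(group or 0) for group in match.groups())


def satisfies_pin(version):
    """True if the version is strictly below the pinned upper bound."""
    return release_tuple(version) < PIN_UPPER_BOUND


def installed_onnxruntime_gpu():
    """Return the installed onnxruntime-gpu version, or None if absent."""
    try:
        return metadata.version("onnxruntime-gpu")
    except metadata.PackageNotFoundError:
        return None


installed = installed_onnxruntime_gpu()
if installed is None:
    print("onnxruntime-gpu is not installed")
else:
    print(installed, "ok" if satisfies_pin(installed) else "violates pin")
```
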
GitHub Actions checkout updated from v6 to v7 ⚙️
- PR #24873 refreshes CI workflows to a newer actions/checkout version.
- This is a maintenance update for the project’s automation pipeline.
Documentation updates and cleanup 📚
- PR #24863 refreshes the Queue Management guide with a newer YOLO26 video and updated wording.
- PR #24875 simplifies the Triton C++ example README by removing extra contributor footer content.
- PR #24851 fixes duplicate title tags on the Spanish account settings and YOLO configuration pages, giving each page a distinct title and helping site clarity and SEO.

🎯 Purpose & Impact

More reliable INT8 TensorRT exports on RTX hardware 💡
- Users exporting optimized TensorRT engines should see fewer failures on affected NVIDIA RTX systems.
- This is especially helpful for developers deploying compact, fast INT8 inference pipelines.
Better support for modern RTX environments 🖥️
- The fix avoids a low-level ONNX Runtime provider conflict that was causing exports to crash before the model was even fully built.
- In simple terms: INT8 export now works more reliably on hardware that previously broke unexpectedly.
Safer Docker-based export workflows 🛡️
- Pinning onnxruntime-gpu prevents version mismatches that could break ONNX inference and export tests inside CUDA 12 environments.
- This should make CI and container-based deployment setups more predictable.
No major model architecture changes 📌
- This release does not introduce a new model or training feature.
- The focus is on export stability, environment compatibility, and documentation polish.
Practical benefit for users ✅
- If you use Ultralytics for training and then export to TensorRT INT8 for production, this release is worth adopting.
- If you mainly use standard training or FP16 export, the impact is smaller but still positive thanks to general reliability improvements.
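
For readers who want to try the affected export path, a minimal sketch follows. It assumes an NVIDIA GPU with TensorRT installed; the wrapper name `export_int8_engine` is hypothetical, and the weights and dataset names in the comment are example values rather than anything prescribed by this release.

```python
def export_int8_engine(weights, calibration_data):
    """Export `weights` to a TensorRT INT8 engine and return the output path."""
    # Imported lazily so this file can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    # format="engine" and int8=True are the settings named in the release notes;
    # `data` points at the dataset YAML used for INT8 calibration.
    return model.export(format="engine", int8=True, data=calibration_data)


# Example call (assumed names, requires a GPU with TensorRT):
# export_int8_engine("yolo26n.pt", "coco8.yaml")
```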

What's Changed

Add https://youtu.be/TEVPiGCxB0o to docs by @RizwanMunawar in #24863
docs: small cleanup triton contributions by @onuralpszr in #24875
Pin onnxruntime-gpu<1.27.0 for CUDA 12 export image by @glenn-jocher in #24877
Bump actions/checkout from v6 to v7 in /.github/workflows by @UltralyticsAssistant in #24873
Fix duplicate title tags on Spanish account settings and YOLO config pages by @miles-deans-ultralytics in #24851
Fix TensorRT INT8 export crash on RTX cards exposing the NvTensorRTRTX EP by @glenn-jocher in #24876

Full Changelog: v8.4.71...v8.4.72
