ultralytics 8.4.63 on Python PyPI

🌟 Summary

🚀 Ultralytics v8.4.63 mainly adds TensorRT 11 export support with FP16 and INT8 quantization via NVIDIA ModelOpt, while also expanding built-in multi-object tracking with several new tracker options and improving reliability, performance, and docs.

📊 Key Changes

🧠 Major export upgrade: TensorRT 11 support
- The standout update in this release is PR #24735 by @onuralpszr.
- Ultralytics now supports exporting models to NVIDIA TensorRT 11. Export had previously failed on this version because of API changes in TensorRT 11.
- For FP16 and INT8 export on TensorRT 11, Ultralytics now uses NVIDIA ModelOpt instead of the older TensorRT builder flags and calibrator interface.
- This keeps export working across:
  - TensorRT 7–10 with the legacy path
  - TensorRT 11 with the new strongly-typed ModelOpt path
- Tested successfully for:
  - fp32
  - fp16
  - int8
  - including INT8 with dynamic shapes on TensorRT 11 ✅
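As a rough illustration of the three tested precisions, the sketch below maps each one to keyword arguments for `YOLO.export()`. The helper names `build_export_kwargs` and `export_engine` are hypothetical, and the weights file and dataset YAML in the usage note are placeholders; whether the legacy or ModelOpt path runs is decided by the installed TensorRT version, not by these arguments.

```python
from typing import Any, Dict, Optional


def build_export_kwargs(
    precision: str, dynamic: bool = False, data: Optional[str] = None
) -> Dict[str, Any]:
    """Map a precision name to YOLO.export() keyword arguments (hypothetical helper)."""
    if precision not in {"fp32", "fp16", "int8"}:
        raise ValueError(f"unsupported precision: {precision!r}")
    kwargs: Dict[str, Any] = {"format": "engine", "dynamic": dynamic}
    if precision == "fp16":
        kwargs["half"] = True
    elif precision == "int8":
        kwargs["int8"] = True
        if data is not None:
            kwargs["data"] = data  # dataset YAML that supplies calibration images
    return kwargs


def export_engine(weights: str, precision: str, **options: Any) -> str:
    """Export a model to a TensorRT engine (needs ultralytics, TensorRT and a GPU)."""
    # Imported lazily so build_export_kwargs() works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.export(**build_export_kwargs(precision, **options))
```

For example, `export_engine("yolo11n.pt", "int8", dynamic=True, data="coco8.yaml")` would request an INT8 engine with dynamic shapes, assuming those placeholder files are available.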
⚡ Better quantization workflow for modern NVIDIA deployment
- FP16 is now applied by baking mixed precision into the ONNX graph before engine build.
- INT8 is now applied through explicit quantization in the ONNX graph with calibration data.
- This is especially important because TensorRT 11 removed the old methods many exporters depended on.
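To make "explicit quantization with calibration data" more concrete, here is a conceptual sketch of symmetric per-tensor INT8 quantize/dequantize arithmetic. It is not the ModelOpt implementation; the function names are made up for illustration, and the max-absolute-value calibration rule is an assumption chosen for simplicity.

```python
from typing import Iterable


def calibrate_scale(calibration_values: Iterable[float]) -> float:
    """Derive a per-tensor scale from the largest magnitude seen during calibration."""
    amax = max(abs(v) for v in calibration_values)
    return amax / 127.0 if amax > 0 else 1.0


def quantize(x: float, scale: float) -> int:
    """Quantize node: scale, round, and clamp to the signed 8-bit range."""
    q = round(x / scale)
    return max(-128, min(127, q))


def dequantize(q: int, scale: float) -> float:
    """Dequantize node: map the integer back to an approximate float."""
    return q * scale


if __name__ == "__main__":
    scale = calibrate_scale([-2.54, 1.0, 0.5])
    for value in (1.0, 2.54, 10.0):
        q = quantize(value, scale)
        print(value, q, dequantize(q, scale))
```

Values outside the calibrated range saturate at the clamp limits, which is why representative calibration data matters for INT8 accuracy.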
🎯 Big tracking expansion: 4 new built-in trackers
- PR #24371 added four new multi-object trackers alongside BoT-SORT and ByteTrack:
  - OC-SORT
  - Deep OC-SORT
  - FastTracker
  - TrackTrack
- These are now documented and wired into the tracking system with YAML configs like:
  - ocsort.yaml
  - deepocsort.yaml
  - fasttrack.yaml
  - tracktrack.yaml
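A minimal sketch of selecting one of the new trackers through the existing `tracker` argument of `model.track()`. The name-to-file pairing below assumes the two lists above are in matching order, and the helper names, weights file, and video path are placeholders.

```python
from typing import Any, Dict

# Assumed pairing of tracker names to the YAML configs listed in these notes.
TRACKER_CONFIGS: Dict[str, str] = {
    "OC-SORT": "ocsort.yaml",
    "Deep OC-SORT": "deepocsort.yaml",
    "FastTracker": "fasttrack.yaml",
    "TrackTrack": "tracktrack.yaml",
}


def tracker_config(name: str) -> str:
    """Return the YAML config filename for a tracker name (hypothetical helper)."""
    try:
        return TRACKER_CONFIGS[name]
    except KeyError:
        known = ", ".join(sorted(TRACKER_CONFIGS))
        raise ValueError(f"unknown tracker {name!r}; expected one of: {known}") from None


def run_tracking(weights: str, source: str, tracker_name: str) -> Any:
    """Run tracking on a video source (needs ultralytics installed)."""
    # Imported lazily so tracker_config() works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.track(source=source, tracker=tracker_config(tracker_name))
```

A call such as `run_tracking("yolo11n.pt", "video.mp4", "OC-SORT")` would then track objects with the OC-SORT configuration.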
🏃 Tracking docs and selection guidance improved
- Tracking docs were expanded to explain the strengths of each tracker and how to choose between them.
- This makes the tracking feature much easier to use for both beginners and advanced users.
📹 Video stream loading is safer
- PR #24749 improves cleanup when stream initialization fails.
- If one stream fails while others already opened, Ultralytics now properly closes those partially-opened resources instead of leaking threads or capture handles.
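The cleanup behavior described above follows a common pattern: release everything already opened before re-raising the failure. The sketch below shows that pattern with a stand-in `FakeCapture` class; it is not the actual `LoadStreams` code, and all names in it are hypothetical.

```python
from typing import Callable, List, Sequence


class FakeCapture:
    """Stand-in for a video capture handle; URLs starting with 'bad' fail to open."""

    def __init__(self, url: str) -> None:
        if url.startswith("bad"):
            raise ConnectionError(f"failed to open {url}")
        self.url = url
        self.released = False

    def release(self) -> None:
        self.released = True


def open_all(
    urls: Sequence[str], opener: Callable[[str], FakeCapture] = FakeCapture
) -> List[FakeCapture]:
    """Open every stream, or release the ones already opened and re-raise."""
    opened: List[FakeCapture] = []
    try:
        for url in urls:
            opened.append(opener(url))
    except Exception:
        # One stream failed: close the partially-opened set so nothing leaks.
        for cap in opened:
            cap.release()
        raise
    return opened
```

The caller still sees the original exception, but no handles from the earlier, successful opens are left behind.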
⚡ AI Gym pose workflow is faster
- PR #24744 reduces repeated GPU-to-CPU syncs in the workout monitoring loop by transferring keypoints to CPU in one go instead of piece by piece.
- Same behavior, better efficiency.
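The saving comes from replacing many small device-to-host copies with a single one. The simulation below uses a hypothetical `CountingDeviceArray` class that only counts transfers; it stands in for a GPU tensor and is not the AI Gym code.

```python
from typing import List, Sequence, Tuple


class CountingDeviceArray:
    """Stand-in for a GPU tensor that counts device-to-host copies."""

    def __init__(self, rows: Sequence[Sequence[float]]) -> None:
        self._rows = [list(r) for r in rows]
        self.transfers = 0

    def item(self, i: int, j: int) -> float:
        self.transfers += 1  # each scalar read forces its own copy
        return self._rows[i][j]

    def cpu(self) -> List[List[float]]:
        self.transfers += 1  # one copy for the whole array
        return [list(r) for r in self._rows]


def read_per_element(kpts: CountingDeviceArray, indices: Sequence[int]) -> List[Tuple[float, float]]:
    """Piece-by-piece access: two transfers per keypoint."""
    return [(kpts.item(i, 0), kpts.item(i, 1)) for i in indices]


def read_bulk(kpts: CountingDeviceArray, indices: Sequence[int]) -> List[Tuple[float, float]]:
    """Bulk access: copy once, then index on the host."""
    host = kpts.cpu()
    return [(host[i][0], host[i][1]) for i in indices]
```

Both functions return the same keypoint coordinates; only the number of transfers differs.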
🧪 Validation mixed precision handling simplified
- PR #24736 consolidates validation autocast into one cleaner scope during training validation.
- This helps keep mixed-precision behavior simpler and more robust.
📝 Documentation improvements
- Added new Rust inference documentation for running YOLO models through ONNX Runtime without Python.
- Expanded tracking docs and reference pages.
- Updated TensorRT and Jetson docs to explain TensorRT 11 behavior and DLA limitations.
- Refreshed guides like Coral Edge TPU and semantic image search.

🎯 Purpose & Impact

🚀 TensorRT 11 users can export again
- This is the biggest user-facing change in the release.
- If you deploy on modern NVIDIA systems using TensorRT 11, exports that were failing should now work again.
- This is especially valuable for production inference pipelines and edge/server deployment.
⚙️ Future-proofs NVIDIA deployment
- TensorRT 11 changed how precision and quantization are handled.
- By moving to a ModelOpt-based workflow, Ultralytics stays compatible with newer NVIDIA tooling instead of relying on removed APIs.
💾 Smaller, faster engines remain accessible
- FP16 and INT8 exports are still available even with TensorRT 11’s breaking changes.
- That means users can continue optimizing for:
  - faster inference
  - lower memory use
  - better deployment efficiency
📦 Dynamic INT8 export support is especially useful
- Supporting INT8 with dynamic shapes on TensorRT 11 can help users deploying across varying input sizes without giving up quantization benefits.
👀 Tracking becomes more flexible for real-world scenarios
- The new trackers give users more choices depending on their needs:
  - simple baseline tracking
  - crowded-scene tracking
  - appearance-aware tracking
  - occlusion-heavy tracking
- This can improve results in surveillance, sports, traffic, and retail use cases.
🔒 More stable long-running applications
- Stream cleanup improvements reduce the chance of lingering resources in apps that open cameras or network streams.
- This matters most for production or multi-stream systems.
⚡ Small but meaningful performance wins
- AI Gym and validation updates help reduce overhead and improve runtime efficiency without changing how users interact with the API.

Overall, v8.4.63 is a strong deployment-focused release 📦. The headline improvement is restored and modernized TensorRT 11 export support, alongside four new built-in trackers and several reliability and performance refinements.

What's Changed

Consolidate validation autocast scope by @glenn-jocher in #24736
Convert SAM3 docstrings to Google style by @glenn-jocher in #24741
Merge isolated_model and isolated_task_model into isolated_model_path by @Laughing-q in #24742
Avoid per-keypoint gpu sync in workout monitoring loop by @raimbekovm in #24744
New TrackTrack, FastTracker, OC-SORT and Deep OC-SORT Trackers by @onuralpszr in #24371
Reduce CI PyTorch index flakiness by @glenn-jocher in #24753
Fix Codecov OIDC uploads by @glenn-jocher in #24754
docs: 📝 Add Rust Inference documentation by @onuralpszr in #24712
Report SystemLogger disk list by @glenn-jocher in #24758
Fix instances count logging by @Y-T-G in #24738
Clean up partially-initialized LoadStreams on failure by @raimbekovm in #24749
Restructure Coral Edge TPU guide and fix FAQ Edge TPU model filename by @raimbekovm in #24745
Restructure semantic image search guide by @raimbekovm in #24746
NVIDIA TensorRT 11 support with ModelOpt FP16 and INT8 quantization by @onuralpszr in #24735

Full Changelog: v8.4.62...v8.4.63
