pypi ultralytics 8.4.63
v8.4.63 - NVIDIA TensorRT 11 support with ModelOpt FP16 and INT8 quantization (#24735)

8 hours ago

🌟 Summary

🚀 Ultralytics v8.4.63 mainly adds TensorRT 11 export support with FP16 and INT8 quantization via NVIDIA ModelOpt, while also expanding built-in multi-object tracking with several new tracker options and improving reliability, performance, and docs.

📊 Key Changes

  • 🧠 Major export upgrade: TensorRT 11 support

    • The standout update in this release is PR #24735 by @onuralpszr.
    • Ultralytics can now export models to NVIDIA TensorRT 11. Export had previously failed on this version because of API changes in TensorRT 11.
    • For FP16 and INT8 export on TensorRT 11, Ultralytics now uses NVIDIA ModelOpt instead of the older TensorRT builder flags and calibrator interface.
    • This keeps export working across:
      • TensorRT 7–10 with the legacy path
      • TensorRT 11 with the new strongly-typed ModelOpt path
    • Tested successfully for:
      • fp32
      • fp16
      • int8
      • including INT8 with dynamic shapes on TensorRT 11 ✅
  • ⚡ Better quantization workflow for modern NVIDIA deployment

    • FP16 is now applied by baking mixed precision into the ONNX graph before engine build.
    • INT8 is now applied through explicit quantization in the ONNX graph with calibration data.
    • This is especially important because TensorRT 11 removed the old methods many exporters depended on.
  • 🎯 Big tracking expansion: 4 new built-in trackers

    • PR #24371 added four new multi-object trackers alongside BoT-SORT and ByteTrack:
      • OC-SORT
      • Deep OC-SORT
      • FastTracker
      • TrackTrack
    • These are now documented and wired into the tracking system with YAML configs like:
      • ocsort.yaml
      • deepocsort.yaml
      • fasttrack.yaml
      • tracktrack.yaml
  • 🏃 Tracking docs and selection guidance improved

    • Tracking docs were expanded to explain the strengths of each tracker and how to choose between them.
    • This makes the tracking feature much easier to use for both beginners and advanced users.
  • 📹 Video stream loading is safer

    • PR #24749 improves cleanup when stream initialization fails.
    • If one stream fails while others already opened, Ultralytics now properly closes those partially-opened resources instead of leaking threads or capture handles.
  • ⚑ AI Gym pose workflow is faster

    • PR #24744 reduces repeated GPU-to-CPU syncs in the workout monitoring loop by transferring keypoints to CPU in one go instead of piece by piece.
    • Same behavior, better efficiency.
  • 🧪 Validation mixed precision handling simplified

    • PR #24736 consolidates validation autocast into one cleaner scope during training validation.
    • This helps keep mixed-precision behavior simpler and more robust.
  • 📝 Documentation improvements

    • Added new Rust inference documentation for running YOLO models through ONNX Runtime without Python.
    • Expanded tracking docs and reference pages.
    • Updated TensorRT and Jetson docs to explain TensorRT 11 behavior and DLA limitations.
    • Refreshed guides like Coral Edge TPU and semantic image search.
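
As a rough illustration of the three export precisions listed above (fp32, fp16, int8, optionally with dynamic shapes), the sketch below maps a precision name onto Ultralytics export arguments. It assumes the user-facing arguments (`format="engine"`, `half`, `int8`, `dynamic`, `data`) are unchanged in this release and that the legacy or ModelOpt path is selected internally based on the installed TensorRT version; the notes imply this but do not state it. The helper names `trt_export_kwargs` and `export_engine`, the weights file `yolo11n.pt`, and the calibration dataset `coco8.yaml` are placeholders chosen for the example.

```python
from typing import Any, Dict


def trt_export_kwargs(precision: str = "fp32", dynamic: bool = False,
                      data: str = "coco8.yaml") -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments for a TensorRT export."""
    if precision not in {"fp32", "fp16", "int8"}:
        raise ValueError(f"unsupported precision: {precision!r}")
    kwargs: Dict[str, Any] = {"format": "engine", "dynamic": dynamic}
    if precision == "fp16":
        kwargs["half"] = True
    elif precision == "int8":
        kwargs["int8"] = True
        kwargs["data"] = data  # dataset used for INT8 calibration (placeholder)
    return kwargs


def export_engine(weights: str = "yolo11n.pt", **opts: Any) -> str:
    """Hypothetical wrapper: export the given weights to a TensorRT engine."""
    # Imported lazily so trt_export_kwargs works without ultralytics installed.
    from ultralytics import YOLO

    return YOLO(weights).export(**trt_export_kwargs(**opts))
```

For example, `export_engine(precision="int8", dynamic=True)` would request an INT8 engine with dynamic shapes. It needs an NVIDIA GPU with TensorRT installed, so it is not run here.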

🎯 Purpose & Impact

  • 🚀 TensorRT 11 users can export again

    • This is the biggest user-facing change in the release.
    • If you deploy on modern NVIDIA systems using TensorRT 11, exports that were failing should now work again.
    • This is especially valuable for production inference pipelines and edge/server deployment.
  • ⚙️ Future-proofs NVIDIA deployment

    • TensorRT 11 changed how precision and quantization are handled.
    • By moving to a ModelOpt-based workflow, Ultralytics stays compatible with newer NVIDIA tooling instead of relying on removed APIs.
  • 💾 Smaller, faster engines remain accessible

    • FP16 and INT8 exports are still available even with TensorRT 11's breaking changes.
    • That means users can continue optimizing for:
      • faster inference
      • lower memory use
      • better deployment efficiency
  • 📦 Dynamic INT8 export support is especially useful

    • Supporting INT8 with dynamic shapes on TensorRT 11 can help users deploying across varying input sizes without giving up quantization benefits.
  • 👀 Tracking becomes more flexible for real-world scenarios

    • The new trackers give users more choices depending on their needs:
      • simple baseline tracking
      • crowded-scene tracking
      • appearance-aware tracking
      • occlusion-heavy tracking
    • This can improve results in surveillance, sports, traffic, and retail use cases.
  • 🔒 More stable long-running applications

    • Stream cleanup improvements reduce the chance of lingering resources in apps that open cameras or network streams.
    • This matters most for production or multi-stream systems.
  • ⚑ Small but meaningful performance wins

    • AI Gym and validation updates help reduce overhead and improve runtime efficiency without changing how users interact with the API.
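
To make the tracker choices concrete, here is a minimal sketch of selecting one of the six built-in trackers by name. The YAML file names for the four new trackers come from the list under Key Changes; `botsort.yaml` and `bytetrack.yaml` are the existing configs. The helpers `tracker_config` and `track_video`, the default weights `yolo11n.pt`, and the choice of OC-SORT as the default are assumptions made for illustration, not recommendations from the release.

```python
from typing import Any

# Tracker display names mapped to the YAML configs named in these notes.
TRACKER_CONFIGS = {
    "BoT-SORT": "botsort.yaml",
    "ByteTrack": "bytetrack.yaml",
    "OC-SORT": "ocsort.yaml",
    "Deep OC-SORT": "deepocsort.yaml",
    "FastTracker": "fasttrack.yaml",
    "TrackTrack": "tracktrack.yaml",
}


def tracker_config(name: str) -> str:
    """Hypothetical helper: return the YAML config file for a tracker name."""
    try:
        return TRACKER_CONFIGS[name]
    except KeyError:
        known = ", ".join(sorted(TRACKER_CONFIGS))
        raise ValueError(f"unknown tracker {name!r}; expected one of: {known}") from None


def track_video(source: str, tracker_name: str = "OC-SORT",
                weights: str = "yolo11n.pt") -> Any:
    """Hypothetical wrapper: run tracking on a video source with the chosen tracker."""
    # Imported lazily so tracker_config works without ultralytics installed.
    from ultralytics import YOLO

    return YOLO(weights).track(source=source, tracker=tracker_config(tracker_name))
```

Switching trackers is then a one-argument change, for example `track_video("video.mp4", tracker_name="TrackTrack")`, where `video.mp4` is a placeholder path.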

Overall, v8.4.63 is a strong deployment-focused release 📦, with the headline improvement being restored and modernized TensorRT 11 export support, plus a major boost to tracking capabilities and several reliability/performance refinements.

What's Changed

Full Changelog: v8.4.62...v8.4.63
