pypi ultralytics 8.3.224
v8.3.224 - `ultralytics 8.3.224` Accelerate `crop_mask` on GPU with `is_cuda` check (#22575)

🌟 Summary

Ultralytics 8.3.224 boosts segmentation performance and stability on GPUs, adds CPU-only ExecuTorch export and benchmarks for YOLO11, fixes multi‑GPU evaluation and ONNX Runtime device selection, and improves logging and docs for a smoother developer experience. 🚀

📊 Key Changes

  • GPU-accelerated crop_mask with device-safe logic (PR #22575) ⚡
    • Ensures boxes live on the same device as masks and avoids slow Python loops on CUDA.
    • Uses the per-mask loop only when n < 50 and the masks are on CPU; masks on GPU always take the vectorized path.
  • Correct multi-GPU COCO evaluation jdict gather (PR #22541) 🧩
    • Aggregates predictions across ranks using dist.gather_object and cleans up worker memory.
  • ExecuTorch export and benchmark support (CPU-only) for YOLO11 (PR #22552) 🧠
    • Adds format guards (no YOLOWorldv2, no E2E, no Pose) and benchmark table entry.
  • Respect selected CUDA device in ONNX Runtime (PR #22546) 🎯
    • Passes device_id to CUDAExecutionProvider to avoid accidental GPU 0 usage.
  • More reliable W&B logging (PR #22563) 📈
    • Commits metrics each epoch and aligns log order for real-time, clean dashboards.
  • Rust ONNXRuntime example fix and dependency updates (PR #22557) 🦀
    • Uses download-binaries for ONNX Runtime to reduce setup friction and version conflicts.
  • Massive docstring/style cleanup across codebase (PRs #22554, #22565) 📚
    • Standardizes one-line summaries, arg sections, and readability with no logic changes.
  • Export formats updated: ExecuTorch is now listed as CPU=True, GPU=False in the exporter matrix 🧪
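
As a rough illustration of the crop_mask change listed above, the dispatch between the loop and vectorized paths might look like the sketch below. This is not the library's implementation: the function name crop_mask_sketch, the loop_threshold parameter, and the assumption that boxes are pixel coordinates (x1, y1, x2, y2) lying inside the image are all illustrative.

```python
import torch


def crop_mask_sketch(masks: torch.Tensor, boxes: torch.Tensor, loop_threshold: int = 50) -> torch.Tensor:
    """Zero out mask pixels outside each box (hypothetical helper).

    masks: (n, h, w) tensor; boxes: (n, 4) tensor of x1, y1, x2, y2 in pixels.
    """
    n, h, w = masks.shape
    boxes = boxes.to(masks.device)  # keep boxes on the same device as masks

    if n < loop_threshold and not masks.is_cuda:
        # Few masks on CPU: a plain per-mask loop with slicing.
        cropped = torch.zeros_like(masks)
        for i, (x1, y1, x2, y2) in enumerate(boxes.round().int().tolist()):
            cropped[i, y1:y2, x1:x2] = masks[i, y1:y2, x1:x2]
        return cropped

    # Many masks, or any masks on GPU: one broadcasted comparison, no Python loop.
    x1, y1, x2, y2 = torch.chunk(boxes[:, :, None], 4, dim=1)  # each (n, 1, 1)
    cols = torch.arange(w, device=masks.device, dtype=boxes.dtype)[None, None, :]
    rows = torch.arange(h, device=masks.device, dtype=boxes.dtype)[None, :, None]
    inside = (cols >= x1) & (cols < x2) & (rows >= y1) & (rows < y2)  # (n, h, w)
    return masks * inside


masks = torch.ones(2, 4, 4)
boxes = torch.tensor([[1.0, 1.0, 3.0, 3.0], [0.0, 0.0, 2.0, 4.0]])
looped = crop_mask_sketch(masks, boxes)  # n < 50 on CPU: loop path
vectorized = crop_mask_sketch(masks, boxes, loop_threshold=0)  # force vectorized path
```

Both paths should produce the same result for integer-valued boxes; the point of the check on is_cuda is that a Python loop over GPU tensors is slow, so the loop is reserved for small CPU batches.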

🎯 Purpose & Impact

  • Faster, safer segmentation on GPU 💨
    • Avoids device mismatch errors and slow loops; users should see snappier, more stable mask processing.
  • Accurate distributed validation ✅
    • Multi-GPU validation now correctly aggregates predictions for COCO-style evaluation and JSON exports.
  • Broader deployment options with ExecuTorch 🧩
    • Enables CPU-only export and benchmarking for YOLO11—handy for mobile/edge experimentation, with clear guardrails.
  • Correct GPU selection for ONNX inference 🔧
    • Ensures the specified GPU (e.g., cuda:1) is honored, improving reliability in multi-GPU environments.
  • Better experiment tracking in W&B 📊
    • Immediate per-epoch updates and aligned steps yield clearer, real-time dashboards with minimal overhead.
  • Easier Rust example usage 🛠️
    • Reduces dependency issues; quicker “just run it” developer path.
  • Cleaner docs and IDE help ✨
    • Consistent docstrings improve readability, navigation, and autocompletion across modules.
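
To make the distributed validation fix more concrete, here is a minimal sketch of the merge step that rank 0 would perform once per-rank predictions have been gathered. In a real multi-GPU run the per-rank lists would arrive via torch.distributed.gather_object; here they are simulated with plain lists so the example runs anywhere. The helper name merge_rank_predictions is hypothetical, and the dictionary fields follow the COCO results format as an assumption about what jdict entries contain.

```python
from itertools import chain


def merge_rank_predictions(gathered):
    """Flatten per-rank prediction lists into one list (hypothetical helper).

    `gathered` stands in for the list that rank 0 holds after a gather:
    one entry per rank, each a list of prediction dicts (or empty/None).
    """
    return list(chain.from_iterable(part for part in gathered if part))


# Simulated gather result for two ranks (values are made up for illustration).
rank0 = [{"image_id": 1, "category_id": 0, "bbox": [10.0, 20.0, 30.0, 40.0], "score": 0.9}]
rank1 = [{"image_id": 2, "category_id": 3, "bbox": [5.0, 5.0, 15.0, 25.0], "score": 0.7}]
jdict = merge_rank_predictions([rank0, rank1])
```

Without a step like this, each rank would only see its own shard of predictions, so COCO-style metrics and JSON exports would be computed on partial data.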

Quick tips:

  • Export to ExecuTorch (CPU-only) for YOLO11:
    yolo export model=yolo11n.pt format=executorch device=cpu
  • ONNX Runtime inference now uses the selected CUDA device automatically when one is available; no code changes are needed.
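
For readers who build ONNX Runtime sessions themselves, the device selection described above can be sketched as follows. The helper build_ort_providers and the file name model.onnx are hypothetical; the only library detail relied on is that ONNX Runtime accepts a (provider name, options) tuple and that CUDAExecutionProvider has a device_id option. This is a sketch of the idea, not the code from the PR.

```python
def build_ort_providers(device: str):
    """Map a device string ('cpu', 'cuda', 'cuda:N') to an ONNX Runtime providers list.

    Hypothetical helper: passing device_id explicitly avoids silently
    falling back to GPU 0 when another GPU was requested.
    """
    if device.startswith("cuda"):
        _, _, index = device.partition(":")
        device_id = int(index) if index else 0
        return [("CUDAExecutionProvider", {"device_id": device_id}), "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


providers = build_ort_providers("cuda:1")

# With onnxruntime installed, the list would be passed like this
# ("model.onnx" is a placeholder path):
# import onnxruntime
# session = onnxruntime.InferenceSession("model.onnx", providers=providers)
```

Listing CPUExecutionProvider last keeps a CPU fallback available if the CUDA provider cannot be loaded.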

Full Changelog: v8.3.223...v8.3.224
