ultralytics 8.3.224 on Python PyPI

🌟 Summary

Ultralytics 8.3.224 boosts segmentation performance and stability on GPUs, adds CPU-only ExecuTorch export/benchmarks for YOLO11, fixes multi‑GPU evaluation and ONNX device selection, and improves logging + docs for a smoother developer experience. 🚀

📊 Key Changes

GPU-accelerated crop_mask with device-safe logic (PR #22575) ⚡
- Ensures boxes live on the same device as masks and avoids slow Python loops on CUDA.
- Uses the per-mask loop only when n < 50 on CPU; GPU tensors always take the vectorized path.
Correct multi-GPU COCO evaluation jdict gather (PR #22541) 🧩
- Aggregates predictions across ranks using dist.gather_object and cleans up worker memory.
ExecuTorch export and benchmark support (CPU-only) for YOLO11 (PR #22552) 🧠
- Adds format guards (no YOLOWorldv2, no E2E, no Pose) and benchmark table entry.
Respect selected CUDA device in ONNX Runtime (PR #22546) 🎯
- Passes device_id to CUDAExecutionProvider to avoid accidental GPU 0 usage.
More reliable W&B logging (PR #22563) 📈
- Commits metrics each epoch and aligns log order for real-time, clean dashboards.
Rust ONNXRuntime example fix and dependency updates (PR #22557) 🦀
- Uses download-binaries for ONNX Runtime to reduce setup friction and version conflicts.
Massive docstring/style cleanup across codebase (PRs #22554, #22565) 📚
- Standardizes one-line summaries, arg sections, and readability with no logic changes.
Export formats updated: ExecuTorch is now listed as CPU=True, GPU=False in the exporter matrix 🧪
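
The crop_mask dispatch described above can be sketched roughly as follows. This is an illustration only, not the library's code: it uses NumPy arrays as a stand-in for PyTorch tensors, and the `on_gpu` flag and `loop_threshold` parameter are hypothetical names standing in for the `is_cuda` check and the n < 50 cutoff mentioned in the notes. Boxes are assumed to be integer-valued (x1, y1, x2, y2) pixel coordinates that lie inside the mask.

```python
import numpy as np


def crop_mask_loop(masks, boxes):
    """Keep only the pixels inside each box, one mask at a time."""
    out = np.zeros_like(masks)
    for i, (x1, y1, x2, y2) in enumerate(np.round(boxes).astype(int)):
        out[i, y1:y2, x1:x2] = masks[i, y1:y2, x1:x2]
    return out


def crop_mask_vectorized(masks, boxes):
    """Same result with one broadcast comparison and no Python loop."""
    _, h, w = masks.shape
    # Each coordinate becomes shape (n, 1, 1) so it broadcasts over (n, h, w).
    x1, y1, x2, y2 = (boxes[:, i, None, None] for i in range(4))
    cols = np.arange(w)[None, None, :]  # (1, 1, w)
    rows = np.arange(h)[None, :, None]  # (1, h, 1)
    inside = (cols >= x1) & (cols < x2) & (rows >= y1) & (rows < y2)
    return masks * inside


def crop_mask(masks, boxes, on_gpu=False, loop_threshold=50):
    """Pick the loop for few masks on CPU, the vectorized path otherwise.

    `on_gpu` and `loop_threshold` are hypothetical parameters for this sketch.
    """
    if masks.shape[0] < loop_threshold and not on_gpu:
        return crop_mask_loop(masks, boxes)
    return crop_mask_vectorized(masks, boxes)
```

Both paths should return the same values for in-bounds integer boxes; the choice between them is purely a speed trade-off.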

🎯 Purpose & Impact

Faster, safer segmentation on GPU 💨
- Avoids device mismatch errors and slow loops; users should see snappier, more stable mask processing.
Accurate distributed validation ✅
- Multi-GPU validation now correctly aggregates predictions for COCO-style evaluation and JSON exports.
Broader deployment options with ExecuTorch 🧩
- Enables CPU-only export and benchmarking for YOLO11—handy for mobile/edge experimentation, with clear guardrails.
Correct GPU selection for ONNX inference 🔧
- Ensures the specified GPU (e.g., cuda:1) is honored, improving reliability in multi-GPU environments.
Better experiment tracking in W&B 📊
- Immediate per-epoch updates and aligned steps yield clearer, real-time dashboards with minimal overhead.
Easier Rust example usage 🛠️
- Reduces dependency issues; quicker “just run it” developer path.
Cleaner docs and IDE help ✨
- Consistent docstrings improve readability, navigation, and autocompletion across modules.
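
For the distributed validation fix, the gather-then-merge pattern might look like the sketch below. The helper names `flatten_gathered` and `gather_jdict` are hypothetical and this is not the code from PR #22541; the only library call assumed is `torch.distributed.gather_object`, imported lazily so the single-process path runs without PyTorch. It also assumes the process group has already been initialized by the training launcher.

```python
from itertools import chain


def flatten_gathered(gathered):
    """Merge per-rank prediction lists into one list, preserving rank order."""
    return list(chain.from_iterable(gathered))


def gather_jdict(jdict, rank=0, world_size=1):
    """Collect every rank's prediction dicts on rank 0 (hypothetical helper)."""
    if world_size == 1:
        return jdict  # single process: nothing to gather

    import torch.distributed as dist  # requires an initialized process group

    # Only the destination rank provides a receive buffer; others pass None.
    gathered = [None] * world_size if rank == 0 else None
    dist.gather_object(jdict, gathered, dst=0)

    # Rank 0 evaluates the merged list; other ranks return an empty list
    # so they do not keep a copy of the predictions.
    return flatten_gathered(gathered) if rank == 0 else []
```

With this shape, COCO-style evaluation and JSON export run once on rank 0 over the predictions from all GPUs rather than over a single rank's shard.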

Quick tips:

Export to ExecuTorch (CPU-only) for YOLO11:

yolo export model=yolo11n.pt format=executorch device=cpu
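
The same export can likely be driven from Python. The sketch below assumes the `ultralytics` package is installed and that `format="executorch"` is accepted by `YOLO.export` as in the CLI command above; `executorch_export_kwargs` and `export_yolo11_executorch` are hypothetical helper names, and the device check simply mirrors the CPU-only restriction stated in these notes.

```python
def executorch_export_kwargs(device="cpu"):
    """Build export arguments, rejecting non-CPU devices (hypothetical helper)."""
    if str(device) != "cpu":
        raise ValueError("ExecuTorch export is CPU-only in this release")
    return {"format": "executorch", "device": "cpu"}


def export_yolo11_executorch(weights="yolo11n.pt"):
    """Export YOLO11 weights to ExecuTorch on CPU."""
    from ultralytics import YOLO  # lazy import; needs `pip install ultralytics`

    return YOLO(weights).export(**executorch_export_kwargs())
```

Calling `export_yolo11_executorch()` should mirror the CLI command; the weights file name is the one used in the tip above.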

ONNX Runtime inference now uses the selected GPU automatically when CUDA is available; no code changes are needed.
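
For readers who build ONNX Runtime sessions by hand, the device selection can be approximated as below. This is a sketch under assumptions, not the Ultralytics backend code: `onnx_providers` and `create_session` are hypothetical helpers, while the `device_id` option of `CUDAExecutionProvider` and the `providers` argument of `onnxruntime.InferenceSession` are the ONNX Runtime features the fix relies on.

```python
def onnx_providers(device="cpu"):
    """Map a device string such as 'cuda:1' to an ONNX Runtime providers list."""
    device = str(device)
    if device.startswith("cuda"):
        _, _, index = device.partition(":")
        device_id = int(index) if index else 0  # bare 'cuda' means GPU 0
        return [
            ("CUDAExecutionProvider", {"device_id": device_id}),
            "CPUExecutionProvider",  # fallback if CUDA is unavailable
        ]
    return ["CPUExecutionProvider"]


def create_session(model_path, device="cpu"):
    """Create an inference session pinned to the requested device."""
    import onnxruntime as ort  # lazy import; needs onnxruntime or onnxruntime-gpu

    return ort.InferenceSession(model_path, providers=onnx_providers(device))
```

Without the explicit `device_id`, the CUDA provider defaults to GPU 0, which is the accidental behavior the fix avoids.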

What's Changed

Fix gather jdict for COCO evaluation by @Laughing-q in #22541
feat: ✨ Add ExecuTorch benchmark and update export settings and add format validation in benchmarks by @onuralpszr in #22552
Fix docstrings by @glenn-jocher in #22554
Commit Weights & Biases logs after each epoch to force dashboard update by @Y-T-G in #22563
Fixing rust example (fixes #22556) by @andrenatal in #22557
Update Google-style docstrings by @glenn-jocher in #22565
Specify device ID for ONNX inference on CUDA by @Y-T-G in #22546
ultralytics 8.3.224 Accelerate crop_mask on GPU with is_cuda check by @Laughing-q in #22575

New Contributors

@andrenatal made their first contribution in #22557

Full Changelog: v8.3.223...v8.3.224
