pypi ultralytics 8.3.217
v8.3.217 - `ultralytics 8.3.217` Segment masks now 4× lighter with `.byte()` optimization (#22427)

one day ago

🌟 Summary

Segmentation gets leaner and more reliable: segment masks are now ~4× lighter with consistent uint8 handling, plus smoother first-iteration latency thanks to NMS warmup and a new dataloader pin_memory option. 🚀🧠

📊 Key Changes

  • Segmentation masks optimized and standardized (PR: Segment masks now 4× lighter with .byte() by @glenn-jocher)

    • Mask tensors now use .byte() (uint8) across processing and plotting, reducing memory and avoiding dtype mismatches.
    • process_mask and process_mask_native return uint8 masks instead of bool.
    • masks2segments consumes byte masks directly (no extra cast).
    • Evaluation fix: predicted masks cast to float() before IoU to prevent edge-case errors.
    • Image/tensor conversions use .byte() to ensure consistent uint8 NumPy output.
    • See details in the current PR: Segment masks now 4× lighter with .byte() optimization.
  • Dataloader and backend improvements (PR: Add NMS warmup for clearer post-processing latency by @Y-T-G)

    • New pin_memory parameter in build_dataloader(..., pin_memory: bool = True).
    • Validation sets pin_memory=self.training, reducing host memory pinning during eval by default.
    • AutoBackend now warms up Non-Max Suppression (NMS) after the first forward pass for smoother post-processing latency.
    • More in the PR: Add NMS warmup for clearer post-processing latency.
  • Version bump to 8.3.217.
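
To make the dtype changes above concrete, here is a minimal sketch in plain PyTorch. It assumes the "4×" figure is measured against a float32 mask, which these notes do not spell out, and `mask_iou_sketch` is a hypothetical helper written for illustration, not the library's IoU function. It shows the size ratio between float32 and `.byte()` masks, and why casting to `float()` before an IoU computation is a reasonable precaution (one plausible reason: integer matmul on uint8 can overflow or be unsupported on some devices).

```python
import torch

torch.manual_seed(0)

# Hypothetical batch of 8 binary masks at 160x160, stored as float32 for comparison.
masks_f32 = (torch.rand(8, 160, 160) > 0.5).float()
masks_u8 = masks_f32.byte()  # uint8: one byte per element


def nbytes(t: torch.Tensor) -> int:
    """Raw storage size of a tensor in bytes."""
    return t.element_size() * t.numel()


ratio = nbytes(masks_f32) / nbytes(masks_u8)  # float32 (4 bytes) vs uint8 (1 byte)


def mask_iou_sketch(pred: torch.Tensor, gt: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Pairwise IoU between two sets of binary masks (illustrative helper only)."""
    # Cast to float first so the matmul and sums run in floating point.
    pred = pred.flatten(1).float()
    gt = gt.flatten(1).float()
    inter = pred @ gt.T
    union = pred.sum(1)[:, None] + gt.sum(1)[None] - inter
    return inter / (union + eps)


iou = mask_iou_sketch(masks_u8, masks_u8)  # shape (8, 8); diagonal is ~1.0
```

The masks stay in uint8 for storage and plotting, and only the IoU step pays for a temporary float copy.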

🎯 Purpose & Impact

  • Faster, lighter segmentation workflows 💾⚡

    • ~4× smaller mask tensors reduce memory footprint and can improve throughput on large-batch or high-resolution segmentation tasks.
    • Fewer dtype conversions in the critical path minimize overhead and potential inconsistencies.
  • More robust and consistent results ✅

    • Unified dtype handling (uint8 for images/masks, float for IoU) reduces dtype-related bugs and evaluation edge cases.
  • Smoother first-iteration performance 🚀

    • NMS warmup eliminates “cold start” spikes in post-processing latency on supported devices.
  • Better memory control for training/eval 🧰

    • The new pin_memory flag lets you trade data-loading throughput against pinned host memory; in validation it follows self.training, so it is off by default outside of training.
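
The "cold start" effect mentioned above can be illustrated with a toy example. This is only a sketch of the general warmup idea, not the AutoBackend code: `LazyPostprocess` is a hypothetical stand-in for a post-processing step with a one-time setup cost (simulated here with a short sleep), and `timed` is an illustrative helper.

```python
import time


class LazyPostprocess:
    """Hypothetical post-processing step with a one-time setup cost."""

    def __init__(self):
        self._ready = False
        self.setup_calls = 0

    def __call__(self, boxes):
        if not self._ready:
            time.sleep(0.05)  # simulated one-time initialization
            self.setup_calls += 1
            self._ready = True
        # Sort boxes (x1, y1, x2, y2, score) by score, highest first.
        return sorted(boxes, key=lambda b: b[4], reverse=True)


def timed(fn, *args):
    """Return the function result and the elapsed wall-clock time in seconds."""
    t0 = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - t0


boxes = [(1, 1, 11, 11, 0.8), (0, 0, 10, 10, 0.9)]

# Without warmup: the first measured call includes the setup cost.
cold = LazyPostprocess()
_, cold_first = timed(cold, boxes)

# With warmup: one throwaway call absorbs the setup cost before measuring.
warm = LazyPostprocess()
warm(boxes)
result, warm_first = timed(warm, boxes)
```

After the throwaway call, the first measured call no longer includes the setup time, which is the behavior the warmup is meant to provide for post-processing latency.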

Minimal examples:

  • Enable/disable pinned memory:
    from ultralytics.data.build import build_dataloader
    # dataset = ...
    dl = build_dataloader(dataset, batch=16, workers=8, pin_memory=False)
  • NMS warmup happens automatically during backend warmup; no code changes required.
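
For readers who want to see what the flag controls, here is a minimal sketch using plain `torch.utils.data.DataLoader` rather than `build_dataloader`. The `make_loader` helper and the toy dataset are hypothetical. Tying `pin_memory` to a `training` flag mirrors the validation default described above, while the extra CUDA availability check is an assumption added for this example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset: 32 small "images" with integer labels (illustrative only).
dataset = TensorDataset(torch.zeros(32, 3, 8, 8), torch.zeros(32, dtype=torch.long))


def make_loader(dataset, batch: int, training: bool) -> DataLoader:
    """Hypothetical helper: pin host memory only when training on a CUDA machine."""
    pin = training and torch.cuda.is_available()  # CUDA check is an assumption of this sketch
    return DataLoader(
        dataset,
        batch_size=batch,
        shuffle=training,
        num_workers=0,
        pin_memory=pin,
    )


train_loader = make_loader(dataset, batch=16, training=True)
val_loader = make_loader(dataset, batch=16, training=False)

images, labels = next(iter(val_loader))  # one batch of 16 samples
```

Pinned (page-locked) host memory can speed up host-to-GPU copies, but it reserves physical RAM, so turning it off for evaluation is a reasonable default on memory-constrained machines.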

Contributors: @glenn-jocher, @Y-T-G, @Laughing-q

Links:

  • Segment masks now 4× lighter with .byte() optimization (PR #22427)
  • Add NMS warmup for clearer post-processing latency (PR #22425)

What's Changed

  • Add NMS warmup for clearer post-processing latency by @Y-T-G in #22425
  • ultralytics 8.3.217 Segment masks now 4× lighter with .byte() optimization by @glenn-jocher in #22427

Full Changelog: v8.3.216...v8.3.217
