ultralytics 8.3.237 on Python PyPI

🌟 Summary

Ultralytics 8.3.237 adds full SAM 3 image & video segmentation support (including text & exemplar prompts and tracking), improves export behavior (ONNX FP16 on CPU, Edge TPU/IMX deps), and polishes training, validation, and docs for smoother day‑to‑day use. 🚀

📊 Key Changes

🧠 SAM 3 integration (image & video)
- New SAM 3 model builder stack (build_sam3.py) with ViT backbone, transformer encoder/decoder, text encoder, geometry encoders, and video tracker (SAM3Model, SAM3SemanticModel and SAM3-specific modules).
- SAM entrypoint now detects sam3.pt and builds the SAM 3 tracker via build_interactive_sam3.
🎛️ New SAM 3 predictors & APIs
- Added predictors and public exports:
  - SAM3Predictor – SAM3-style interactive segmentation.
  - SAM3SemanticPredictor – text & exemplar based concept segmentation on images.
  - SAM3VideoPredictor – video tracking with box prompts.
  - SAM3VideoSemanticPredictor – video concept tracking (text + boxes + masklets).
- Wired into ultralytics.models.sam.__all__ and SAM’s task_map, so SAM("sam3.pt") routes to the right predictor.
🧩 SAM pipeline upgrades (SAM / SAM2 / SAM3)
- Predictor.setup_source now accepts an explicit stride, and SAM/SAM2/SAM3 predictors use it to enforce square image sizes and consistent feature shapes.
- SAM modules updated to support SAM3:
  - More flexible MemoryEncoder and MaskDownSampler (interpolation to fixed sizes, higher-res mask handling).
  - Memory attention can accept custom attention modules; SAM3 uses RoPE-based attention and new positional utilities (get_abs_pos, concat_rel_pos).
  - SAM2Model.set_imgsz is generalized (stride 16 is no longer hardcoded), and a specialized SAM3Model is added with improved mask post-processing and non‑overlap suppression.
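
The stride-based square sizing described above can be pictured with a small sketch. The helper name `square_imgsz` and its rounding rule (take the larger side, round up to a multiple of the stride) are assumptions made for illustration, not the code that ships in the predictors.

```python
import math


def square_imgsz(imgsz, stride):
    """Return a square (h, w) size whose side is a multiple of `stride`.

    Hypothetical helper: uses the larger requested side and rounds it up.
    """
    sides = (imgsz, imgsz) if isinstance(imgsz, int) else tuple(imgsz)
    side = math.ceil(max(sides) / stride) * stride
    return side, side


print(square_imgsz(1000, 16))        # (1008, 1008)
print(square_imgsz((640, 480), 14))  # (644, 644)
```

Passing the stride explicitly, rather than assuming 16, is what lets one sizing rule serve backbones with different patch sizes.
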
🖼️ SAM 3 docs & usage examples
- New reference docs under docs/en/reference/models/sam/sam3/* for all SAM3 modules (encoder, decoder, geometry encoders, text encoder, tokenizer, etc.).
- docs/en/models/sam-3.md rewritten from “API preview” into concrete usage:
  - Clear warning that SAM 3 weights are not auto-downloaded – users must manually download sam3.pt from the SAM 3 repo.
  - Instructions to download the BPE vocab (bpe_simple_vocab_16e6.txt.gz) for text prompts.
  - Full Python examples for:
    - Text prompts (SAM3SemanticPredictor)
    - Box exemplar prompts
    - Reusing image features across multiple queries
    - Video concept tracking with boxes (SAM3VideoPredictor)
    - Video concept tracking with text (SAM3VideoSemanticPredictor)
    - SAM2-style visual prompts via SAM("sam3.pt") while clarifying the difference vs. concept segmentation.
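
As a rough illustration of the SAM2-style visual-prompt path, the sketch below wraps `SAM("sam3.pt")` in a hypothetical helper, `segment_with_box`, that first checks for the manually downloaded weights. It assumes the `bboxes` keyword behaves as it does for earlier SAM models in Ultralytics; the SAM 3 docs page remains the reference for the exact calls.

```python
from pathlib import Path


def segment_with_box(image, box, weights="sam3.pt"):
    """Run one box prompt through the SAM entrypoint (illustrative helper).

    Assumes `weights` was downloaded manually and that `bboxes` is accepted
    as it is for earlier SAM models.
    """
    if not Path(weights).is_file():
        # SAM 3 weights are not auto-downloaded, so fail early with a clear message.
        raise FileNotFoundError(f"{weights} not found; download sam3.pt manually first")

    # Imported lazily so the weights check above works without the package installed.
    from ultralytics import SAM

    model = SAM(weights)
    return model(image, bboxes=[box])  # box is [x1, y1, x2, y2] in pixels
```

A call such as `segment_with_box("bus.jpg", [100, 150, 400, 500])` (image path and coordinates are placeholders) segments the object inside the box. Text and exemplar prompts go through the semantic predictors instead.
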
🔢 ONNX FP16 export on CPU
- The FP16-on-CPU restriction now applies only to TorchScript (JIT) export, with a clearer warning that half=True for TorchScript requires a GPU.
- ONNX export now supports half=True on CPU:
  - Converts model weights to FP16 using onnxruntime.transformers.float16.convert_float_to_float16(keep_io_types=True).
  - Failures downgrade gracefully with a warning instead of aborting export.
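
The "downgrade gracefully" behavior can be sketched as a try/except wrapper around the conversion step. `convert_or_keep` and `broken_converter` are hypothetical names used only here. In the real exporter the converter would be the `convert_float_to_float16(..., keep_io_types=True)` call named above.

```python
import warnings


def convert_or_keep(model, converter):
    """Try an FP16 conversion; on failure, warn and return the original model.

    `converter` is any callable that takes a model and returns a converted one.
    Returns a (model, converted) pair.
    """
    try:
        return converter(model), True
    except Exception as exc:  # export should continue, so catch broadly
        warnings.warn(f"FP16 conversion failed, keeping FP32 model: {exc}")
        return model, False


def broken_converter(model):
    """Stand-in converter that always fails, to exercise the fallback path."""
    raise RuntimeError("unsupported op")


model, converted = convert_or_keep({"dtype": "float32"}, broken_converter)
print(converted, model)  # False {'dtype': 'float32'}
```

From the user's side, this path is reached with an export call along the lines of `model.export(format="onnx", half=True, device="cpu")`.
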
🐧 Edge TPU & IMX export dependency management
- export_edgetpu: shell apt-get calls replaced with centralized check_apt_requirements(["edgetpu-compiler"]).
- export_imx: Java installs now use check_apt_requirements() for:
  - openjdk-21-jre on Ubuntu / Debian Trixie.
  - openjdk-17-jre on Raspberry Pi / Debian Bookworm.
- check_apt_requirements() now runs apt update with check=True, surfacing update failures instead of silently ignoring them.
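
The effect of `check=True` can be shown with `subprocess.run` alone. The sketch uses a child Python process that exits with status 1 as a stand-in for a failing `apt update`; it does not reproduce the internals of `check_apt_requirements()`.

```python
import subprocess
import sys

# Stand-in for a failing `apt update`: a child process that exits with status 1.
failing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]

# Without check=True the failure is only visible if the caller inspects returncode.
result = subprocess.run(failing_cmd)
print(result.returncode)  # 1

# With check=True the same failure raises and cannot be silently ignored.
try:
    subprocess.run(failing_cmd, check=True)
except subprocess.CalledProcessError as exc:
    print(f"command failed with exit code {exc.returncode}")
```
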
🔄 More flexible resume‑training overrides
- When resuming training, you can now override more runtime/logging parameters without restarting:
  - save_period, workers, cache, patience, time, freeze, val, plots.
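
One way to picture the resume overrides is an allowlist merged over the arguments stored in the checkpoint. The sketch below is illustrative only: `RESUME_OVERRIDABLE` and `merge_resume_args` are hypothetical names, and the set holds just the keys listed above, while the real trainer accepts other overrides as well.

```python
# Keys listed in these release notes as newly overridable on resume.
RESUME_OVERRIDABLE = {"save_period", "workers", "cache", "patience", "time", "freeze", "val", "plots"}


def merge_resume_args(checkpoint_args, overrides):
    """Return a copy of the checkpoint args with only allowlisted overrides applied."""
    merged = dict(checkpoint_args)
    for key, value in overrides.items():
        if key in RESUME_OVERRIDABLE:
            merged[key] = value
    return merged


saved = {"epochs": 100, "workers": 8, "patience": 100, "plots": True}
print(merge_resume_args(saved, {"workers": 2, "plots": False, "epochs": 5}))
# {'epochs': 100, 'workers': 2, 'patience': 100, 'plots': False}
```

In this sketch `epochs` is ignored because it is not in the allowlist, so the resumed run keeps its original schedule.
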
📏 RT-DETR validation scaling fix
- Simplified RT-DETR validation transforms: the custom ratio_pad injection is removed and replaced with an empty Compose([]).
- Added a no-op scale_preds() override to make scaling behavior explicit and safe for future changes.
🧭 OBB plotting robustness
- OBBValidator.plot_predictions() reworked to:
  - Accept prediction dicts instead of raw tensors.
  - Early‑return cleanly for empty predictions.
  - Use plot_images() directly, avoiding redundant xywh2xyxy conversions and mismatched formats.
📚 Data augmentation docs: scale range clarified
- scale hyperparameter doc updated from “≥ 0.0” to 0.0 - 1.0 in the guide and macro tables, matching real‑world usage and preventing unstable configs.
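
The sketch below shows why values above 1.0 are risky. It assumes the resize gain is drawn uniformly from [1 - scale, 1 + scale], a common way to implement scale augmentation; `gain_bounds` and `sample_scale_gain` are illustrative names, not library functions.

```python
import random


def gain_bounds(scale):
    """Return the (low, high) range of the resize gain under the assumed rule."""
    return 1 - scale, 1 + scale


def sample_scale_gain(scale, rng=random):
    """Draw one resize gain uniformly from [1 - scale, 1 + scale]."""
    low, high = gain_bounds(scale)
    return rng.uniform(low, high)


print(gain_bounds(0.5))  # (0.5, 1.5)
print(gain_bounds(1.5))  # (-0.5, 2.5)
```

Under this assumption, any `scale` above 1.0 makes the lower bound negative, which is not a meaningful resize factor.
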

🎯 Purpose & Impact

🧠 Richer segmentation capabilities with SAM 3
- Unlocks concept-level segmentation: find “all persons”, “all buses”, “person with red hat”, etc., using text or exemplar boxes rather than only point/box prompts.
- Brings video concept tracking to Ultralytics: track concepts (e.g., “person”, “bicycle”) or specific instances across frames with SAM3’s memory-based tracker.
- Advanced APIs (feature reuse, semantic + instance outputs, presence scores) enable efficient pipelines and research use cases.
🧪 More predictable, robust SAM/SAM2/SAM3 behavior
- Enforcing square image sizes via shared stride handling avoids subtle spatial shape bugs and mismatches in encoders/decoders.
- Improved memory encoding and non-overlap suppression reduce spurious overlaps and noisy tracks, especially in crowded scenes.
🚀 Better export experience across devices
- ONNX FP16 export on CPU lets you reduce model size, and potentially improve performance, when no GPU is available for export, while keeping I/O types stable.
- Centralized apt handling for Edge TPU and IMX exports is more robust and easier to debug, especially on varied Debian/Ubuntu-based systems.
🧪 Easier experiment management & training control
- Expanded resume overrides let you adapt jobs mid‑run (e.g., change workers, cache strategy, early stopping, validation frequency, plot generation) without throwing away progress.
🎯 More reliable evaluation & visualization
- RT-DETR validation now avoids fragile manual ratio_pad hacks and is prepared for future scaling logic via scale_preds().
- OBB plots are more stable, especially for empty detections or batched outputs, giving cleaner visual diagnostics.
📖 Clearer documentation & safer configs
- SAM 3 docs now match the actual shipped API and explicitly call out weight & vocab requirements, helping users get started without guesswork.
- Clarified scale augmentation bounds (0–1) help avoid extreme settings that could degrade training stability or accuracy.

What's Changed

ONNX FP16 export on CPU by @glenn-jocher in #22927
Update IMX and Edge TPU exports with check_apt_requirements function by @lakshanthad in #22925
Correct scale range in data augmentation guide by @Y-T-G in #22907
Expand overrideable arguments for resumed training by @Y-T-G in #22903
Fix box scaling in predictions.json for RTDETR by @Y-T-G in #22817
fix: 🐞 remove redundant xywh2xyxy conversion in OBBValidator.plot_predictions by @onuralpszr in #22765
ultralytics 8.3.237 SAM3 integration by @Laughing-q in #22897

Full Changelog: v8.3.236...v8.3.237
