🌟 Summary
Ultralytics v8.3.238 is a refinement-heavy release that makes SAM3 concept/video segmentation more robust, faster to work with, and easier to install, while also stabilizing model export (especially TFLite/ONNX) and polishing docs & CI workflows. 🧠🧩
📊 Key Changes
- SAM3 architecture refactors (core of this release) 🧱
  - New shared helper for SAM3 neck feature generation (`sam_forward_feature_levels`) reduces duplicated code and unifies how multi-scale features + positional encodings are produced.
  - VL combiner and neck now reuse this helper for both SAM3 and optional SAM2 paths, simplifying the forward logic and making behavior more consistent.
  - SAM3 image/video predictors now share backbone feature preparation with SAM2 (`_prepare_backbone_features`), centralizing multi-prompt batching and reducing redundant computation.
- SAM3 stability & correctness fixes 🔧
  - Batching bug fix: `SAM3SemanticPredictor` no longer corrupts its cached backbone features when you query the same image with different numbers of text prompts. This fixes shape-mismatch crashes in “encode once, query many times” workflows.
  - MPS (Apple Silicon) fix: SAM3 attention now enforces contiguous tensors and simplifies dtype conversions, improving reliability on Metal (MPS) backends.
  - Tokenizer deps auto-install: When using SAM3 semantic/text features, missing Python packages (`ftfy`, `regex`, `iopath`) are auto-checked and prompted for install instead of crashing with `ModuleNotFoundError`.
  - Cleaned up SAM/SAM2/SAM3 docstrings and docs so predictors are correctly described and imports align with the public API (`from ultralytics.models.sam import ...`).
- SAM3 video API & docs improvements 🎥
  - Official docs now use the simplified import path `from ultralytics.models.sam import SAM3VideoSemanticPredictor`.
  - Clearer SAM3 docs: updated model size (3.4 GB), revised YOLO11 vs SAM3 efficiency comparison, and marked SAM3 as fully available from `v8.3.237`.
- Export & deployment reliability 🛫
  - TFLite / ONNX export guards:
    - FP16 conversion for ONNX is now only run when `--half` and `--format=onnx` and `device=cpu` are all true, which prevents ONNX-specific code from running for other formats and causing avoidable errors.
    - Fixed TFLite export dtype mismatches when combining `half=True` with `nms=True`.
  - MNN export safety:
    - MNN slow tests are skipped if `torch<1.10`, and the exporter now asserts `torch>=1.10` for MNN to avoid runtime segmentation faults.
  - ONNX version pin clarified: kept `onnx>=1.12.0,<=1.19.1` with comments that this remains until `onnx_graphsurgeon` supports newer ONNX.
  - Small TensorFlow export patch to better handle helper function availability.
- CI, tooling & infra 🏗️
  - CI Summary job now depends on `SlowTests`, and Slack alerts will also trigger if SlowTests fail, making flaky or long-running issues more visible.
  - Docker runner image now runs `apt-get update`, installs `libicu-dev`, and clears apt lists to reduce image bloat.
  - GitHub Actions `actions/upload-artifact` bumped from v5 → v6 (Node 24 runtime).
- Examples & ecosystem tweaks 🌍
  - `.gitignore` now allows `examples/**/requirements.txt`; new example-specific requirements were added (e.g., for ONNXRuntime/OpenCV demos).
  - RT-DETR ONNXRuntime example: safer downloads, better input validation, optional NMS filter, and fixed COCO YAML URL.
  - Interactive tracking UI: safer defaults (e.g., `save_video=False` by default).
- Docs & UX polish 📚
  - Large pass standardizing terminology (`pretrained` vs `pre-trained`), cleaning grammar, clarifying Explorer’s deprecated status and pointing users to Ultralytics HUB.
  - Refreshed guides for Raspberry Pi, Jetson, Vertex AI, tracking, auto-annotation, and project-planning docs for clearer language and more realistic expectations.
  - Swapped in new YOLO11 videos for VisDrone and MNN export to give users up-to-date tutorials.
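
The ONNX FP16 guard described under "Export & deployment reliability" can be pictured as a single predicate over three settings. The sketch below is only an illustration of that condition: the function name `should_convert_onnx_fp16` and its parameter names are hypothetical and are not taken from the Ultralytics exporter.

```python
def should_convert_onnx_fp16(half: bool, fmt: str, device: str) -> bool:
    """Hypothetical helper: True only when all three conditions from the notes hold."""
    # FP16 conversion is ONNX-specific, so it is skipped for every other format,
    # for non-CPU devices, and whenever half precision was not requested.
    return bool(half) and fmt == "onnx" and device == "cpu"


if __name__ == "__main__":
    cases = [
        (True, "onnx", "cpu"),
        (True, "tflite", "cpu"),
        (True, "onnx", "cuda:0"),
        (False, "onnx", "cpu"),
    ]
    for half, fmt, device in cases:
        print(half, fmt, device, "->", should_convert_onnx_fp16(half, fmt, device))
```

Keeping the check in one place means a TFLite or MNN export with `half=True` never reaches the ONNX-only conversion step.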
🎯 Purpose & Impact
- More reliable SAM3 workflows for research & production 🧠🚀
  - The SAM3 refactors and bug fixes make “encode once, query many times” safe and efficient, especially when changing text prompts or using multi-prompt batching.
  - Shared backbone feature logic across SAM2/SAM3 reduces implementation divergence, which lowers the risk of subtle bugs and speeds up future maintenance.
- Smoother out-of-the-box experience for SAM3 users 💡
  - Auto-install prompts for tokenizer dependencies and robust MPS handling mean fewer cryptic crashes and less manual environment wrangling, especially on macOS laptops and for new users trying semantic/text prompts.
  - Consistent, public-facing import paths (`from ultralytics.models.sam import ...`) make it easier to copy examples and reduce breakage if internals move.
- Safer, more predictable model export across formats 📦
  - Guarding FP16 and TFLite paths prevents surprising export-time failures, especially in mixed environments (CPU-only, different export formats, or older Torch).
  - Clear version pins and runtime checks reduce the “works on my machine” problem when exporting to ONNX, MNN, or TensorFlow.
- Higher CI signal quality & maintainability ⚙️
  - Including SlowTests in the CI Summary and Slack alerts helps catch real regressions in long-path code (exports, hardware-specific flows) before they reach users.
  - Docker and workflow updates keep the automation stack modern and lean.
- Clearer documentation for a wide audience 📖✨
  - Standardized wording, updated videos, and more honest model specs (e.g., SAM3’s large size vs YOLO11) give users more realistic expectations about performance, hardware needs, and when to choose which model.
  - Better messaging around deprecated Explorer features and the role of Ultralytics HUB helps new users find the right tools faster.
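
To make the “encode once, query many times” point concrete, here is a toy sketch of the general pattern behind the cache fix: expand a copy of the cached features per query rather than the cache itself. It does not reproduce the SAM3 code. `CachedEncoder`, `set_image`, and `query` are hypothetical names, and plain Python lists stand in for backbone tensors.

```python
from copy import deepcopy


class CachedEncoder:
    """Toy stand-in (hypothetical) for a predictor that encodes once and answers many queries."""

    def __init__(self):
        self._cache = None
        self.encode_calls = 0

    def set_image(self, image):
        # Pretend this is the expensive backbone pass; cached features have batch size 1.
        self.encode_calls += 1
        self._cache = {"features": [[float(v) for v in image]]}

    def query(self, prompts):
        if self._cache is None:
            raise RuntimeError("call set_image() before query()")
        # Copy first, then expand to one row per prompt, so the cache keeps batch size 1.
        feats = deepcopy(self._cache)
        feats["features"] = feats["features"] * len(prompts)
        return list(zip(prompts, feats["features"]))


if __name__ == "__main__":
    enc = CachedEncoder()
    enc.set_image([1, 2, 3])
    print(len(enc.query(["person", "dog"])))  # two prompts
    print(len(enc.query(["car"])))            # one prompt, same cached features
    print(enc.encode_calls)
```

If `query` expanded `self._cache` in place instead, a later call with a different number of prompts would start from features that already carry the previous batch size, which is the kind of shape mismatch the release notes describe.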
What's Changed
- SAM3 Docs update by @glenn-jocher in #22932
- Minor docs improvements by @pderrenger in #22934
- Auto-install SAM3 tokenizer dependencies by @fcakyon in #22935
- Update import path for `SAM3VideoSemanticPredictor` by @RizwanMunawar in #22945
- Add https://youtu.be/i34PacLIlq8 to docs by @RizwanMunawar in #22931
- Use FP32 ONNX model for TFLite export by @Y-T-G in #22949
- Fix TFLite export dtype mismatch with `half=True` and `nms=True` by @glenn-jocher in #22946
- Minor fixes by @glenn-jocher in #22943
- Fix SAM3SemanticPredictor feature cache mutation when batching prompts by @NeilLint in #22948
- Bump actions/upload-artifact from 5 to 6 in /.github/workflows by @dependabot[bot] in #22952
- Refactor SAM backbone output and remove duplication by @Laughing-q in #22953
- Fix SAM3 inference with MPS by @Y-T-G in #22956
- Update SAM-3 docs and nav by @kayselmecnun in #22933
- `ultralytics 8.3.238` Refactor SAM3 forward convolutions by @glenn-jocher in #22942
New Contributors
- @NeilLint made their first contribution in #22948
- @kayselmecnun made their first contribution in #22933
Full Changelog: v8.3.237...v8.3.238