Summary
v8.3.181 focuses on safer, faster mixed-precision (FP16) support for SAM/SAM2 and YOLOE pipelines, plus reliability fixes across validation exports, source loading, and single-class training.
Key Changes
- SAM FP16 support and dtype/device consistency (PR #21735, priority)
  - Enables half-precision inference for SAM models with consistent dtype handling across blocks, encoders, decoders, utils, and predict paths.
  - Avoids unnecessary float32 casts; ops now respect the input tensor dtype (float16/bfloat16 where safe).
  - Predictor now normalizes before casting and sets the model dtype based on args.half; a unified self.torch_dtype is used for prompts, masks, and buffers.
- SAM2 robustness without high_res_features (PR #21726)
  - Decoder gracefully falls back when high-res features are absent; accepts tensor or dict feature inputs.
- YOLOE device/half propagation and stability (PR #21670)
  - Predict now forwards device/half flags; prompt tensors follow model precision; softmax casting simplified for consistency.
- Validation export consistency across tasks (PR #21719)
  - New scale_preds unifies scaling to original image sizes for detect, OBB, pose, and segment before saving JSON/TXT.
- YOLOE visual prompt predictor switching fix (PR #21731)
  - Predictor instance now correctly switches after initialization when using visual prompts.
- Single-class training compatibility (PR #21725)
  - Restores compatibility of the classes argument with single_cls by safely constraining the max class index to 0 (no label mutation).
- CSV source support for inference (PR #21729)
  - Dataloaders now accept .csv source lists with whitespace-safe parsing.
- Streamlit Live Inference improvements (PR #21553)
  - Accepts multiple model formats (.pt, .onnx, .torchscript, .mlpackage, .engine, OpenVINO) and respects full paths provided by users.
- YOLOE docs enhancements (PR #21728)
  - Clearer fine-tuning, linear probing, and new export examples; minor classify docs correction.
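To make the validation export item above more concrete, here is a minimal sketch of mapping a box from model-input coordinates back to the original image. It assumes letterbox preprocessing (uniform resize plus centered padding). The function name `scale_box_to_original` and its signature are hypothetical, and this is not the library's scale_preds implementation.

```python
def scale_box_to_original(box, input_shape, orig_shape):
    """Map an (x1, y1, x2, y2) box from letterboxed input space to original image space.

    Hypothetical helper for illustration only. Shapes are (height, width).
    Assumes the image was resized uniformly and padded equally on both sides.
    """
    in_h, in_w = input_shape
    orig_h, orig_w = orig_shape

    # Uniform resize factor that was applied to the original image.
    gain = min(in_h / orig_h, in_w / orig_w)

    # Padding added on each side to fill the model input.
    pad_x = (in_w - orig_w * gain) / 2
    pad_y = (in_h - orig_h * gain) / 2

    x1, y1, x2, y2 = box
    # Undo padding, then undo the resize.
    x1, x2 = (x1 - pad_x) / gain, (x2 - pad_x) / gain
    y1, y2 = (y1 - pad_y) / gain, (y2 - pad_y) / gain

    # Clip to the original image bounds.
    x1, x2 = (max(0.0, min(v, float(orig_w))) for v in (x1, x2))
    y1, y2 = (max(0.0, min(v, float(orig_h))) for v in (y1, y2))
    return (x1, y1, x2, y2)


if __name__ == "__main__":
    # A 1280x720 (width x height) image letterboxed into a 640x640 input.
    print(scale_box_to_original((100, 140, 300, 340), (640, 640), (720, 1280)))
```

Applying one such mapping before writing JSON or TXT files is what keeps saved coordinates comparable across tasks; keypoints and masks would need the analogous transform.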
Purpose & Impact
- Faster inference on modern GPUs
  - FP16 support for SAM reduces memory use and can speed up inference on compatible hardware.
- Greater numerical stability and fewer dtype/device surprises
  - Consistent dtype handling across SAM/SAM2 and YOLOE reduces precision mismatches and unintended casts.
- More robust segmentation and visual prompting workflows
  - SAM2 now works even without high-res features; YOLOE honors device/half flags and switches predictors reliably.
- Accurate and consistent validation exports across tasks
  - JSON/TXT outputs now consistently match original image sizes for detect/OBB/pose/segment.
- Easier deployment and input management
  - CSV sources work out of the box; Streamlit Live Inference loads multiple model formats seamlessly.
- Smoother single-class training
  - Prevents spurious "class exceeds count" errors, improving reliability for single-class projects.
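The single-class behavior described above can be pictured with a small sketch. It is an illustration under stated assumptions, not the library's code: `validate_class_indices` and its parameters are hypothetical names, and the sketch assumes the check compares the largest label index against the dataset's class count.

```python
def validate_class_indices(class_ids, num_classes, single_cls=False):
    """Check that no label index exceeds the dataset's class count.

    Hypothetical helper for illustration only. With single_cls=True the
    maximum index is constrained to 0 for the check, and class_ids itself
    is never modified.
    """
    max_cls = max(class_ids, default=0)
    if single_cls:
        # Constrain only the value used for validation, not the labels.
        max_cls = min(max_cls, 0)
    if max_cls >= num_classes:
        raise ValueError(
            f"Label class {max_cls} exceeds dataset class count {num_classes}."
        )
    return max_cls


if __name__ == "__main__":
    labels = [0, 3, 7]  # multi-class labels reused for a single-class run
    print(validate_class_indices(labels, num_classes=1, single_cls=True))
    print(labels)  # labels are unchanged
```

Constraining the checked value rather than rewriting the labels keeps the original annotations intact for any later step that still needs them.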
What's Changed
- Fix `save_txt` coordinates scaling by @Y-T-G in #21719
- Add multiple export formats inference support in `Live Inference` solution by @RizwanMunawar in #21553
- Restore `classes` compatibility with `single_cls` by @Y-T-G in #21725
- YOLOE: Fix visual prompt `predictor` not switching after initialization by @Y-T-G in #21731
- Fix inference with CSV source by @Y-T-G in #21729
- docs: improved documentation for yoloe export by @picsalex in #21728
- YOLOE: Fix `device` selection and support `half` inference for visual prompt by @RizwanMunawar in #21670
- Support inference on SAM models without `high_res_features` by @Laughing-q in #21726
- `ultralytics 8.3.181` Support `half` inference for SAM models by @Laughing-q in #21735
Full Changelog: v8.3.180...v8.3.181