🌟 Summary
CLIP/MobileCLIP tokenization is now safer by default: a new truncate option prevents crashes on long prompts. This release also improves progress-bar readability, docs search/chat UX, and the augmentation docs, and makes ARM CI more reliable. 🚀
📊 Key Changes
- CLIP/MobileCLIP safety upgrade (priority)
- Progress bar readability
- Docs chat/search experience
  - Introduced native Ultralytics Chat widget with theme sync, keyboard shortcut (Cmd/Ctrl + K), and a header search button. Thanks @glenn-jocher (PRs #22639, #22648, #22652).
- Data augmentation docs polish
- CI reliability
  - Self-hosted Raspberry Pi runners cleaned before builds to reduce flakiness (PR #22654).
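As a rough illustration of the progress-bar readability change, the sketch below shows one common convention for slow loops, also used by the upstream tqdm library: when the rate drops below one iteration per second, print seconds per iteration instead. The helper name `format_rate` and its exact output format are assumptions for illustration, not the code from PR #22660.

```python
def format_rate(rate: float, unit: str = "it") -> str:
    """Hypothetical helper: render a loop rate, inverting it when the loop is slow."""
    if rate <= 0:
        # No measurement yet, so show a placeholder.
        return f"?{unit}/s"
    if rate < 1:
        # Slow loop: "4.0s/it" is easier to read than "0.2it/s".
        return f"{1 / rate:.1f}s/{unit}"
    # Fast loop: keep the usual iterations-per-second form.
    return f"{rate:.1f}{unit}/s"


print(format_rate(12.34))  # 12.3it/s
print(format_rate(0.25))   # 4.0s/it
```

Inverting the rate keeps the displayed number above 1, which tends to be easier to scan on large models or slower hardware.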
🎯 Purpose & Impact
- More robust text models ✨
  - Avoids RuntimeError on lengthy prompts by truncating safely by default.
  - Enables strict mode (truncate=False) for developers who need exact-length checks or debugging.
  - Small, backward-compatible API enhancement improving stability in CLIP/MobileCLIP workflows.
- Clearer training feedback 📈
  - Easier-to-read tqdm output for slow iterations helps users understand performance and progress, especially on large models or slower hardware.
- Better docs experience 🔎
  - Faster, unified Ultralytics Chat + Search in the docs with consistent theming and a simple keyboard shortcut improves discoverability and support.
- More flexible augmentations 🧪
  - The documented augmentations argument unlocks custom Albumentations transforms via the Python API, enabling advanced experimentation across detect/segment/pose/obb tasks.
- Improved contributor reliability 🧹
  - Cleaner ARM CI environments reduce flaky tests and speed up feedback loops, benefiting ongoing development.
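To make the difference between safe and strict tokenization concrete, here is a minimal sketch of how a fixed-length text encoder input might be prepared. It assumes the standard CLIP context length of 77 tokens and the original CLIP start/end-of-text ids; the helper `fit_to_context` is hypothetical and is not the Ultralytics implementation.

```python
from typing import List

CONTEXT_LENGTH = 77      # standard CLIP text context length
SOT, EOT = 49406, 49407  # start/end-of-text ids in the original CLIP vocabulary


def fit_to_context(token_ids: List[int], context_length: int = CONTEXT_LENGTH, truncate: bool = True) -> List[int]:
    """Hypothetical helper: pad or truncate one token sequence to a fixed length."""
    tokens = [SOT, *token_ids, EOT]
    if len(tokens) > context_length:
        if not truncate:
            # Strict mode: surface the problem instead of silently dropping tokens.
            raise RuntimeError(f"Input is too long for context length {context_length}")
        tokens = tokens[:context_length]
        tokens[-1] = EOT  # keep the end-of-text marker as the final token
    # Right-pad short sequences with zeros up to the fixed length.
    return tokens + [0] * (context_length - len(tokens))


long_prompt = list(range(1, 201))  # stand-in ids for a very long caption
safe = fit_to_context(long_prompt)  # truncated to 77 ids, ends with EOT
try:
    fit_to_context(long_prompt, truncate=False)
except RuntimeError as err:
    print(err)
```

Truncating by default trades exactness for robustness; strict mode is the better choice when a silently shortened prompt would hide a data problem.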
Example usage (strict vs safe tokenization):
from ultralytics.nn.text_model import CLIP, MobileCLIPTS
clip_model = CLIP(size="ViT-B/32", device="cpu")
safe_tokens = clip_model.tokenize("a very long caption ...", truncate=True) # default, avoids errors
strict_tokens = clip_model.tokenize("a very long caption ...", truncate=False) # may raise if too long
mobileclip = MobileCLIPTS(device="cpu")
tokens = mobileclip.tokenize(["caption 1", "caption 2"]) # safely truncated by default
What's Changed
- New Ultralytics chat.js bot by @glenn-jocher in #22639
- docs: 📝 add augmentations argument for custom Albumentations transforms in data augmentation macro table for documentation by @onuralpszr in #22615
- Fix typo in the augmentation guide by @picsalex in #22645
- Remove chat.js API URL From extra.js by @glenn-jocher in #22648
- Update mkdocs.yml to @latest for chat.js by @glenn-jocher in #22652
- Improve tqdm rate format readability for slow iterations by @fcakyon in #22660
- Clean up self-hosted GitHub runners before initial run by @lakshanthad in #22654
- ultralytics 8.3.228 Fix CLIP token truncation by @h13-0 in #22650
Full Changelog: v8.3.227...v8.3.228