v8.3.212 - `ultralytics 8.3.212` Improve Trainer robustness to `save_dir` deletion (#22358)


🌟 Summary

Ultralytics 8.3.212 focuses on making training more robust and predictable by hardening the Trainer against edge cases (like deleted save directories and transient non-finite losses), while simplifying the optimizer step for modern PyTorch. πŸš€

πŸ“Š Key Changes

  • Trainer stability and I/O resilience (primary change: β€œImprove Trainer robustness to `save_dir` deletion” #22358 by @glenn-jocher) βœ…
    • Always run backward and optimizer steps via AMP GradScaler; removed the non-finite loss guard to avoid silent skips. 🧠
    • Timed stopping remains unchanged and synchronized across DDP ranks. ⏱️
    • Safer metrics reading: read_results_csv() now returns {} on read failures instead of raising. πŸ“„βž‘οΈ{}
    • Safer saving: Trainer ensures directories exist before writing weights and metrics (e.g., best.pt, last.pt, results.csv). πŸ’ΎπŸ—‚οΈ
  • Optimizer step simplification (PR β€œRevert optimizer_step() nan error changes” by @glenn-jocher)
    • Removed legacy version branches and PyTorch 1.9 handling.
    • Unified behavior: unscale grads, clip them (clip_grad_norm_ with max_norm=10.0), scaler.step(), scaler.update(), then zero grads; see the sketch after this list. 🧹
  • CI-only speedup (PR β€œUse uv for docker.yml pip install tests” by @glenn-jocher)
    • Docker test step now installs pytest with uv for faster, deterministic installs. ⚑🐳
  • Version bump to 8.3.212. πŸ”–
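
For reference, the unified optimizer step behaves roughly like the standard torch.amp pattern below. This is a minimal sketch, not the Trainer's literal code; the model, optimizer, and dummy loss are placeholders, and it assumes a CUDA device:

import torch

# Placeholder model/optimizer for illustration (not names from the PR)
model = torch.nn.Linear(4, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.amp.GradScaler("cuda")

with torch.autocast("cuda"):
    loss = model(torch.randn(8, 4, device="cuda")).mean()

scaler.scale(loss).backward()  # backward always runs through the scaler
scaler.unscale_(optimizer)     # unscale grads in place before clipping
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)  # clip gradients
scaler.step(optimizer)         # GradScaler itself skips the step on non-finite grads
scaler.update()                # adjust the scale factor for the next iteration
optimizer.zero_grad()          # zero grads for the next batch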

🎯 Purpose & Impact

  • More resilient training runs πŸ›‘οΈ
    • Prevents stalled or inconsistent training by avoiding silent skips when encountering transient NaNs/Infs; relies on GradScaler to safely handle invalid grads.
    • Reduces surprises in distributed training with unchanged, synchronized timed stopping.
  • Robust file handling on local or network storage πŸ“¦
    • Automatically re-creates missing directories for checkpoints and logs, which helps if save_dir is deleted mid-run or on flaky filesystems (see the sketch after this list).
    • Avoids crashes when reading results.csv; failures return {} so training and tools can continue gracefully.
  • Modernized, cleaner codebase for PyTorch 2.x βœ…
    • Less legacy branching, simpler optimizer step logic, and consistent gradient clipping.
  • Faster, more reliable CI with no user-facing behavior change πŸ§ͺ
    • Speeds up internal testing using uv, improving development velocity without affecting end users.
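
The directory handling follows the standard pathlib pattern shown below. This is a minimal sketch, not the Trainer's literal code; save_dir and the file layout are illustrative:

from pathlib import Path

save_dir = Path("runs/detect/train")  # illustrative save directory
wdir = save_dir / "weights"           # where best.pt / last.pt are written

# Re-create any missing directories immediately before writing, so a
# save_dir deleted mid-run no longer crashes checkpoint or metrics saves.
wdir.mkdir(parents=True, exist_ok=True)
(save_dir / "results.csv").write_text("epoch,loss\n")  # example write that no longer fails on a missing dir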

Tip: If you programmatically consume training metrics, you can safely handle missing or locked CSVs; the model and dataset values below are placeholders:

from ultralytics.models.yolo.detect import DetectionTrainer  # concrete BaseTrainer subclass

trainer = DetectionTrainer(overrides={"model": "yolo11n.pt", "data": "coco8.yaml"})
metrics = trainer.read_results_csv()  # returns {} on failure instead of raising
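
If results.csv is missing, locked, or truncated, metrics is simply an empty dict, so downstream dashboards and scripts can fall back to defaults instead of catching exceptions.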

What's Changed

Full Changelog: v8.3.211...v8.3.212
