pypi ultralytics 8.4.62
v8.4.62 - Prevent NaN/Inf EMA from discarding training checkpoints (#24731)

5 hours ago

🌟 Summary

🛡️ v8.4.62 is mainly a reliability release focused on preventing trained models from being lost at the end of training, with additional improvements to Platform docs, dataset/API documentation, test stability, and CI efficiency.

📊 Key Changes

  • 🚨 Major training fix: checkpoints are no longer discarded just because EMA hits NaN/Inf during save checks

    • The most important change in this release, from PR #24731 by @glenn-jocher, fixes a bug where good models could finish training successfully but still fail to save any checkpoint.
    • This especially affected some runs using AdamW + AMP, where validation could corrupt the live EMA weights and cause repeated warnings like “Skipping checkpoint save... EMA contains NaN/Inf”.
    • The fix now:
      • keeps validation from modifying the live EMA in place
      • checks finiteness on the original fp32 EMA, not an already-converted fp16 copy
      • safely clamps overflow during checkpoint serialization instead of skipping the save
  • ✅ Validation is now safer during AMP training

    • Validation still benefits from mixed precision speedups, but it no longer permanently “poisons” the EMA model.
    • This prevents a failure mode where one bad validation step could block checkpoint saving for the rest of training.
  • 🧪 New test coverage for fp16 overflow checkpoint handling

    • Added tests to ensure models with large-but-finite EMA weights are still saved correctly.
    • This helps protect against regressions in future releases.
  • 📘 Big Ultralytics Platform docs refresh

    • PR #24726 by @glenn-jocher significantly improved accuracy across Platform docs.
    • Updates include:
      • corrected UI labels and workflows
      • expanded Platform API reference
      • clearer dataset, annotation, deployment, training, billing, teams, and integrations docs
      • newly documented API capabilities like dataset embeddings, class management, GPU availability, import flows, and more
  • 🔗 Fixed broken COCO evaluation links

  • 🧪 Less flaky data-related tests

    • PR #24724 reduces unnecessary downloads in tests and reuses cached assets when possible.
    • This should make CI more dependable and faster.
  • ⚡ Lean CI improvements

    • PR #24725 reduces git clone size and speeds up docs publishing and some test workflows.
    • PR #24722 updates Codecov GitHub Actions from v6 to v7.
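
The three parts of the EMA checkpoint fix listed above can be illustrated with a small standard-library sketch. This is not the code from PR #24731: the function names (`cast_fp16`, `validate`, `serialize`) are hypothetical, plain dicts of Python floats stand in for the fp32 EMA weights, and the `struct` half-precision format `"e"` stands in for an fp16 cast.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def cast_fp16(value):
    """Emulate an fp16 cast: round to half precision, overflowing to +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:  # struct refuses values that do not fit in fp16
        return math.copysign(math.inf, value)


def is_finite(weights):
    """Return True if every value in the weights mapping is finite."""
    return all(math.isfinite(v) for vals in weights.values() for v in vals)


def validate(live_ema):
    """Hypothetical validation step: work on an fp16 copy, never on the live EMA."""
    half_copy = {k: [cast_fp16(v) for v in vals] for k, vals in live_ema.items()}
    return is_finite(half_copy)  # may be False; live_ema itself is unchanged


def serialize(live_ema):
    """Hypothetical save step: check the original weights, then clamp and cast a copy."""
    if not is_finite(live_ema):  # finiteness is checked before any fp16 conversion
        return None
    return {
        k: [cast_fp16(max(-FP16_MAX, min(FP16_MAX, v))) for v in vals]
        for k, vals in live_ema.items()
    }


# 70000.0 is finite in fp32 but larger than the fp16 maximum.
ema = {"layer.weight": [0.5, -1.25, 70000.0]}
half_ok = validate(ema)      # the fp16 copy overflows to inf
checkpoint = serialize(ema)  # still saved, with the large value clamped
```

In this sketch the fp16 copy used for validation overflows, yet `ema` keeps its original values and `serialize` still returns a checkpoint with the oversized weight clamped to 65504.0. Only an EMA that is already NaN/Inf in full precision is rejected.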

🎯 Purpose & Impact

  • 💾 Prevents losing trained models

    • The headline fix is very important for users training YOLO models locally or in automated pipelines.
    • If your run trained well but ended with “no checkpoint was saved,” this release directly addresses that issue.
  • 🔒 Improves training stability and trustworthiness

    • Users can have more confidence that successful training runs will actually produce saved checkpoints, especially when using AMP for faster training.
  • 🚀 Better experience for common training setups

    • This is especially impactful for users training with AdamW + AMP, where the bug had been widely reported.
    • In practical terms: fewer surprise failures, less wasted compute, and less need for workarounds like disabling AMP.
  • 📚 More accurate docs for the Ultralytics Platform

    • Platform users should now find the docs easier to follow and more aligned with what they actually see in the app.
    • This lowers confusion for both new and advanced users working with datasets, training, deployment, billing, and APIs.
  • 🧰 Improved developer and CI reliability

    • Faster, lighter CI and more stable tests help maintain release quality and reduce false failures behind the scenes.
  • 🌐 Cleaner external documentation links

    • Broken COCO benchmark links are fixed, making it easier for users to find the right evaluation submission path.

Overall, v8.4.62 is not a major model-feature release, but it is a high-value stability update 🛠️, especially for anyone training YOLO models with mixed precision and expecting reliable checkpoint saves.
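
To confirm that an environment already includes this release, a quick check along the following lines may help. It is a minimal sketch that assumes plain `X.Y.Z` version strings (no pre-release suffixes); `parse_version` and `has_checkpoint_fix` are hypothetical helper names, not part of the ultralytics package.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (8, 4, 62)  # the release described in these notes


def parse_version(text):
    """Parse a plain 'X.Y.Z' string into a tuple of ints (assumes no suffixes)."""
    return tuple(int(part) for part in text.split(".")[:3])


def has_checkpoint_fix(installed):
    """Return True if the given version string is at least 8.4.62."""
    return parse_version(installed) >= FIXED_IN


try:
    installed = version("ultralytics")  # reads the installed package metadata
    print(f"ultralytics {installed}: fix included = {has_checkpoint_fix(installed)}")
except PackageNotFoundError:
    print("ultralytics is not installed in this environment")
```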

What's Changed

Full Changelog: v8.4.61...v8.4.62
