github deepspeedai/DeepSpeed v0.18.9
v0.18.9 Patch Release

10 hours ago

What's Changed

  • Respect $TRITON_HOME by @Flamefire in #7907
  • Add Feature Universal Checkpoint for AutoTP by @nathon-lee in #7908
  • fix: remove unnecessary shell=True in ROCm GPU architecture detection by @instantraaamen in #7915
  • Don't detect local GPU if $DS_IGNORE_CUDA_DETECTION is set by @Flamefire in #7896
  • Add HuggingFace tp_plan support for AutoTP by @delock in #7901
  • fix: handle non-existent path in is_nfs_path for Triton autotune cache by @Krishnachaitanyakc in #7921
  • Fix backward compatibility of torch.amp.custom_fwd for PyTorch < 2.4 by @tohtana in #7920
  • Extending Muon Optimizer Support for ZeRO Stage 3 by @PKUWZP in #7919
  • Add news item for ASPLOS 2026 Best Paper Award by @PKUWZP in #7923
  • fix(superoffload) preserve multi-group updates with shared cpu buffers (#7905) by @xylian86 in #7906
  • AGENTS.md: Add pre-commit command to existing CI requirements line by @delock in #7930
  • Update README with latest news from DeepSpeed by @PKUWZP in #7931
  • Merging AutoSP into DeepSpeed by @neeldani in #7860
  • Add fallback to full test by @tohtana in #7933
  • Remove Microsoft Corporation copyright from AGENTS.md and CLAUDE.md by @PKUWZP in #7932
  • Update version.txt for latest incoming release 0.18.9 by @loadams in #7935
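
The `shell=True` removal in #7915 reflects a general Python practice rather than anything DeepSpeed-specific. The sketch below is not the DeepSpeed code. It shows the pattern under stated assumptions: `run_without_shell` is a hypothetical helper, and the probe command is a stand-in Python one-liner, not a real ROCm tool. Passing the command as an argument list with the default `shell=False` means no shell interprets the arguments, so metacharacters such as `;` reach the program as literal text.

```python
import subprocess
import sys


def run_without_shell(args):
    """Run a command given as an argument list and return its stripped stdout.

    With a list and the default shell=False, no shell parses the arguments,
    so characters like ';' or '$' are passed through to the program verbatim.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Hypothetical stand-in for an architecture query tool: a Python one-liner
# that echoes its first argument back on stdout.
probe = [
    sys.executable,
    "-c",
    "import sys; print(sys.argv[1])",
    "gfx90a; echo injected",  # would run a second command under shell=True
]

arch = run_without_shell(probe)
print(arch)
```

Because the argument is never handed to a shell, the `; echo injected` suffix is treated as data instead of a second command.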

New Contributors

Full Changelog: v0.18.8...v0.18.9
