What's Changed
- Update version post-v0.19.0 release by @loadams in #7996
- Add office hours times/link on the README by @loadams in #8004
- Update topkgating probability-mask test expectation by @tohtana in #8007
- Optimize singleton MoE collectives by @Tianyi-Franklin-Wang in #7997
- zero3: SDMA allgather via mori (sdma_allgather) by @inkcherry in #7999
- fix(io): close aio_fd in FastFileWriter._fini to prevent fd leak by @jg-heo in #8005
- Auto-detect CUTLASS for EvoformerAttention by @MaxTretikov in #8000
- fix: use subprocess instead of os.system in data_analyzer.py by @orbisai0security in #7994
- Fix ZeRO-3 forward crash on modules with plain dict _parameters by @roycho96 in #8009
- Remove stale step() docstring from DeepSpeedCPUAdam by @lucaspirola in #8011
- Add configurable torch-latest dependency versions by @tohtana in #8016
- Run FastFileWriter fd-close test outside pytest-forked by @tohtana in #8015
- Make GitHub Actions job names unique by @tohtana in #8014
- Support bf16 optimizer states with CPU offload by @lucaspirola in #8010
- [fix] fix test_zf.py hang bug by @xbcReal in #8012
- [Blog] Muon Optimizer Support in DeepSpeed by @delock in #7962
- fix gemma4 num attention head bugs (from #7975) by @delock in #7990
- fix: add setup_context for torch.func compatibility by @roycho96 in #7916
- Sort and dedupe -gencode flags emitted by op_builder.builder by @adityasingh2400 in #8021
- fix(zero): enable vmap on LinearFunctionForZeroStage3 by @roycho96 in #8023
- Support flash-attn 2.7.0 in FPDT attention by @xbcReal in #8022
- Fix DeepCompile AOT kwargs patching for PyTorch >= v2.11 by @tohtana in #8024
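
Note on #7994: the general motivation for replacing `os.system` with `subprocess` is that an argument list is passed to the program directly instead of being parsed by a shell. The snippet below is a minimal sketch of that pattern only; it is not the code from `data_analyzer.py`, and the helper name `run_without_shell` and the sample filename are hypothetical.

```python
import subprocess
import sys


def run_without_shell(args):
    """Run a command given as an argument list (no shell is involved).

    Hypothetical helper for illustration; not part of DeepSpeed.
    Returns the exit code and captured standard output.
    """
    completed = subprocess.run(args, capture_output=True, text=True, check=False)
    return completed.returncode, completed.stdout


# A value containing shell metacharacters. With os.system and string
# formatting, the part after ';' would be run as a second command.
hostile_name = "data; echo injected"

# With an argument list, the value reaches the child process as one
# literal argument, which the child simply echoes back.
code, out = run_without_shell(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_name]
)
```

Here the child process receives the whole string as a single argument, so nothing after the `;` is executed as a separate command.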
New Contributors
- @Tianyi-Franklin-Wang made their first contribution in #7997
- @jg-heo made their first contribution in #8005
- @MaxTretikov made their first contribution in #8000
- @orbisai0security made their first contribution in #7994
- @lucaspirola made their first contribution in #8011
- @xbcReal made their first contribution in #8012
- @adityasingh2400 made their first contribution in #8021
Full Changelog: v0.19.0...v0.19.1