What's Changed
- Update version.txt after 0.18.4 release by @loadams in #7765
- Various fixes to run on mps by @jeffra in #7767
- Update workflow trigger by @tohtana in #7768
- fix: delete using namespace std. by @nathon-lee in #7766
- fix: update Megatron-DeepSpeed tutorial to match current repo structure by @nathon-lee in #7761
- Add timeout to test workflows by @tohtana in #7774
- Remove cron/PR triggers for outdated V100 tests by @loadams in #7777
- [Docs] Fix docs/_pages/config-json.md format by @ooooo-create in #7779
- Update CLA to refer to DCO by @loadams in #7778
- Fix multiprocessing testcase by @k-artem in #7743
- fix: skip compressed allreduce for empty tensors by @T1mn in #7769
- docs: update README.md by @eltociear in #7781
- Fix gradient checkpointing with use_reentrant=True / PyTorch-style backward / ZeRO-3 by @tohtana in #7780
- Fix Ulysses PEFT test by @tohtana in #7784
- Fix Evoformer compilation by @sdvillal in #7760
- fix checkpointing/loading of z0+bf16 by @tohtana in #7786
- Add sequential allgather optimization for ZeRO-3 by @aeeeeeep in #7661
- Fix AutoTP test numerical tolerance with rtol by @tohtana in #7794
- Fix backward for pipeline engine by @tohtana in #7787
- Skip empty parameters in gradient reduction by @tohtana in #7789
- Fix issue with BF16 optimizer selection by @tohtana in #7788
- Fix BF16_Optimizer being used without ZeRO by @tohtana in #7790
- Add full test suite workflow by @tohtana in #7795
- Fix Muon optimizer module path by @tohtana in #7802
- Fix ping-pong buffer index reset and remove redundant stream sync by @undersilence in #7805
- Fix ZeRO stage to choose BF16 optimizer in test by @tohtana in #7803
- Run Evoformer tests sequentially by @tohtana in #7810
- Improve engine's cleanup by @tohtana in #7813
- Ignore evoformer test by @tohtana in #7815
- Fix typos in accelerator setup guide by @nathon-lee in #7818
- Raise clear error on in-place GatheredParameters edits without modifier_rank by @tohtana in #7817
- [Bugfix] Resolve Rank index out of range during BWD when sp_size < world_size in Ulysses by @Flink-ddd in #7809
- Update PyTorch to v2.9 for modal tests by @tohtana in #7816
New Contributors
- @ooooo-create made their first contribution in #7779
- @T1mn made their first contribution in #7769
- @sdvillal made their first contribution in #7760
- @undersilence made their first contribution in #7805
Full Changelog: v0.18.4...v0.18.5