What's Changed
- Update version.txt after v0.17.4 release by @loadams in #7460
- Update README.md by @PKUWZP in #7465
- Add getter APIs for TP/PP/DP ranks in DeepSpeedEngine by @WoosungMyung in #7427
- fix issues raised by Coverity scans by @NirSonnenschein in #7431
- Fix all-gather duplicate params and wrong dtype by @eternalNight in #7462
- fix #7188 by @lpnpcs in #7371
- add --bind_cores_to_rank to zero offload tutorial by @delock in #7474
- Add blog for ZenFlow by @Antlera in #7463
- Fix cpu CI by @sfc-gh-truwase in #7481
- fix deepspeed --venv_script by @stas00 in #7469
- Modal CI by @sfc-gh-truwase in #7289
- [UlyssesSPDataLoaderAdapter] fix iterator reset by @stas00 in #7472
- [TiledFusedLogitsLoss] support inference by @stas00 in #7477
- Fix pre-compile on cpu-only machines by @AlongWY in #7168
- Enable forked PRs by @sfc-gh-truwase in #7486
- fix xpu device_id AttributeError issue by @yao-matrix in #7488
- Add Zenflow code for Stage 1 & 2 by @Antlera in #7391
- Fix invalid f-strings by @cyyever in #7457
- Fix DeepCompile for PyTorch v2.8 by @tohtana in #7496
- Reduce performance impact of compiler.enable decorator by @deepcharm in #7498
- Add index to HPU devices by @deepcharm in #7497
New Contributors
- @WoosungMyung made their first contribution in #7427
- @eternalNight made their first contribution in #7462
- @lpnpcs made their first contribution in #7371
- @Antlera made their first contribution in #7463
- @AlongWY made their first contribution in #7168
- @yao-matrix made their first contribution in #7488
- @cyyever made their first contribution in #7457
Full Changelog: v0.17.4...v0.17.5