What's Changed
- MP ZeRO++ by @HeyangQin in #3954
- do allgather only in shared optimizer states groups by @inkcherry in #4167
- Permit empty environment variables as unset in setup.py by @loadams in #4185
- enable autoTP for mpt in huggingface model hub without trust_remote_c… by @sywangyi in #4062
- Fix nv-nightly workflow by @mrwyattii in #4163
- Fix the path in tutorial by @kytimmylai in #4193
- Add unit test to check HF low_cpu_mem_usage_flag by @loadams in #4184
- Fix ZeRO parameter initialization for tensors with requires_grad=True by @XuehaiPan in #4138
- DeepSpeed Ulysses tutorial by @minjiaz in #4200
- Load z3 checkpoints for inference by @tjruwase in #4171
- DeepSpeed Ulysses release by @samadejacobs in #4198
- Deepspeed-Ulysses blog by @samadejacobs in #4201
- Ds ulysses news by @samadejacobs in #4202
- DS-Ulysses formatting by @samadejacobs in #4204
- Update Ulysses by @samadejacobs in #4205
- Update README.md by @samadejacobs in #4211
- Add Japanese blog of DS-Ulysses by @tohtana in #4209
- DeepSpeed Ulysses Chinese blog translation by @HeyangQin in #4210
- add ulysses blog index by @conglongli in #4215
- Add MuP optimizers by @mrwyattii in #2043
- Simplify Gradient Attribute Names by @jomayeri in #4214
- add meta onDevice support for LLAMA2 by @dc3671 in #4147
- Fixes timer error referenced in #4212 by @bjoernpl in #4213
- Fix pipeline dataloader when batch elements contain tuple by @ghosthamlet in #565
- feat(activation_checkpointing): add non_reentrant_checkpoint to support inputs require no grad by @hughpu in #4118
- add npu support dtypes by @CurryRice233 in #4223
- Fix fused qkv sizing for bloom by @molly-smith in #4161
- added port argument for ssh by @Hiromasa-H in #4117
- Empty tensor size check by @jomayeri in #4186
- fix: linker issues in conda environments #3929 by @maximegmd in #4235
- Enable AMD MI200 and H100 to run on branches for testing by @loadams in #4238
- fix MegatronLayerPolicy to be compatible with the newest ParallelTransformerLayer by @dc3671 in #4236
- Enable hpz when running with torch.no_grad by @HeyangQin in #4232
New Contributors
- @kytimmylai made their first contribution in #4193
- @bjoernpl made their first contribution in #4213
- @Hiromasa-H made their first contribution in #4117
- @maximegmd made their first contribution in #4235
Full Changelog: v0.10.1...v0.10.2