What's Changed
- Add Mixed Precision ZeRO++ tutorial by @HeyangQin in #4241
- DeepSpeed-Chat Llama2/stability release by @awan-10 in #4240
- Update README.md by @awan-10 in #4244
- Pin Triton version to >=2.0.0 and <2.1.0 by @lekurile in #4251
- Allow modification of zero partitioned parameters by @tjruwase in #4192
- Checks for user injection policy by @satpalsr in #3052
- Add check that opening issues on CI failure requires schedule by @loadams in #4242
- Code Refactoring by @tosemml in #4262
- Tolerate missing optimizer states for MoE [2nd attempt] by @clumsy in #4120
- Fix nv-inference/un-pin transformers by @loadams in #4269
- Check for zero (empty) param groups in Llama + HF/Accelerate by @awan-10 in #4270
- Use non_reentrant_checkpoint to fix "requires_grad of input must be true for activation checkpoint layer" in pipeline training by @inkcherry in #4224
- Rename the PostBackwardFunction class to more clearly distinguish it from the PreBackwardFunction class by @Crispig in #2548
- Fix iteration timing used in autotuning when gradient_accumulation_steps > 1 by @cli99 in #2888
- Update README.md by @NinoRisteski in #4284
- Update DeepSpeed to run with the most recent Triton 2.1.0 by @stephen-youn in #4278
- Keep hpz secondary tensor in forward pass by @HeyangQin in #4288
- Support iterators with incompletely defined len functions by @codedecde in #2445
- AMD Kernel Compatibility Fixes by @cmikeh2 in #3180
- ZeRO-Inference refresh by @tjruwase in #4197
- Fix user args parsing of strings with spaces on runner by @YudiZh in #4265
- Update index.md by @NinoRisteski in #4297
New Contributors
- @tosemml made their first contribution in #4262
- @Crispig made their first contribution in #2548
- @NinoRisteski made their first contribution in #4284
- @codedecde made their first contribution in #2445
- @YudiZh made their first contribution in #4265
Full Changelog: v0.10.2...v0.10.3