What's Changed
- Add Mixed Precision ZeRO++ tutorial by @HeyangQin in #4241
- DeepSpeed-Chat Llama2/stability release by @awan-10 in #4240
- Update README.md by @awan-10 in #4244
- Pin Triton version to >=2.0.0 and <2.1.0 by @lekurile in #4251
- Allow modification of zero partitioned parameters by @tjruwase in #4192
- Checks for user injection policy by @satpalsr in #3052
- Add check that opening issues on CI failure requires schedule by @loadams in #4242
- Code Refactoring by @tosemml in #4262
- Tolerate missing optimizer states for MoE [2nd attempt] by @clumsy in #4120
- Fix nv-inference/un-pin transformers by @loadams in #4269
- Check for zero (empty) param groups in Llama + HF/Accelerate by @awan-10 in #4270
- Use non_reentrant_checkpoint to fix "requires_grad of input must be true for activation checkpoint layer" in pipeline training by @inkcherry in #4224
- Rename the PostBackwardFunction class to more clearly distinguish it from the PreBackwardFunction class by @Crispig in #2548
- Fix iteration timing used in autotuning when gradient_accumulation_steps > 1 by @cli99 in #2888
- Update README.md by @NinoRisteski in #4284
- Update DeepSpeed to run with the most recent Triton 2.1.0 by @stephen-youn in #4278
- Keep hpz secondary tensor in forward pass by @HeyangQin in #4288
- Support iterators with incompletely defined len functions by @codedecde in #2445
- AMD Kernel Compatibility Fixes by @cmikeh2 in #3180
- ZeRO-Inference refresh by @tjruwase in #4197
- Fix user args parsing of strings with spaces on runner by @YudiZh in #4265
- Update index.md by @NinoRisteski in #4297
New Contributors
- @tosemml made their first contribution in #4262
- @Crispig made their first contribution in #2548
- @NinoRisteski made their first contribution in #4284
- @codedecde made their first contribution in #2445
- @YudiZh made their first contribution in #4265
Full Changelog: v0.10.2...v0.10.3