microsoft/DeepSpeed v0.10.3: Patch release


What's Changed

  • Add Mixed Precision ZeRO++ tutorial by @HeyangQin in #4241
  • DeepSpeed-Chat Llama2/stability release by @awan-10 in #4240
  • Update README.md by @awan-10 in #4244
  • Pin Triton version to >=2.0.0 and <2.1.0 by @lekurile in #4251
  • Allow modification of zero partitioned parameters by @tjruwase in #4192
  • Checks for user injection policy by @satpalsr in #3052
  • Add check that opening issues on CI failure requires schedule by @loadams in #4242
  • Code Refactoring by @tosemml in #4262
  • tolerating missing optimizer states for MoE [2nd attempt] by @clumsy in #4120
  • Fix nv-inference/un-pin transformers by @loadams in #4269
  • check for zero (empty) param groups in llama + hf/accelerate. by @awan-10 in #4270
  • Use non_reentrant_checkpoint to fix "requires_grad of input must be true for activation checkpoint layer" in pipeline training by @inkcherry in #4224
  • The PostBackwardFunction class should be more clearly named to distinguish it from the PreBackwardFunction class. by @Crispig in #2548
  • fix iteration timing used in autotuning when gradient_accumulation_steps > 1 by @cli99 in #2888
  • Update README.md by @NinoRisteski in #4284
  • update deepspeed to run with the most recent triton 2.1.0 by @stephen-youn in #4278
  • Keep hpz secondary tensor in forward pass by @HeyangQin in #4288
  • Support iterators with incompletely defined len functions by @codedecde in #2445
  • AMD Kernel Compatibility Fixes by @cmikeh2 in #3180
  • ZeRO-Inference refresh by @tjruwase in #4197
  • fix user args parsing of string with spaces on runner by @YudiZh in #4265
  • Update index.md by @NinoRisteski in #4297
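The runner-args fix in #4265 concerns parsing user arguments that contain spaces. As a minimal illustration of the underlying problem (not DeepSpeed's actual implementation), naive whitespace splitting breaks a quoted argument, while shell-style tokenization preserves it:

```python
import shlex

# A launcher command line where one user argument contains a space.
cmd = 'train.py --prompt "hello world" --steps 10'

# Naive whitespace splitting breaks the quoted argument in two:
naive = cmd.split()
print(naive)   # ['train.py', '--prompt', '"hello', 'world"', '--steps', '10']

# shlex honors shell quoting, keeping "hello world" as one argument:
args = shlex.split(cmd)
print(args)    # ['train.py', '--prompt', 'hello world', '--steps', '10']

# Re-quoting the tokens round-trips safely for a launcher to re-execute:
requoted = " ".join(shlex.quote(a) for a in args)
print(requoted)
```

A runner that forwards user arguments to worker processes must tokenize and re-quote this way, or arguments with spaces arrive mangled.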

Full Changelog: v0.10.2...v0.10.3
