New Bug Fixes
- Stable Diffusion now supported with latest Torch, diffusers, and Triton versions.
What's Changed
- Update version.txt after 0.12.2 release by @mrwyattii in #4617
- Fix figure in FlexGen blog by @tohtana in #4624
- Fix figure of llama2 13B in DS-FlexGen blog by @tohtana in #4625
- Fix config format by @xu-song in #4594
- Guanhua/partial offload rebase v2 (#590) by @GuanhuaWang in #4636
- offload++ blog (#623) by @GuanhuaWang in #4637
- Update README in offloadpp blog by @GuanhuaWang in #4641
- [docs] update news items by @jeffra in #4640
- DeepSpeed-FastGen Chinese Blog by @HeyangQin in #4642
- Fix issues with torch cpu builds by @loadams in #4639
- Isolate src code and testing for DeepSpeed-FastGen by @cmikeh2 in #4610
- Add Japanese blog for DeepSpeed-FastGen by @tohtana in #4651
- Fix for MII unit tests by @mrwyattii in #4652
- Enhance the robustness of module_state_dict by @LZHgrla in #4587
- Enable ZeRO3 allgather for multiple dtypes by @tohtana in #4647
- add option to disable pipeline partitioning by @nelyahu in #4322
- Added HIP_PLATFORM_AMD=1 for non JIT build by @rraminen in #4585
- Fix rope_theta arg for diffusers_attention by @lekurile in #4656
- tl.dot(a,b, trans_b=True) is not supported by triton2.0+ , updating this api by @bmedishe in #4541
- Update ds-chat workflow to work w/ deepspeed-chat install by @lekurile in #4598
- Diffusers attention script update triton2.1 by @bmedishe in #4573
- Fix the openfold training. by @cctry in #4657
- Universal ckp fixes by @mosheisland in #4588
- Update .gitignore [Adding comments , Improved documentation] by @Nadav23AnT in #4631
- Update lr_schedules.py by @CoinCheung in #4563
- Fix UNET and VAE implementations for new diffusers version by @lekurile in #4663
- fix num_kv_heads sharding in autoTP for the new in-repo Falcon-40B by @dc3671 in #4654
New Contributors
- @xu-song made their first contribution in #4594
- @LZHgrla made their first contribution in #4587
- @mosheisland made their first contribution in #4588
- @Nadav23AnT made their first contribution in #4631
- @CoinCheung made their first contribution in #4563
Full Changelog: v0.12.2...v0.12.3