What's Changed
- Update DS-Chat docs for v0.9.0 by @mrwyattii in #3216
- Update DeepSpeed-Chat docs with latest changes to scripts by @mrwyattii in #3219
- Nested zero.Init() and dynamically defined model class by @tohtana in #2989
- Update torch version check in building sparse_attn by @loadams in #3152
- Fix for Stable Diffusion by @mrwyattii in #3218
- [update] reference in cifar-10 by @dtunai in #3212
- [fp16/doc] correct initial_scale_power default value by @stas00 in #3275
- update link to PL docs by @Borda in #3237
- fix typo in autotuner.py by @eltociear in #3269
- improving int4 asymmetric quantization accuracy by @HeyangQin in #3190
- Update install.sh by @digger-yu in #3270
- Fix cupy install version detection by @mrwyattii in #3276
- [ROCm] temporary workaround till __double2half support enabled in HIP by @bmedishe in #3236
- Fix pydantic and autodoc_pydantic version to <2.0.0 until support is added. by @loadams in #3290
- Add contribution images to readme by @digger-yu in #3282
- remove torch.cuda.is_available() check when compiling ops by @jinzhen-lin in #3085
- Update MI200 workflow to install apex with changes from pip by @loadams in #3294
- Add pre-compiling ops test by @loadams in #3277
- Update README.md by @digger-yu in #3315
- Update Dockerfile to use python 3.6 specifically by @bobowwb in #3298
- zero3 checkpoint frozen params by @tjruwase in #3205
- Fix for dist not being initialized when constructing main config by @mrwyattii in #3324
- Fix missing scale attributes for GPTJ by @cmikeh2 in #3256
- Explicitly check for OPT activation function by @cmikeh2 in #3278
New Contributors
- @dtunai made their first contribution in #3212
- @Borda made their first contribution in #3237
- @digger-yu made their first contribution in #3270
- @bmedishe made their first contribution in #3236
- @jinzhen-lin made their first contribution in #3085
- @bobowwb made their first contribution in #3298
Full Changelog: v0.9.0...v0.9.1