What's Changed
- fix(fp16): filter requires_grad in FP16 optimizer flat buffer init by @avicooper1 in #8029
- Run AutoSP compile tests sequentially by @tohtana in #8020
- Fix PR-target workflow concurrency groups by @tohtana in #8017
- Fix full CI test isolation for ZeRO chmod and NVMe quantization tests by @tohtana in #8008
- Keep required CI checks visible for ignored paths by @tohtana in #8019
- Bump version by @sfc-gh-truwase in #8030
- Add engine.coalesce_grad_reduction() for ZeRO 1/2/3 multi-backward by @roycho96 in #7992
- feat(zero): enable torch.func transforms on engine for ZeRO 0/1/2 by @roycho96 in #8026
- Simplify module_inject.transpose by @xbcReal in #8028
- Fix DeepCompile all-gather scheduler candidate selection by @tohtana in #8033
- Version fix to unblock PyPI by @sfc-gh-truwase in #8039
- Bump version after 0.19.1 release by @tohtana in #8040
- Fix DeepCompile ZeRO-3 release parameter lifetime by @tohtana in #8032
- Fix ZenFlow ZeRO-3 selective optimizer crash with parameter offload on NVMe by @Antlera in #8042
- Add test coverage for Muon muon_lr/adam_lr overrides by @sowndappan5 in #8047
- Avoid HF Hub access in CPU unit test setup by @tohtana in #8053
- Fix DeepCompile ZeRO-1 grad target lifetime by @tohtana in #8036
- Normalize ZeRO-3 DeepCompile grad dtype before reduction by @tohtana in #8038
- Remove AutoSP assertion against Transformers version by @tohtana in #8044
- fix(transformer): use correct stride in Transpose_Kernel shared memory indexing to eliminate bank conflicts by @flutist in #8055
- zero3: invalidate coordinator trace on hook re-registration by @roycho96 in #8043
- Consistent fp32 grads flow by @sfc-gh-truwase in #8056
- Add AutoEP by @tohtana in #7938
- Fix: ZenFlow Adam integration for updated PyTorch backward flow (#7759) by @Antlera in #7771
- Pass expected grad dtype to register_z3_param in ZeRO-3 release test by @tohtana in #8063
- Add Biren SUPA accelerator support by @frozenleaves in #8054
- Mixed-precision: per-policy param/buffer dtype cast (preserve fp32 buffers) by @sfc-gh-truwase in #8066
New Contributors
- @avicooper1 made their first contribution in #8029
- @sowndappan5 made their first contribution in #8047
- @flutist made their first contribution in #8055
- @frozenleaves made their first contribution in #8054
Full Changelog: v0.19.1...v0.19.2