What's Changed
- GatheredParameters - accept a tuple of params by @stas00 in #1941
- Update partition_parameters.py by @manuelciosici in #1943
- fix step in adam by @szhengac in #1823
- [pipe] prevent deadlock with multiple evals sequence by @stas00 in #1944
- Fairseq support by @jeffra in #1915
- DeepSpeed needs to start cleaning up by @tjruwase in #1947
- trivial fix by @kisseternity in #1954
- Enabling CUDA-graph for the bert-type models by @RezaYazdaniAminabadi in #1952
- Add loss scale guard to avoid inf loop by @Quentin-Anthony in #1958
- [launcher] add option to bypass ssh check by @liamcli in #1957
- Bump nokogiri from 1.13.4 to 1.13.6 in /docs by @dependabot in #1965
- Fix typo in timer.py by @Quentin-Anthony in #1964
- [docs] fix dependabot version issue by @jeffra in #1966
- Don't add curand on rocm by @jeffra in #1968
- Add Unidirectional Sparse Attention Type to BigBird and BSLongformer by @Quentin-Anthony in #1959
- Fix: Sparse tensors not updating by @Dipet in #1914
- Fixing several bugs in the inference-api and the kernels by @RezaYazdaniAminabadi in #1951

New Contributors
- @Quentin-Anthony made their first contribution in #1958

Full Changelog: v0.6.4...v0.6.5