This patch release includes important fixes for the CLI and for the KTO and CPO trainers.
What's Changed
- set dev version by @younesbelkada in #1529
- [CPO] fix memory leak due to retained value by @kashif in #1531
- VSFT hotfix - adds gen prompt to template and processor to hub by @edbeeching in #1532
- save_model -> save_pretrained in ppo_trainer.mdx by @ejmejm in #1537
- [KTO] support to load the adapter twice by @claralp in #1542
- CLI: Set `dataset_text_field` to `None` to allow ChatML automatic template by @younesbelkada in #1545
- FIX: Fix slow test by @younesbelkada in #1546
- Fixed ref model not used in PPO generation by @ejmejm in #1534
- Release: v0.8.4 by @younesbelkada in #1547
New Contributors
Full Changelog: v0.8.3...v0.8.4