v0.7.9: Patch release for DPO & SFTTrainer
This is a patch release that fixes critical issues with SFTTrainer and DPOTrainer, together with minor fixes for PPOTrainer and DataCollatorForCompletionOnlyLM.
What's Changed
- Release: v0.7.8 by @younesbelkada in #1200
- set dev version by @younesbelkada in #1201
- Fix instruction token masking by @mgerstgrasser in #1185
- Fix reported KL in PPO trainer by @mgerstgrasser in #1180
- [DPOTrainer] Fix peft + DPO + bf16 if one uses generate_during_eval or pre-computed logits by @younesbelkada in #1203
- Revert "Address issue #1122" by @younesbelkada in #1205
- Release: v0.7.9 by @younesbelkada in #1206
Full Changelog: v0.7.8...v0.7.9