Patch release 1 - SFTTrainer enhancements and fixes
This patch release adds multiple fixes and enhancements for the SFTTrainer. Another patch release is coming to fix an issue with PPOTrainer and Google Colab combined with wandb logging.
What's Changed
- Add slurm utility by @vwxyzjn in #412
- Enable autotag feature w/ wandb by @vwxyzjn in #411
- [doc build] Use secrets by @mishig25 in #420
- Update test_reward_trainer.py by @younesbelkada in #421
- best-of-n sampler class by @metric-space in #375
- handle the offline case by @younesbelkada in #431
- Fix correct gradient accumulation by @younesbelkada in #407
- Drop support for Python 3.7 by @younesbelkada in #441
- [SFTTrainer] Relax dataset constraints by @younesbelkada in #442
- [SFTTrainer] Fix non packed dataset by @younesbelkada in #444
- [core] Add stale bot by @younesbelkada in #447
- [SFTTrainer] Introducing DataCollatorForCompletionOnlyLM by @younesbelkada in #445
- [ConstantLengthDataset] Fix packed dataset issue by @younesbelkada in #452
- Update accelerate arg passthrough for tensorboard logging to reflect logging_dir deprecation. by @jganitkevitch in #437
- Multi adapter RL (MARL) - a single model for RM & Value Head by @younesbelkada in #373
New Contributors
- @jganitkevitch made their first contribution in #437
Full Changelog: v0.4.4...v0.4.5