Patch release 1 - `SFTTrainer` enhancements and fixes

This patch release adds multiple fixes and enhancements for the `SFTTrainer`. Another patch release is coming to fix an issue with `PPOTrainer` and Google Colab combined with wandb logging.

What's Changed

Add slurm utility by @vwxyzjn in #412
Enable autotag feature w/ wandb by @vwxyzjn in #411
[doc build] Use secrets by @mishig25 in #420
Update test_reward_trainer.py by @younesbelkada in #421
best-of-n sampler class by @metric-space in #375
handle the offline case by @younesbelkada in #431
Fix correct gradient accumulation by @younesbelkada in #407
Drop support for Python 3.7 by @younesbelkada in #441
[SFTTrainer] Relax dataset constraints by @younesbelkada in #442
[SFTTrainer] Fix non packed dataset by @younesbelkada in #444
[core] Add stale bot by @younesbelkada in #447
[SFTTrainer] Introducing DataCollatorForCompletionOnlyLM by @younesbelkada in #445
[ConstantLengthDataset] Fix packed dataset issue by @younesbelkada in #452
Update accelerate arg passthrough for tensorboard logging to reflect logging_dir deprecation. by @jganitkevitch in #437
Multi adapter RL (MARL) - a single model for RM & Value Head by @younesbelkada in #373

New Contributors

@jganitkevitch made their first contribution in #437

Full Changelog: v0.4.4...v0.4.5

huggingface/trl v0.4.5 on GitHub
