trl 0.8.4 on PyPI

This patch release includes important fixes for the CLI and for the KTO and CPO trainers.

What's Changed

set dev version by @younesbelkada in #1529
[CPO] fix memory leak due to retained value by @kashif in #1531
VSFT hotfix - adds gen prompt to template and processor to hub by @edbeeching in #1532
save_model -> save_pretrained in ppo_trainer.mdx by @ejmejm in #1537
[KTO] support to load the adapter twice by @claralp in #1542
CLI: Set dataset_text_field to None to allow ChatML automatic template by @younesbelkada in #1545
FIX: Fix slow test by @younesbelkada in #1546
Fixed ref model not used in PPO generation by @ejmejm in #1534
Release: v0.8.4 by @younesbelkada in #1547

Full Changelog: v0.8.3...v0.8.4