v0.7.10: Minor fixes, Automatic templating, `setup_chat_format` API, stronger tests

This Patch release adds a new feature in TRL for dealing with chat datasets - you can load a directly formatted dataset without the need of formatting it beforehand.

The release also introduces a new API setup_chat_format to correctly resize the model embeddings with the target size when adding new tokens to comply with the chat format. Currently we only support chatml format and we can add more formats in the future

We also extensively test SFTTrainer and DPOTrainer and the example scripts, dpo.py and sft.py should be well -battletested. If you see any issue with the script, please let us know on GitHub.

What's Changed

set dev version by @younesbelkada in #1207
Check tokenize params on DPOTrainer by @pablovicente in #1197
Fix shape descriptions in calculate_loss method by @yuta0x89 in #1204
Fix FSDP error by @mgerstgrasser in #1196
Update Unsloth SFT, DPO docs by @danielhanchen in #1213
Fix args type by @zspo in #1214
[core / Docker] Add workflow to build TRL docker images by @younesbelkada in #1215
Refactor RewardConfig to own module by @lewtun in #1221
Add support for ChatML dataset format in by @philschmid in #1208
Add slow test workflow file by @younesbelkada in #1223
Remove a repeating line in how_to_train.md by @kykim0 in #1226
Logs metrics on all distributed processes when using DPO & FSDP by @AjayP13 in #1160
fix: improve error message when pad_token_id is not configured by @yumemio in #1152
[core / tests ] v1 slow tests by @younesbelkada in #1218
[core / SFTTrainer] Fix breaking change by @younesbelkada in #1229
Fixes slow tests by @younesbelkada in #1241
Fix weird doc bug by @younesbelkada in #1244
Add setup_chat_format for adding new special tokens to model for training chat models by @philschmid in #1242
Fix chatml template by @philschmid in #1248
fix: fix loss_type and some args desc by @zspo in #1247
Release: v0.7.10 by @younesbelkada in #1253

New Contributors

@yuta0x89 made their first contribution in #1204
@danielhanchen made their first contribution in #1213
@zspo made their first contribution in #1214
@philschmid made their first contribution in #1208
@kykim0 made their first contribution in #1226
@AjayP13 made their first contribution in #1160
@yumemio made their first contribution in #1152

Full Changelog: v0.7.9...v0.7.10

huggingface/trl v0.7.10 v0.7.10: Automatic templating, `setup_chat_format` API, stronger tests on GitHub

v0.7.10: Minor fixes, Automatic templating, setup_chat_format API, stronger tests

What's Changed

New Contributors

huggingface/trl v0.7.10
v0.7.10: Automatic templating, `setup_chat_format` API, stronger tests

on GitHub

v0.7.10: Minor fixes, Automatic templating, `setup_chat_format` API, stronger tests