github huggingface/trl v0.8.0
v0.8.0: KTOTrainer, TRL CLIs, QLoRA + FSDP !

7 months ago

New Trainer: KTOTrainer:

This release introduces the KTOTrainer, which runs the KTO (Kahneman-Tversky Optimization) alignment algorithm on LLMs.
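KTO learns from unpaired examples, each marked as desirable or undesirable, rather than from chosen/rejected pairs. The sketch below shows one way to reshape paired preference records into that unpaired form. It assumes the dataset columns are named prompt, completion and label (a boolean), which is the layout the TRL documentation describes for KTOTrainer; the helper paired_to_kto and the sample record are hypothetical and not part of TRL.

```python
from typing import Dict, List


def paired_to_kto(records: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Split each (prompt, chosen, rejected) record into two unpaired rows.

    Hypothetical helper: the chosen answer becomes a desirable example
    (label=True) and the rejected answer an undesirable one (label=False).
    """
    rows: List[Dict[str, object]] = []
    for rec in records:
        rows.append({"prompt": rec["prompt"], "completion": rec["chosen"], "label": True})
        rows.append({"prompt": rec["prompt"], "completion": rec["rejected"], "label": False})
    return rows


# Illustrative data only, not taken from any real dataset.
paired = [{"prompt": "What is 2 + 2?", "chosen": "4", "rejected": "5"}]
kto_rows = paired_to_kto(paired)
```

The resulting list of rows can then be wrapped in a dataset object and handed to the trainer; check the KTOTrainer documentation for the exact arguments it expects in your TRL version.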

TRL Command Line Interfaces (CLIs):

Run SFT, DPO and chat with your aligned model directly from the terminal:

SFT:

trl sft --model_name_or_path facebook/opt-125m --dataset_name imdb --output_dir opt-sft-imdb

DPO:

trl dpo --model_name_or_path facebook/opt-125m --dataset_name trl-internal-testing/Anthropic-hh-rlhf-processed --output_dir opt-sft-hh-rlhf 

Chat:

trl chat --model_name_or_path Qwen/Qwen1.5-0.5B-Chat

Read more about the CLIs in the relevant documentation section, or pass --help to any command for details.

FSDP + QLoRA:

SFTTrainer now supports FSDP + QLoRA

  • Add support for FSDP+QLoRA and DeepSpeed ZeRO3+QLoRA by @pacman100 in #1416
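As a rough sketch of the quantization side of such a setup, the keyword arguments below are ones that could be passed to BitsAndBytesConfig from transformers. The assumption here, which these notes do not state, is that under FSDP the 4-bit weights should be stored in the same dtype as the rest of the model, which is what bnb_4bit_quant_storage controls (available in recent transformers versions). The names QLORA_KWARGS and build_bnb_config are hypothetical, and the values are illustrative rather than recommended settings.

```python
# Hypothetical keyword arguments for transformers.BitsAndBytesConfig.
# Dtypes are given as strings so this fragment does not need torch imported.
QLORA_KWARGS = {
    "load_in_4bit": True,                  # quantize base weights to 4 bits
    "bnb_4bit_quant_type": "nf4",          # NormalFloat4 quantization
    "bnb_4bit_compute_dtype": "bfloat16",  # dtype used for matmuls
    "bnb_4bit_use_double_quant": True,     # also quantize the quantization constants
    "bnb_4bit_quant_storage": "bfloat16",  # assumed to match the model dtype under FSDP
}


def build_bnb_config(kwargs=QLORA_KWARGS):
    """Build the config lazily so importing this file does not require transformers."""
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**kwargs)
```

The resulting object would typically be passed as quantization_config when loading the base model, before the model is handed to SFTTrainer together with a LoRA configuration.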

Other fixes

New Contributors

Full Changelog: v0.7.11...v0.8.0
