github huggingface/trl v0.6.0


DDPO for diffusion models

We are excited to welcome DDPO, the first algorithm in the library that combines RLHF with diffusion models, to refine the generations from diffusion models.
Read more about it directly in the docs.
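To give a rough feel for the kind of objective DDPO optimizes, here is a minimal, library-free sketch. It assumes the usual formulation in which each denoising step is treated as an action, rewards are normalized into advantages, and a PPO-style clipped importance ratio is applied per step. The function names (normalize_rewards, clipped_step_loss, ddpo_style_loss) and the default clip_range are hypothetical and chosen for illustration; this is not TRL's implementation, so refer to the docs for the actual trainer.

```python
import math


def normalize_rewards(rewards, eps=1e-8):
    """Turn raw per-sample rewards into zero-mean, unit-variance advantages."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def clipped_step_loss(new_logp, old_logp, advantage, clip_range=0.2):
    """PPO-style clipped loss for a single denoising step (hypothetical helper)."""
    # Importance ratio between the current policy and the policy that sampled.
    ratio = math.exp(new_logp - old_logp)
    unclipped = -advantage * ratio
    clipped = -advantage * min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
    # Taking the larger loss is the pessimistic (clipped) surrogate.
    return max(unclipped, clipped)


def ddpo_style_loss(new_logps, old_logps, rewards, clip_range=0.2):
    """Average the clipped loss over every denoising step of every sample.

    new_logps / old_logps: one list of per-step log-probabilities per sample.
    rewards: one scalar reward per sample (e.g. from an image scorer).
    """
    advantages = normalize_rewards(rewards)
    total, count = 0.0, 0
    for traj_new, traj_old, adv in zip(new_logps, old_logps, advantages):
        for new_logp, old_logp in zip(traj_new, traj_old):
            total += clipped_step_loss(new_logp, old_logp, adv, clip_range)
            count += 1
    return total / count


if __name__ == "__main__":
    # Two samples with two denoising steps each; values are made up.
    old = [[-1.0, -1.2], [-0.9, -1.1]]
    new = [[-0.9, -1.2], [-1.0, -1.0]]
    print(ddpo_style_loss(new, old, rewards=[1.0, 3.0]))
```

In a real setup the log-probabilities would come from the diffusion scheduler's per-step Gaussian and the loss would be differentiated with an autograd framework; plain floats are used here only to keep the sketch self-contained.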

[Figure: before and after DDPO finetuning]

Bug fixes and other enhancements

The release also comes with multiple bug fixes reported and/or led by the community; check out the commit history below.

What's Changed

New Contributors

Full Changelog: v0.5.0...v0.6.0
