huggingface/transformers v5.0.0rc1
Release candidate 5.0.0rc1


What's Changed

This release candidate was mostly focused on quantization support with the new dynamic weight loader, plus a few notable 🚨 breaking changes 🚨:

  1. Default dtype for any model when using from_pretrained is now auto! (illustrated in the sketch after this list)
  2. Default shard size when saving a model is now 50GB:
  • 🚨🚨 [saving] Default to 50GB shards, and remove non-safe serialization by @Cyrilvallez in #42734
    This is now as fast as before thanks to xet, and is simply more convenient on the Hub.
  3. Kwargs: they are fundamental to enabling integration with vLLM and other tools:
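
As a quick illustration of the first two changes, here is a minimal sketch. It assumes the `dtype` keyword accepted by `from_pretrained` (the successor of `torch_dtype`) and the existing `max_shard_size` argument of `save_pretrained`; the checkpoint name is only an example.

```python
import torch
from transformers import AutoModelForCausalLM

# v5: from_pretrained defaults to dtype="auto", so the dtype stored in the
# checkpoint is used instead of upcasting everything to float32.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")  # example checkpoint
print(model.dtype)  # e.g. torch.bfloat16, taken from the checkpoint

# The old behaviour can still be requested explicitly:
model_fp32 = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B", dtype=torch.float32)

# Saving now writes safetensors shards of up to 50GB by default;
# pass max_shard_size to pick a different limit.
model.save_pretrained("my-model", max_shard_size="5GB")
```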

Dynamic weight loader updates:

Mostly QoL improvements and fixes, plus restored support for CPU offloading.

  • mark params as _is_hf_initialized with DS Zero3 from weight conversion by @winglian in #42626
  • [loading] Allow loading to happen without threading by @Cyrilvallez in #42619
  • [loading] Correctly load params during offloading & careful memory considerations by @Cyrilvallez in #42632
  • allow registration of custom checkpoint conversion mappings by @winglian in #42634

New models:

Some notable quantization fixes:

Mostly added support for fbgemm and quanto.
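
As a rough sketch of how these backends are used, the snippet below loads a model quantized on the fly with quanto; it assumes the QuantoConfig API and the quantization_config argument carried over from v4, and the checkpoint name is only an example.

```python
from transformers import AutoModelForCausalLM, QuantoConfig

# Quantize the weights to int8 with quanto while loading.
quant_config = QuantoConfig(weights="int8")
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B",               # example checkpoint
    quantization_config=quant_config,
    device_map="auto",                 # requires accelerate
)
```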

Peft:

The dynamic weight loader broke a few small things; this release adds the necessary glue for all models except MoEs.
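
For context, a typical PEFT flow that the new loader has to keep working looks like the sketch below; load_adapter comes from the existing PEFT integration, and the adapter id is a placeholder.

```python
from transformers import AutoModelForCausalLM

# Load a base model, then attach a LoRA adapter through the built-in PEFT
# integration (requires the peft package).
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")  # example checkpoint
model.load_adapter("your-username/your-lora-adapter")              # placeholder adapter repo
```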

Misc

Tokenization needed more refactoring; this time it's a lot cleaner!

We omitted a lot of other commits for clarity, but thanks to everyone and the new contributors!

New Contributors

Full Changelog: v5.0.0rc0...v5.0.0rc1
