Release candidate v5.0.0rc3
New models:
- [GLM-4.7] GLM-Lite Support by @zRzRzRzRzRzRzR in #43031
- [GLM-Image] AR Model Support for GLM-Image by @zRzRzRzRzRzRzR in #43100
- Add LWDetr model by @sbucaille in #40991
- Add LightOnOCR model implementation by @baptiste-aubertin in #41621
What's Changed
We are getting closer and closer to the official release!
This RC is focused on removing more of the deprecated code, fixing some minor issues, and updating the docs.
- Update Japanese README to match English version by @lilin-1 in #43069
- [docs] Deploying by @stevhliu in #42263
- [docs] inference engines by @stevhliu in #42932
- Fix typos: Remove duplicate duplicate words words by @efeecllk in #43040
- [style] Rework ruff rules and update all files by @Cyrilvallez in #43144
- [CB] Minor fix in kwargs by @remi-or in #43147
- [Bug] qwen2_5_omni: cap generation length to be less than the max_position_embedding in DiT by @sniper35 in #43068
- Fix some deprecated practices in torch 2.9 by @Cyrilvallez in #43167
- Fix Fuyu processor width dimension bug in `_get_num_multimodal_tokens` by @Abhinavexists in #43137
- Inherit from PreTrainedTokenizerBase by @juliendenize in #43143
- Generation config boolean defaults by @zucchini-nlp in #43000
- Fix failing `BartModelIntegrationTest` by @Sai-Suraj-27 in #43160
- fix failure of llava/pixtral by @sywangyi in #42985
- GemmaTokenizer: remove redundant whitespace pre-tokenizer by @vaibhav-research in #43106
- Support `auto_docstring` in Processors by @yonigozlan in #42101
- Fix failing `BitModelIntegrationTest` by @Sai-Suraj-27 in #43164
- [`Fp8`] Fix experts by @vasqu in #43154
- Docs: improve wording for documentation build instructions by @Sailnagale in #43007
- [makefile] Cleanup and improve the rules by @Cyrilvallez in #43171
- Some new models added stuff that was already removed by @Cyrilvallez in #43179
- Fixes and compilation warning in torchao docs by @merveenoyan in #42909
- [cache] Remove all deprecated classes by @Cyrilvallez in #43168
- Bump huggingface_hub minimal version by @Wauplin in #43188
- Rework check_config_attributes.py by @Cyrilvallez in #43191
- Fix generation config validation by @zucchini-nlp in #43175
- [style] Use 'x | y' syntax for processors as well by @Wauplin in #43189
- Remove deprecated objects by @Cyrilvallez in #43170
- fix chunked prefill implementation (issue #43082) by @marcndo in #43132
- Reduce add_dates verbosity by @yonigozlan in #43184
- Add support for MiniMax-M2 by @rogeryoungh in #42028
- Fix failing `salesforce-ctrl`, `xlm` & `gpt-neo` model generation tests by @Sai-Suraj-27 in #43180
- Less verbose library helpers by @Cyrilvallez in #43197
- run all test files on CircleCI by @ydshieh in #43146
- Clamp temperature to >=1.0 for Dia generation by @Haseebasif7 in #43029
- Fix spelling typos in comments and code by @raimbekovm in #43046
- [docs] llama.cpp by @stevhliu in #43185
- [docs] gptq formatting fix by @victorywwong in #43216
- Grouped beam search from config params by @zucchini-nlp in #42472
- [`Generate`] Allow custom config values in generate config by @vasqu in #43181
- Fix failing `Pix2StructIntegrationTest` by @Sai-Suraj-27 in #43229
- Fix missing UTF-8 encoding in check_repo.py for Windows compatibility by @aarushisingh04 in #43123
- [Tokenizer] Change default value of return_dict to True in doc string for apply_chat_template by @kashif in #43223
- Fix failing `PhiIntegrationTests` by @Sai-Suraj-27 in #43214
- Use `HF_TOKEN` directly and remove `require_read_token` by @ydshieh in #43233
- Fix failing `Owlv2ModelIntegrationTest` & `OwlViTModelIntegrationTest` by @Sai-Suraj-27 in #43182
- Fix flashattn wrt quantized models by @SunMarc in #43145
- Remove unused imports by @cyyever in #43078
- Fix unsafe torch.load() in _load_rng_state allowing arbitrary code execution by @ColeMurray in #43140
- Reapply modular to examples by @Cyrilvallez in #43234
- More robust diff checks in `add_dates` by @yonigozlan in #43199
- docs: fix grammatical error in README.md by @davidfertube in #43236
- Fix typo: seperately → separately in lw_detr converter by @skyvanguard in #43235
- Qwen-VL video processor accepts min/max pixels by @zucchini-nlp in #43228
- Deprecate dtype per sub config by @zucchini-nlp in #42990
- Remove more deprecated objects/args by @Cyrilvallez in #43195
- [CB] Soft-reset offloading by @remi-or in #43150
- Make benchmark-v2 device agnostic, to support more torch built-in devices like xpu by @yao-matrix in #43153
- Fix benchmark script by @Cyrilvallez in #43253
- Adding to run slow by @IlyasMoutawwakil in #43250
- Fix failing `Vip-llava` model integration test by @Sai-Suraj-27 in #43252
- Remove deprecated and unused `position_ids` in all `apply_rotary_pos_emb` by @Cyrilvallez in #43255
- fix `_get_test_info` in `testing_utils.py` by @ydshieh in #43259
- Fix failing `Hiera`, `SwiftFormer` & `LED` model integration tests by @Sai-Suraj-27 in #43225
- [style] Fix init isort and align makefile and CI by @Cyrilvallez in #43260
- [docs] tensorrt-llm by @stevhliu in #43176
- [consistency] Ensure models are added to the `_toctree.yml` by @Cyrilvallez in #43264
- Fix failing `PegasusX`, `Mvp` & `LED` model integration tests by @Sai-Suraj-27 in #43245
- [CB] Ensure parallel decoding test passes using FA by @remi-or in #43277
- fix crash when running FSDP2+TP by @sywangyi in #43226
- [ci] Fixing some failing tests for important models by @Abdennacer-Badaoui in #43231
New Contributors
- @efeecllk made their first contribution in #43040
- @sniper35 made their first contribution in #43068
- @Abhinavexists made their first contribution in #43137
- @vaibhav-research made their first contribution in #43106
- @Sailnagale made their first contribution in #43007
- @rogeryoungh made their first contribution in #42028
- @Haseebasif7 made their first contribution in #43029
- @victorywwong made their first contribution in #43216
- @aarushisingh04 made their first contribution in #43123
- @ColeMurray made their first contribution in #43140
- @davidfertube made their first contribution in #43236
- @skyvanguard made their first contribution in #43235
- @baptiste-aubertin made their first contribution in #41621
Full Changelog: v5.0.0rc2...v5.0.0rc3