This patch release covers bug fixes (#10347, #10349, #10348, #10352, #10363) and keeps compatibility for vLLMConfig
usage in out-of-tree models (#10356).
What's Changed
- Add default value to avoid Falcon crash (#5363) by @wchen61 in #10347
- [Misc] Fix import error in tensorizer tests and cleanup some code by @DarkLight1337 in #10349
- [Doc] Remove float32 choice from --lora-dtype by @xyang16 in #10348
- [Bugfix] Fix fully sharded LoRA bug by @jeejeelee in #10352
- [Misc] Fix some help info of arg_utils to improve readability by @ShangmingCai in #10362
- [core][misc] keep compatibility for old-style classes by @youkaichao in #10356
- [Bugfix] Ensure special tokens are properly filtered out for guided structured output with MistralTokenizer by @gcalmettes in #10363
- [Misc] Bump up test_fused_moe tolerance by @ElizaWszola in #10364
- [Misc] bump mistral common version by @simon-mo in #10367
New Contributors
Full Changelog: v0.6.4...v0.6.4.post1