What's new in 1.0.1 (2024-11-29)
These are the changes in inference v1.0.1.
New features
- FEAT: Fish speech stream by @codingl2k1 in #2562
- FEAT: support sparse vector for bge-m3 by @pengjunfeng11 in #2540
- FEAT: whisper support for Mac MLX by @qinxuye in #2576
- FEAT: support guided decoding for vllm async engine by @wxiwnd in #2391
- FEAT: support QwQ-32B-Preview by @qinxuye in #2602
- FEAT: support glm-edge-chat model by @amumu96 in #2582
Enhancements
- ENH: Support fish speech reference audio by @codingl2k1 in #2542
Bug fixes
- BUG: GTE-qwen2 Embedding Dimension error by @cyhasuka in #2565
- BUG: request_limits does not work with streaming interfaces by @ChengjieLi28 in #2571
- BUG: Fix Codestral v0.1 URI for PyTorch format by @danialcheung in #2590
- BUG: Correct the input bytes data sent by langchain_openai (#2589) by @xiyuan-lee in #2600
New Contributors
- @pengjunfeng11 made their first contribution in #2540
- @danialcheung made their first contribution in #2590
- @xiyuan-lee made their first contribution in #2600
Full Changelog: v1.0.0...v1.0.1