What's new in 0.11.2 (2024-05-24)
These are the changes in inference v0.11.2.
New features
- FEAT: Add command cal-model-mem by @frostyplanet in #1460 (a rough sizing sketch follows this list)
- FEAT: add deepseek llm and coder base by @qinxuye in #1533
- FEAT: add codeqwen1.5 by @qinxuye in #1535
- FEAT: Auto detect rerank type for unknown rerank type by @codingl2k1 in #1538
- FEAT: Provide the ability to query information about cached models hosted on the queried node by @hainaweiben in #1522
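
The new cal-model-mem command from #1460 estimates how much GPU memory a model will need before you launch it. Purely as an illustration of the kind of sizing arithmetic such a tool performs (the function name and every number below are assumptions for a hypothetical 7B fp16 model, not the command's actual implementation), here is a minimal Python sketch:

```python
# Back-of-envelope GPU memory estimate for serving an LLM.
# Illustrative only: the defaults (7B params, fp16 weights, 32 layers,
# 4k context) are assumptions, not what cal-model-mem actually computes.

def estimate_model_mem_gib(
    n_params_billion: float = 7.0,   # parameter count, in billions
    bytes_per_param: float = 2.0,    # 2 bytes for fp16/bf16, ~0.5 for 4-bit
    n_layers: int = 32,
    n_kv_heads: int = 32,
    head_dim: int = 128,
    context_len: int = 4096,
    kv_bytes: float = 2.0,           # fp16 KV cache entries
) -> float:
    # Weight memory: parameters times bytes per parameter.
    weights = n_params_billion * 1e9 * bytes_per_param
    # KV cache: 2 (K and V) * layers * kv heads * head dim * context * bytes.
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes
    return (weights + kv_cache) / 1024**3


if __name__ == "__main__":
    print(f"~{estimate_model_mem_gib():.1f} GiB for a 7B fp16 model at 4k context")
```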
Enhancements
- ENH: Compatible with huggingface-hub v0.23.0 by @ChengjieLi28 in #1514
- ENH: convert command-r to chat by @qinxuye in #1537
- ENH: Support Intern-VL-Chat model by @amumu96 in #1536
- BLD: adapt to langchain 0.2.x, which has breaking changes by @mikeshi80 in #1521
- BLD: Fix pre commit by @frostyplanet in #1527
- BLD: compatible with torch 2.3.0 by @qinxuye in #1534
Bug fixes
- BUG: Fix worker startup failure due to None device name by @codingl2k1 in #1539
- BUG: Fix gpu_idx allocation error when replica > 1 by @amumu96 in #1528
Others
- CHORE: Add basic benchmark/benchmark_rerank.py by @codingl2k1 in #1479
Full Changelog: v0.11.1...v0.11.2