What's new in 0.10.1 (2024-04-12)
These are the changes in inference v0.10.1.
New features
- FEAT: add support for qwen1.5 32B chat model by @mikeshi80 in #1249
- FEAT: Support Qwen MoE model for huggingface and modelscope by @xiaodouzi666 in #1263
- FEAT: Enable streaming in tool calls for Qwen when using vllm by @zhanghx0905 in #1215
Enhancements
- ENH: allow function create_embedding to receive extra args by @amumu96 in #1224
- ENH: support more GPTQ and AWQ format for some models by @xiaodouzi666 in #1243
- ENH: support multi gpus for qwen-vl and yi-vl by @qinxuye in #1236
- ENH: support llamacpp multiple gpu by @amumu96 in #1229
- ENH: UI: paper material for cards by @Minamiyama in #1261
- REF: Refactor launch model for Web UI by @yiboyasss in #1254
- REF: Remove ctransformers support by @mujin2 in #1267
Bug fixes
- BUG: Fix docker cpu build by @ChengjieLi28 in #1213
- BUG: Fix cannot start xinference in docker due to cv2 by @ChengjieLi28 in #1217
- BUG: Cannot start xinference in docker by @ChengjieLi28 in #1219
- BUG: Fix opencv issue in docker container by @ChengjieLi28 in #1227
- BUG: Fix the launch bug of OmnilMM 12B. by @hainaweiben in #1241
- BUG: style spell error by @Minamiyama in #1247
- BUG: Fix issue with supervisor not clearing information after worker exit by @hainaweiben in #1231
- BUG: custom models on the web ui by @yiboyasss in #1259
- BUG: fix system prompts for chatglm3 and internlm2 pytorch by @qinxuye in #1271
- BUG: Fix authority and jump issue by @yiboyasss in #1276
- BUG: fix custom vision model by @qinxuye in #1280
Tests
- TST: Fix tests due to llama-cpp-python v0.2.58 by @ChengjieLi28 in #1242
Documentation
- DOC: auto gen vllm doc & add chatglm3-{32k, 128k} support for vllm by @qinxuye in #1234
- DOC: update models doc by @qinxuye in #1246
- DOC: update readme by @qinxuye in #1268
New Contributors
- @amumu96 made their first contribution in #1224
- @xiaodouzi666 made their first contribution in #1243
- @yiboyasss made their first contribution in #1254
Full Changelog: v0.10.0...v0.10.1