What's new in 0.9.3 (2024-03-15)
These are the changes in inference v0.9.3.
New features
- FEAT: Add Yi-9B by @mujin2 in #1117
- FEAT: Support image generation by @hainaweiben in #1047
Enhancements
- ENH: Update command-line help info by @luweizheng in #1106
- ENH: Remove quantization limits for the Apple Metal device when running models via llama-cpp-python by @ChengjieLi28 in #1134
- ENH: Make GET /v1/models compatible with the OpenAI API by @notsyncing in #1127
- ENH: support vllm>=0.3.1 by @qinxuye in #1145
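With #1127, GET /v1/models returns an OpenAI-compatible model list. A minimal sketch of parsing such a response; the payload below is an illustrative assumption, not actual Xinference server output:

```python
import json

# Hypothetical OpenAI-style "list" payload, as GET /v1/models would return
# after #1127. The model id and field values here are assumptions.
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"id": "yi-chat", "object": "model", "created": 0, "owned_by": "xinference"}
  ]
}
""")

def model_ids(payload: dict) -> list:
    """Extract model ids from an OpenAI-compatible /v1/models response."""
    assert payload.get("object") == "list"
    return [m["id"] for m in payload.get("data", [])]

print(model_ids(sample))  # ['yi-chat']
```

Because the shape matches the OpenAI API, existing OpenAI client code can list Xinference models without changes.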
Bug fixes
- BUG: Fix a useless f-string by @mikeshi80 in #1130
- BUG: Fix model list loading failure caused by a large number of invalid requests on the model list page by @wertycn in #1111
- BUG: Fix cache status for embedding, rerank and image models on the web UI by @ChengjieLi28 in #1135
- BUG: Fix missing information in the xinference registrations and xinference list commands by @ChengjieLi28 in #1140
- BUG: Fix being unable to continue chatting after canceling a streaming chat via ctrl+c by @ChengjieLi28 in #1144
Tests
- TST: Remove testing LLM model creating embedding by @ChengjieLi28 in #1121
New Contributors
- @luweizheng made their first contribution in #1106
- @mujin2 made their first contribution in #1117
- @wertycn made their first contribution in #1111
Full Changelog: v0.9.2...v0.9.3