What's new in 0.9.1 (2024-03-01)
These are the changes in inference v0.9.1.
New features
- FEAT: Docker image for CPU-only deployment by @ChengjieLi28 in #1068
Enhancements
- ENH: Support downloading gemma from modelscope by @aresnow1 in #1035
- ENH: [UI] Setting `quantization` when registering LLM by @ChengjieLi28 in #1040
- ENH: RESTful client supports multiple system prompts for chat by @ChengjieLi28 in #1056 (see the sketch after this list)
- ENH: Support disabling worker status reporting by @ChengjieLi28 in #1057
- ENH: Extra params for the `xinference launch` command line by @ChengjieLi28 in #1048
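For the multiple-system-prompt enhancement (#1056), here is a minimal sketch of exercising the same behavior through Xinference's OpenAI-compatible `/v1/chat/completions` endpoint. The host, port, and model UID (`my-llm`) are placeholders for your own deployment, not values from this release.

```python
# Hypothetical example: chatting with more than one system-role message.
# Host/port and the model UID "my-llm" are assumptions for illustration.
import requests

response = requests.post(
    "http://localhost:9997/v1/chat/completions",
    json={
        "model": "my-llm",  # placeholder model UID from your own `xinference launch`
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "system", "content": "Always answer in English."},
            {"role": "user", "content": "Summarize what a vector database does."},
        ],
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```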
Bug fixes
- BUG: Fix some models that cannot be downloaded from `modelscope` by @ChengjieLi28 in #1066
- BUG: Fix early truncation caused by `max_token` defaulting to 16 instead of 1024 by @ZhangTianrong in #1061 (see the sketch after this list)
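Regardless of the server-side default fixed in #1061, output length can be pinned explicitly via `generate_config`. Below is a minimal sketch using the Python RESTful client against a generate-capable model; the endpoint and model UID (`my-llm`) are placeholders.

```python
# Hypothetical example: setting max_tokens explicitly so the completion is not
# cut short by a small default. The model UID "my-llm" is an assumption.
from xinference.client import RESTfulClient

client = RESTfulClient("http://localhost:9997")
model = client.get_model("my-llm")  # placeholder model UID

completion = model.generate(
    "Explain the difference between greedy and beam-search decoding.",
    generate_config={"max_tokens": 1024},  # explicit cap instead of the default
)
print(completion["choices"][0]["text"])
```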
Documentation
- DOC: Update readme by @qinxuye in #1045
- DOC: Fix readme by @qinxuye in #1054
- DOC: Fix wechat links by @qinxuye in #1055
New Contributors
- @ZhangTianrong made their first contribution in #1061
Full Changelog: v0.9.0...v0.9.1