What's new in 0.14.2 (2024-08-16)
These are the changes in inference v0.14.2.
New features
- FEAT: add gemma-2-it 2b & internlm2.5-chat 1.8b and 20b & update video and sglang docs by @qinxuye in #2080
- FEAT: support FP8 for vllm & sglang engine by @qinxuye in #2069
- FEAT: Support internvl2 and internvl stream by @amumu96 in #2079
Enhancements
- ENH: make MiniCPM v2.6 support video by @Minamiyama in #2068
- REF: Remove some builtin old models and ggmlv3 model format by @ChengjieLi28 in #2086
Bug fixes
- BUG: limit AutoAWQ version to fix docker issue by @qinxuye in #2067
- BUG: Fix custom glm4 & remove tool calls of ChatGLM3 by @codingl2k1 in #2081
- BUG: Infinite loop with login by @WalkerWang731 in #2039
Documentation
New Contributors
- @WalkerWang731 made their first contribution in #2039
Full Changelog: v0.14.1...v0.14.2