What's new in 0.8.1 (2024-01-19)
These are the changes in inference v0.8.1.
New features
- FEAT: Auto recover limit by @codingl2k1 in #893
- FEAT: Prometheus metrics exporter by @codingl2k1 in #906
- FEAT: Add internlm2-chat support by @aresnow1 in #913
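For the new internlm2-chat support, here is a minimal sketch using the Xinference Python client; the endpoint URL, model size, and model format below are illustrative assumptions, not values taken from these notes.

```python
# Minimal sketch: launching the newly supported internlm2-chat model via
# the Xinference client. Endpoint, size, and format are assumptions.
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local endpoint

model_uid = client.launch_model(
    model_name="internlm2-chat",
    model_format="pytorch",        # assumed format
    model_size_in_billions=7,      # assumed size
)

model = client.get_model(model_uid)
print(model.chat("Briefly introduce yourself."))
```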
Enhancements
- ENH: Launch model asynchronously by @ChengjieLi28 in #879
- ENH: Support qwen-vl from ModelScope by @codingl2k1 in #902
- ENH: Add "tools" in model ability by @aresnow1 in #904
- ENH: Add quantization support for qwen chat by @aresnow1 in #910
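As a rough illustration of the new quantization support for qwen chat, the sketch below assumes a running local Xinference endpoint; the quantization value, model size, and format are assumptions chosen for the example.

```python
# Minimal sketch: launching qwen-chat with a quantization option.
# Endpoint, size, format, and quantization value are assumptions.
from xinference.client import Client

client = Client("http://localhost:9997")  # assumed local endpoint

model_uid = client.launch_model(
    model_name="qwen-chat",
    model_size_in_billions=7,     # assumed size
    model_format="pytorch",       # assumed format
    quantization="4-bit",         # assumed quantization option
)
print(f"qwen-chat launched with uid: {model_uid}")
```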
Bug fixes
- BUG: Fix prompt template of chatglm3-32k by @aresnow1 in #889
- BUG: Fix invalid volume in docker compose yml by @ChengjieLi28 in #890
- BUG: Revert #883 by @aresnow1 in #903
- BUG: Fix chatglm backend by @codingl2k1 in #898
- BUG: Fix tool calls on custom model by @codingl2k1 in #899
- BUG: Fix is_valid_model_name by @aresnow1 in #907
Documentation
- DOC: Update the documentation about use of docker by @aresnow1 in #901
- DOC: Add FAQ in troubleshooting.rst by @sisuad in #911
New Contributors
Full Changelog: v0.8.0...v0.8.1