What's new in 0.13.0 (2024-07-05)
These are the changes in inference v0.13.0.
New features
Enhancements
- ENH: Added GGUF files for `qwen2` by @qinxuye in #1745
- ENH: Add more log modules by @ChengjieLi28 in #1771
- ENH: Continuous batching supports the vision model ability by @ChengjieLi28 in #1724
- ENH: Add guard for model launching by @frostyplanet in #1680
- BLD: Supports Aliyun docker image by @ChengjieLi28 in #1753
- BLD: GPU Docker image uses the `vllm` image as base by @ChengjieLi28 in #1759
- BLD: Pin `llama-cpp-python` to `v0.2.77` in Docker for stability by @ChengjieLi28 in #1767
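If you build a custom image on top of the published one, the same pin can be reproduced with a single install line. This is a minimal sketch, not part of the release: the base image name and tag are illustrative, and only the `llama-cpp-python==0.2.77` version comes from the changelog entry above.

```dockerfile
# Illustrative base image and tag -- substitute the image you actually deploy.
FROM xprobe/xinference:latest

# Pin llama-cpp-python for stability, matching the version pinned in the 0.13.0 images.
RUN pip install --no-cache-dir "llama-cpp-python==0.2.77"
```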
Bug fixes
- BUG: Fix `glm4` tool call by @codingl2k1 in #1747
- BUG: [UI] Fix authentication mode related bugs by @yiboyasss in #1772
- BUG: Fix Python client to return documents for the rerank task by default by @ChengjieLi28 in #1780
- BUG: Fix LLM-based reranker that may raise a TypeError by @codingl2k1 in #1794
- BUG: Fix `deepseek-vl-chat` by @qinxuye in #1795
Tests
- TST: Fix `llama-cpp-python` issue in CI by @ChengjieLi28 in #1763
Documentation
- DOC: Update continuous batching and docker usage by @ChengjieLi28 in #1785
Full Changelog: v0.12.3...v0.13.0