What's new in 1.10.0 (2025-09-13)
These are the changes in inference v1.10.0.
New features
- FEAT: [model] Support Kokoro-82M-v1.1-zh by @JavisPeng in #4042
- FEAT: IP restriction by env: XINFERENCE_ALLOWED_IPS by @qxo in #4047
- FEAT: add support for the Anthropic API format by @OliverBryant in #4037
- FEAT: Openai API support vLLM json schema output by @OliverBryant in #4061
Enhancements
- ENH: Update the environment dependencies for GOT-OCR2 by @Gmgge in #4031
- ENH: Clean memory during running MLX version's LLM models by @OliverBryant in #4026
- BLD: bump funasr to 1.2.7 by @leslie2046 in #4039
- BLD: cu128 version Dockerfile fix by @zwt-1234 in #4056
- BLD: Update Dockerfile.cu128 by @amumu96 in #4059
- REF: refactor tool calls functionality by @amumu96 in #4025
Bug fixes
- BUG: Fix Kokoro-82M can't run on GPU by @OliverBryant in #4034
- BUG: [embeddings] fix parsing str type hf_overrides for vllm engine by @llyycchhee in #4052
- BUG: missing usage info in jina-embedding-v4 model response by @amumu96 in #4054
- BUG: distributed registration bug by @llyycchhee in #4046
New Contributors
- @JavisPeng made their first contribution in #4042
- @qxo made their first contribution in #4047
Full Changelog: v1.9.1...v1.10.0