What's Changed (this repo branch)
- Sync to v0.12.7
What's Changed (from Ollama)
New models
Qwen3-VL: now available in all parameter sizes, ranging from 2B to 235B
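Below is a minimal sketch of sending an image to a Qwen3-VL model through Ollama's native /api/generate endpoint; the model tag ("qwen3-vl"), host, and file path are illustrative assumptions, not part of this release.

```python
# Sketch: describe an image with a Qwen3-VL model via a local Ollama server.
# Assumes the server runs on localhost:11434 and a "qwen3-vl" tag has been pulled.
import base64
import requests

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3-vl",            # assumed tag; pick a specific size if preferred
        "prompt": "Describe this image.",
        "images": [image_b64],          # Ollama accepts base64-encoded images
        "stream": False,
    },
)
print(resp.json()["response"])
```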
New API documentation
Documentation for Ollama's API is now available at https://docs.ollama.com/api
What's Changed
- Model load failures now include more information on Windows
- Fixed embedding results being incorrect when running embeddinggemma
- Fixed gemma3n on Vulkan backend
- Increased time allocated for ROCm to discover devices
- Fixed truncation error when generating embeddings
- Fixed request status code when running cloud models
- The OpenAI-compatible /v1/embeddings endpoint now supports the encoding_format parameter (see the sketch after this list)
- Ollama will now parse tool calls that don't conform to {"name": name, "arguments": args} (thanks @rick-github!)
- Fixed prompt processing reporting in the llama runner
- Increased speed when scheduling models
- Fixed issue where FROM would not inherit RENDERER or PARSER commands
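As a minimal sketch of the encoding_format addition, the snippet below calls Ollama's OpenAI-compatible embeddings endpoint with the official openai Python client; the host and the "embeddinggemma" model tag are assumptions for illustration.

```python
# Sketch: request embeddings from Ollama's OpenAI-compatible endpoint.
# Assumes a local Ollama server and an already-pulled embedding model.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
    api_key="ollama",                      # required by the client, not checked by Ollama
)

resp = client.embeddings.create(
    model="embeddinggemma",                # assumed model tag
    input="The quick brown fox",
    encoding_format="float",               # "float" or "base64"
)
print(len(resp.data[0].embedding))
```

With encoding_format="float" the vector comes back as a list of floats; "base64" returns a compact base64-encoded payload instead.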
New Contributors
- @npardal made their first contribution in ollama#12715