What's Changed (this repo branch)
- Sync to v0.12.7
What's Changed (from Ollama)
New models
Qwen3-VL: now available in all parameter sizes, ranging from 2B to 235B
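Below is a minimal sketch of sending an image to a Qwen3-VL model through Ollama's native /api/generate endpoint; the model tag ("qwen3-vl"), host, and file path are illustrative assumptions, not part of this release.

```python
# Sketch: describe an image with a Qwen3-VL model via a local Ollama server.
# Assumes the server runs on localhost:11434 and a "qwen3-vl" tag has been pulled.
import base64
import requests

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3-vl",            # assumed tag; pick a specific size if preferred
        "prompt": "Describe this image.",
        "images": [image_b64],          # Ollama accepts base64-encoded images
        "stream": False,
    },
)
print(resp.json()["response"])
```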
New API documentation
Documentation for Ollama's API is now available at https://docs.ollama.com/api
What's Changed
- Model load failures now include more information on Windows
- Fixed embedding results being incorrect when running embeddinggemma
- Fixed gemma3n on Vulkan backend
- Increased time allocated for ROCm to discover devices
- Fixed truncation error when generating embeddings
- Fixed request status code when running cloud models
- The OpenAI-compatible /v1/embeddings endpoint now supports the encoding_format parameter (see the sketch after this list)
- Ollama will now parse tool calls that don't conform to {"name": name, "arguments": args} (thanks @rick-github!)
- Fixed prompt processing reporting in the llama runner
- Increased speed when scheduling models
- Fixed issue where FROM would not inherit RENDERER or PARSER commands
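As a minimal sketch of the encoding_format addition, the snippet below calls Ollama's OpenAI-compatible embeddings endpoint with the official openai Python client; the host and the "embeddinggemma" model tag are assumptions for illustration.

```python
# Sketch: request embeddings from Ollama's OpenAI-compatible endpoint.
# Assumes a local Ollama server and an already-pulled embedding model.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
    api_key="ollama",                      # required by the client, not checked by Ollama
)

resp = client.embeddings.create(
    model="embeddinggemma",                # assumed model tag
    input="The quick brown fox",
    encoding_format="float",               # "float" or "base64"
)
print(len(resp.data[0].embedding))
```

With encoding_format="float" the vector comes back as a list of floats; "base64" returns a compact base64-encoded payload instead.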
New Contributors
- @npardal made their first contribution in ollama#12715