What's Changed (this repo branch)
- Sync to v0.12.8
What's Changed (from Ollama)
- qwen3-vl performance improvements, including flash attention enabled by default
- qwen3-vl will now output less leading whitespace in responses when thinking
- Fixed issue where deepseek-v3.1 thinking could not be disabled in Ollama's new app
- Fixed issue where qwen3-vl would fail to interpret images with transparent backgrounds
- Ollama will now stop running a model before removing it via ollama rm
- Fixed issue where prompt processing would be slower on Ollama's engine
- Fixed device discovery on Windows to ignore unsupported iGPUs
New Contributors
- @athshh made their first contribution in ollama#12822
Full Changelog: v0.12.7...v0.12.8