What's Changed (this repo branch)
Sync to Ollama main v0.6.3-rc0
What's Changed (from Ollama)
New sliding window attention optimizations for Gemma 3, improving inference speed and memory allocation for long context windows.
Improved loading speed of Gemma 3
ollama create will now return the name of unsupported architectures
Fixed error talloc->buffer_id >= 0 when running a model
Fixed (int)sched->hash_set.size >= graph->n_nodes + graph->n_leafs error when running a model
ollama create will now automatically select the right template when importing Gemma 3 from safetensors
ollama show -v will now correctly render boolean values as true or false
New Contributors
@rylativity made their first contribution in ollama#9874
Full Changelog: v.0.6.2...v0.6.3-rc0