What's Changed (this repo branch)

Synced with Ollama main at v0.6.8

What's Changed (from Ollama)

Performance improvements for Qwen 3 MoE models (30b-a3b and 235b-a22b) on NVIDIA and AMD GPUs
Fixed the "GGML_ASSERT(tensor->op == GGML_OP_UNARY) failed" error caused by conflicting installations
Fixed a memory leak that occurred when providing images as input
ollama show will now correctly label older vision models such as llava
Reduced out-of-memory errors by improving worst-case memory estimations
Fixed an issue that resulted in a "context canceled" error

Full Changelog: v0.6.7...v0.6.8