What's Changed (this repo branch)
- Sync to v0.13.5
- Advise disabling flash attention on older AMD APUs in the README (see the sketch below)
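For anyone following that advice, here is a minimal sketch of the opt-out. It assumes the `OLLAMA_FLASH_ATTENTION` environment variable still acts as an explicit override now that flash attention defaults to on; the variable itself is documented by Ollama, but the exact interaction with the new default is an assumption.

```python
import os
import subprocess

# Sketch only: launch the Ollama server with flash attention turned off,
# e.g. on an older AMD APU where it causes problems. Assumes the
# OLLAMA_FLASH_ATTENTION variable still overrides the new default-on behavior.
env = os.environ.copy()
env["OLLAMA_FLASH_ATTENTION"] = "0"  # unset or "1" keeps the default

# Start `ollama serve` with the modified environment; stop it with server.terminate().
server = subprocess.Popen(["ollama", "serve"], env=env)
```

Setting the variable in a shell profile or service unit achieves the same thing; the Python wrapper is only for illustration.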
What's Changed (from Ollama)
New Models
- Nemotron 3 Nano: a new standard for efficient, open, and intelligent agentic models
- Olmo 3 and Olmo 3.1: a series of open language models designed to enable the science of language models, pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets
- FunctionGemma: a specialized version of Google's Gemma 3 270M model fine-tuned explicitly for function calling (see the sketch below)
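As a rough illustration of what calling a function-calling model such as FunctionGemma looks like through the Ollama Python client; the `functiongemma` model tag and the `get_weather` tool are hypothetical placeholders, not names taken from this release.

```python
import ollama

# Hypothetical tool definition in the standard JSON-schema format the chat API accepts.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Assumption: the model is published under a tag like "functiongemma";
# substitute whatever tag you have pulled locally.
response = ollama.chat(
    model="functiongemma",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=[get_weather_tool],
)

# If the model decided to call the tool, the parsed calls appear here.
for call in response.message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```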
What's Changed
- Flash attention is now enabled by default for supported models
- Fixed handling of long contexts with Gemma 3 models
- Fixed an issue affecting Gemma 3 QAT models and other models imported with the Gemma 3 architecture
- BERT-architecture models now run on Ollama's engine (see the embedding sketch after this list)
- Added built-in renderer and tool parsing capabilities for DeepSeek-V3.1
- Fixed an issue where nested properties in tool definitions may not have been rendered properly
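Since BERT-architecture models now run on Ollama's own engine, an embedding call is the simplest way to exercise one. This sketch assumes a recent Python client that exposes `embed()` and that a BERT-based embedding model such as `all-minilm` has been pulled locally; both are assumptions, not details from the release notes.

```python
import ollama

# Assumes `ollama pull all-minilm` (a BERT-based embedding model) has been run;
# any BERT-architecture embedding model should behave the same way.
result = ollama.embed(
    model="all-minilm",
    input=["Ollama now runs BERT models on its own engine."],
)

# One embedding vector per input string.
print(len(result.embeddings), len(result.embeddings[0]))
```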
New Contributors
- @familom made their first contribution in ollama#13220
- @nathannewyen made their first contribution in ollama#13469
Full Changelog: v0.13.3...v0.13.5