Headline
- Updated llama.cpp Vulkan, ROCm, and Metal builds to support the latest models
- Added Qwen3-VL, LFM2-MoE, and Granite 4.0 MoE models to Lemonade's model manager
- Many under-the-hood fixes and improvements based on user feedback
What's Changed
- Add project roadmap to README by @jeremyfowers in #583
- Infinite timeout for inference requests by @jeremyfowers in #590
- Exclude zstd from deb package and organize CMakeLists.txt by @VladimirVLF in #587
- Update README to change default host address by @jeremyfowers in #588
- Update the models list doc by @jeremyfowers in #591
- Fix failing tests by @jeremyfowers in #598
- Fixes: RAI detection/driver, list, smoother startup by @jeremyfowers in #599
- Remove GAIA UI recommendation from Open WebUI documentation by @jeremyfowers in #610
- Add an FAQ about HF_HOME by @jeremyfowers in #614
- Quiet! No more entering health and models endpoint by @jeremyfowers in #613
- Update Llama.cpp Version on All Platforms by @danielholanda in #611
- Add new SOTA models: Qwen3-vl, LFM2-MoE, Granite-MoE by @jeremyfowers in #609
New Contributors
- @VladimirVLF made their first contribution in #587
Full Changelog: v9.0.3...v9.0.4