Headline
- Support for large (sharded) GGUF models. Added Llama-4-Scout-17B-16E-Instruct-GGUF to server models list @danielholanda.
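For readers unfamiliar with sharded GGUF models: a large model is split across several files that must all be present before it can be loaded. The sketch below is not Lemonade's implementation. It assumes the `<prefix>-00001-of-00003.gguf` naming pattern produced by llama.cpp's gguf-split tool, and the function names and file names are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Assumed shard naming pattern: <prefix>-<index>-of-<total>.gguf (5-digit, 1-based)
SHARD_RE = re.compile(r"^(?P<prefix>.+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$")


def group_shards(filenames):
    """Group sharded GGUF file names by model prefix.

    Returns {prefix: [(index, total, filename), ...]}.
    Files that do not match the shard pattern are ignored.
    """
    groups = defaultdict(list)
    for name in filenames:
        match = SHARD_RE.match(name)
        if match:
            groups[match["prefix"]].append(
                (int(match["index"]), int(match["total"]), name)
            )
    return dict(groups)


def missing_shards(shards):
    """Return the sorted shard indices that are absent from one group."""
    total = shards[0][1]
    present = {index for index, _, _ in shards}
    return sorted(set(range(1, total + 1)) - present)


# Hypothetical directory listing with the second shard missing
files = [
    "model-00001-of-00003.gguf",
    "model-00003-of-00003.gguf",
    "single-file.gguf",
]
groups = group_shards(files)
print(missing_shards(groups["model"]))
```

A check like this is useful before loading, since a partially downloaded shard set would otherwise fail later with a less obvious error.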
Additional Improvements
- Overhauled the pip install extras (no breaking change yet, old extras still work): dev, oga-cpu, and oga-hybrid are the primary extras now. See https://lemonade-server.ai/install_options.html for details (@jeremyfowers).
- Added LLM Chat button to the system tray context menu, and Model Management button now directly opens the model manager @jeremyfowers
- OGA v0.8.2 is now used for CPU and CUDA backends @ramkrishna2910
- Hugging Face Hub xet protocol is enabled @jeremyfowers
- Server testing improvements @jeremyfowers
- Add a developer setup button to the website by @jeremyfowers in #53
Fixes
- Improved compatibility with user-specified GGUF files @danielholanda
- Fix a crash on first server usage when installing from source @danielholanda
- Fixed Phi-3.5-Mini-Instruct-Hybrid/CPU support @jeremyfowers
- Removed unintentional "Running lemonade-server-dev..." printout @danielholanda
- Fixed UnicodeDecodeError that sometimes shows up for GGUF models @jeremyfowers
Full Changelog: v8.0.2...v8.0.3