New Features
- Added support for GGUF models and llama.cpp backend to Lemonade Server (@jeremyfowers)
- Support for streaming tool calling in chat completions (@danielholanda); see the sketch after this list
- After starting Lemonade Server, open http://localhost:8000 in your browser to get a helpful web app (@jeremyfowers, @danielholanda). Includes:
  - LLM chat with any installed model
  - Model manager to install new models
  - Links to documentation
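
A minimal sketch of streaming tool calling against a local Lemonade Server, using the OpenAI-compatible chat completions endpoint. The model name, tool definition, and `api/v1` base path are illustrative assumptions; substitute any installed model.

```python
# Sketch: streaming tool calling against a local Lemonade Server via the
# OpenAI Python client. Model name and tool schema are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/api/v1", api_key="lemonade")

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

stream = client.chat.completions.create(
    model="<installed-model-name>",  # any model installed via the model manager
    messages=[{"role": "user", "content": "What's the weather in Toronto?"}],
    tools=tools,
    stream=True,
)

# With streaming enabled, tool-call arguments arrive incrementally across chunks.
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.tool_calls:
        for call in delta.tool_calls:
            if call.function and call.function.arguments:
                print(call.function.arguments, end="", flush=True)
    elif delta.content:
        print(delta.content, end="", flush=True)
```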
Documentation
- Complete Lemonade Server documentation overhaul, hosted at https://lemonade-server.ai/docs/ (@vgodsoe)
Fixes
- Pin various dependency versions to prevent bugs introduced by recent releases of those dependencies (@danielholanda, @jeremyfowers)
- Lemonade Server users can use either the `api/v0` or `api/v1` endpoints. They behave identically; certain downstream apps simply expect one prefix or the other (see the sketch below). (@danielholanda)
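
A minimal sketch of the two endpoint prefixes, assuming the default http://localhost:8000 address; the model name is a placeholder.

```python
# Sketch: both prefixes point at the same server, so the base_url choice
# only depends on what a downstream app expects.
from openai import OpenAI

client_v1 = OpenAI(base_url="http://localhost:8000/api/v1", api_key="lemonade")
client_v0 = OpenAI(base_url="http://localhost:8000/api/v0", api_key="lemonade")

for client in (client_v1, client_v0):
    resp = client.chat.completions.create(
        model="<installed-model-name>",  # placeholder
        messages=[{"role": "user", "content": "Say hello."}],
    )
    print(resp.choices[0].message.content)
```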
Full Changelog: v7.0.0...v7.0.1