Headline
- Many fixes and improvements, on top of the major updates in v8.1.1:
- ROCm7 is now available as a llamacpp backend on supported Radeon GPUs @danielholanda
- Model build for custom NPU and Hybrid LLMs updated for RAI SW 1.5 @iswaryaalex
- Added `gpt-oss-120b-GGUF` and `gpt-oss-20b-GGUF` support to Lemonade Server @danielholanda (example request below)
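For reference, a minimal sketch of chatting with one of the newly added GGUF models through Lemonade Server's OpenAI-compatible API. The base URL, default port (8000), and `/api/v1` path are assumptions; adjust them to match your install.

```python
# Minimal sketch: query one of the new gpt-oss GGUF models through
# Lemonade Server's OpenAI-compatible chat completions endpoint.
# Assumptions: server is running locally on the default port (8000)
# and serves the OpenAI-style API under /api/v1.
import requests

BASE_URL = "http://localhost:8000/api/v1"  # assumed default; change as needed

payload = {
    "model": "gpt-oss-20b-GGUF",  # or "gpt-oss-120b-GGUF" if your hardware allows
    "messages": [
        {"role": "user", "content": "Summarize what a GGUF model file is."}
    ],
    "max_tokens": 128,
}

response = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=300)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```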
What's Changed
- Add host parameter to allow external connections (Solves #85) by @ajh123 in #142 (see the sketch after this list)
- Add models from the Deepseek R1, Llama3.2, Mistral, Phi3.5, ChatGLM3, and AMD OLMo families to NPU by @Looong01, @henrylearn2rock in #94
- Add lsdev entrypoint by @jeremyfowers in #116
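A hedged sketch of what the host parameter from #142 enables: once the server is started so that it listens on an external interface (rather than localhost only), clients on other machines can reach the same OpenAI-style endpoints over the LAN. The address, port, and endpoint path below are assumptions; substitute your own deployment's values.

```python
# Sketch of reaching Lemonade Server from another machine, assuming it was
# launched with the new host parameter so it accepts external connections.
# The LAN address, port, and /api/v1 path are assumptions for illustration.
import requests

SERVER_ADDR = "http://192.168.1.50:8000/api/v1"  # hypothetical LAN address

resp = requests.get(f"{SERVER_ADDR}/models", timeout=30)
resp.raise_for_status()
for model in resp.json().get("data", []):
    print(model.get("id"))
```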
Bug Fixes
- Fix backwards compatibility bug in model_manager.py by @jeremyfowers in #109
- Fix inference engine import by @jeremyfowers in #120
- Enable `wmi` on Python embeddable distribution by @danielholanda in #140
- Fix `lemonade-server run` on Linux ('OutputDuplicator' is not defined) by @danielholanda in #148
- Move OGA API examples to pre-built OGA model, and remove hf token from test by @jeremyfowers in #156
- Improve HIP ID detection mechanism by @danielholanda in #151
Documentation
- FAQ - New Questions by @vgodsoe in #138
- Some more faq.md edits by @jeremyfowers in #143
New Contributors
Full Changelog: v8.1.1...v8.1.2