This is a patch release containing bug fixes for parallel request support with llama.cpp models.
What's Changed
Bug fixes 🐛
- fix(llama.cpp): Enable parallel requests by @tauven in #1616
- fix(llama.cpp): enable cont batching when parallel is set by @mudler in #1622
👒 Dependencies
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #1623
Other Changes
- ⬆️ Update docs version mudler/LocalAI by @localai-bot in #1619
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #1620
- ⬆️ Update ggerganov/llama.cpp by @localai-bot in #1626
Full Changelog: v2.6.0...v2.6.1