What's New
- Model Discovery: Discover new LLMs from HuggingFace, right from GPT4All! (83c76be)
- Support GPU offload of Gemma's output tensor (#1997)
- Enable Kompute support for 10 more model architectures (#2005)
  - These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM, MiniCPM, Orion, Qwen, and StarCoder.
- Expose min_p sampling parameter of llama.cpp by @chrisbarrera in #2014
- Default to a blank line between reply and next prompt for templates without %2 (#1996)
- Add Nous-Hermes-2-Mistral-7B-DPO to official models list by @ThiloteE in #2027
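The newly exposed min_p parameter filters the sampling distribution by keeping only tokens whose probability is at least min_p times the probability of the most likely token, then renormalizing. A minimal NumPy sketch of that filtering step (the function name and example values are illustrative, not the actual llama.cpp implementation):

```python
import numpy as np

def min_p_filter(logits, min_p=0.05):
    """Illustrative min_p sampling filter: drop tokens whose probability
    falls below min_p times the top token's probability, then renormalize."""
    # Softmax with max-subtraction for numerical stability
    probs = np.exp(logits - np.max(logits))
    probs /= probs.sum()
    # Threshold scales with the most likely token's probability
    threshold = min_p * probs.max()
    filtered = np.where(probs >= threshold, probs, 0.0)
    return filtered / filtered.sum()

# Tokens well below 10% of the top token's probability are removed
logits = np.array([4.0, 3.0, 1.0, -2.0])
filtered = min_p_filter(logits, min_p=0.1)
```

Unlike a fixed top_p cutoff, the threshold here adapts to the model's confidence: when the top token dominates, more of the tail is pruned.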
Fixes
- Fix compilation warnings on macOS (e7f2ff1)
- Fix crash when a ChatGPT API key is set, and properly hide non-ChatGPT settings (#2003)
- Fix crash when adding or removing a clone, a regression in v2.7.1 (#2031)
- Fix layer norm epsilon value in BERT model (#1946)
- Fix clones being created with the wrong number of GPU layers (#2011)
New Contributors
- @TareHimself made their first contribution in #1897
Full Changelog: v2.7.1...v2.7.2