koboldcpp-1.18
- This release brings a new feature to Kobold Lite: Group Conversations. In chat mode, you can now specify multiple Chat Opponents (delimited with `||$||`), which triggers a simulated group chat and allows the AI to reply as different characters. Note that this does not work very well with Pygmalion models, as they were trained mainly on one-on-one chat; it does seem to work well with LLAMA-based models, however. Each chat opponent adds a custom stopping sequence (max 10); see the API sketch after this list. Works best with Multiline Replies disabled. To demonstrate this, a new scenario, Class Reunion, has been added in Kobold Lite.
- Added a new flag `--highpriority`, which increases the CPU priority of the process, potentially speeding up generation timings. See #133; your mileage may vary depending on memory bottlenecks. Do share if you experience significant speedups. A rough illustration of the idea follows the list below.
- Added the `--usemlock` parameter to keep the model in RAM, for Apple M1 users (also sketched below).
- Fixed a stop_sequence bug which caused a crash.
- Added error information display when the tkinter GUI fails to load.
- Pulled upstream changes and fixes.
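If you drive the server's API directly rather than using Kobold Lite, here is a minimal sketch of how multiple opponents can map onto stopping sequences. It assumes the KoboldAI-compatible `/api/v1/generate` endpoint and its `stop_sequence` field; the exact payload Lite builds is not shown here, so treat this as an illustration rather than Lite's actual implementation.

```python
import requests

# One stopping sequence per chat opponent (the release notes cap this at 10).
opponents = ["Alice", "Bob", "Carol"]

payload = {
    "prompt": "You: Hi everyone, long time no see!\nAlice:",
    "max_length": 80,
    # Stop generating whenever the model starts speaking as another participant,
    # so each reply stays attributed to a single character.
    "stop_sequence": [f"\n{name}:" for name in ["You"] + opponents],
}

resp = requests.post("http://localhost:5001/api/v1/generate", json=payload)
print(resp.json()["results"][0]["text"])
```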
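For a sense of what `--highpriority` does under the hood, here is a rough, hypothetical equivalent using `psutil`; this is not koboldcpp's actual code, just what raising process priority typically looks like.

```python
import sys
import psutil

# Ask the OS scheduler to favor this process over others.
proc = psutil.Process()
if sys.platform == "win32":
    proc.nice(psutil.HIGH_PRIORITY_CLASS)  # Windows priority class
else:
    proc.nice(-5)  # lower niceness = higher priority; usually needs privileges
```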
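Similarly, `--usemlock` asks that the model's pages be pinned in RAM so they cannot be swapped out. Conceptually this boils down to an `mlock(2)` call, sketched here via ctypes as an illustration, not the project's implementation:

```python
import ctypes
import ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)

def pin_in_ram(address: int, length: int) -> None:
    # mlock(2) prevents the given pages from being swapped out; this is the
    # mechanism behind keeping model weights resident in memory.
    if libc.mlock(ctypes.c_void_p(address), ctypes.c_size_t(length)) != 0:
        raise OSError(ctypes.get_errno(), "mlock failed (check RLIMIT_MEMLOCK)")
```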
To use, download and run koboldcpp.exe, which is a one-file pyinstaller build.
Alternatively, drag and drop a compatible ggml model onto the .exe, or run it and manually select the model in the popup dialog.
Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
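As a quick smoke test that the server is reachable, assuming the default port and the KoboldAI-compatible API:

```python
import requests

# Returns the name of the currently loaded model if the server is up.
print(requests.get("http://localhost:5001/api/v1/model").json())
```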
For more information, be sure to run the program with the `--help` flag.