koboldcpp-1.85

Now with 5% more kobo edition

  • NEW: Added Server-Sided (networked) save slots! You can now specify a database file when launching KoboldCpp with --savedatafile. You can then save and load persistent stories on that KoboldCpp server and access them from any other browser or device connected to it over the network. This can also be combined with --password to require an API key for saving and loading stories (see the example launch command after this list).
  • Added Top-N Sigma sampler (credit @EquinoxPsychosis). Note that this sampler can only be combined with Top-K, Temperature, and XTC.
  • Added --exportconfig, allowing users to export any set of launch arguments as a .kcpps config file from the command line (a sketch follows this list). This file can also be used later for model switching in admin mode.
  • Minor refactors for the TFS and repetition penalty samplers by @Reithan
  • Fixed .kcppt template backend overrides not being applied
  • Updated the clinfo binary for Windows.
  • Updated Kobold Lite, multiple fixes and improvements
    • Added improved thinking support: display thoughts, allow force-injecting <think> tokens into AI replies, and filter out old thoughts in subsequent generations.
    • Reworked and improved load/save UI, added 2 extra local slots and 8 extra remote save slots.
    • Added Top-N Sigma support
    • Added customization options for assistant jailbreak prompt
    • Refactored 3rd party scenario loader (thanks @Desaroll)
  • Merged fixes and improvements from upstream (including Vulkan and CUDA enhancements, and Granite support)
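
As a rough sketch of the new server-sided saves, a launch command might look like the following. The model filename, port, and password are placeholder values, and the exact flag behavior may differ:

    # Hypothetical launch: mymodel.gguf, port 5001, and mysecretkey are placeholders.
    # --savedatafile points at the database file that holds the networked save slots,
    # and --password gates saving/loading behind an API key.
    ./koboldcpp --model mymodel.gguf --port 5001 --savedatafile saves.db --password mysecretkey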
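
A similar sketch for --exportconfig; the exact argument syntax here is an assumption based on the description above:

    # Hypothetical usage: export the given launch arguments as a .kcpps config
    # file from the command line.
    ./koboldcpp --model mymodel.gguf --contextsize 4096 --exportconfig myconfig.kcpps
    # The exported config can then be reused, e.g. for model switching in admin mode:
    ./koboldcpp --config myconfig.kcpps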

To use, download and run koboldcpp.exe, which is a single-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you have an Nvidia GPU, but use an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe
If you have a newer Nvidia GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster).
If you're using Linux, select the appropriate Linux binary file instead (not exe).
If you're on a modern Mac with Apple silicon (M1, M2, M3), you can try the koboldcpp-mac-arm64 macOS binary.
If you're using AMD, we recommend trying the Vulkan option (available in all releases) first for best support. Alternatively, you can try koboldcpp_rocm from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model is loaded, you can connect at the address below (or use the full KoboldAI client):
http://localhost:5001
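
If you'd rather test the connection from the command line, a minimal request against the KoboldAI-compatible API might look like this (the prompt and max_length values are placeholders):

    # Query the KoboldAI-compatible generate endpoint on a running server.
    curl -X POST http://localhost:5001/api/v1/generate \
        -H "Content-Type: application/json" \
        -d '{"prompt": "Once upon a time", "max_length": 50}'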

For more information, be sure to run the program from the command line with the --help flag. You can also refer to the readme and the wiki.
