koboldcpp-1.46.1

Important: Deprecation Notice for KoboldCpp 1.46

The following command line arguments are deprecated and have been removed from this version on.

--psutil_set_threads - parameter will be removed as it's now generally unhelpful, the defaults are usually sufficient.
--stream - a Kobold Lite only parameter, which is now a toggle saved inside Lite's settings and thus no longer necessary.
--unbantokens - EOS unbans should only be set via the generate API, in the use_default_badwordsids json field.
--usemirostat - Mirostat values should only be set via the generate API, in the mirostat mirostat_tau and mirostat_eta json fields.

Removed the original deprecated tkinter GUI, now only the new customtkinter GUI remains.
Improved embedded horde worker, added even more session stats, job pulls and job submits are now done in parallel so it should run about 20% faster for horde requests.
Changed the default model name from concedo/koboldcpp to koboldcpp/[model_filename]. This does prevent old "Kobold AI-Client" users from connecting via the API, so if you're still using that, either switch to a newer client or connect via the Basic/OpenAI API instead of the Kobold API.
Added proper API documentation, which can be found by navigating to /api or the web one at https://lite.koboldai.net/koboldcpp_api
Allow .kcpps files to be drag & dropped, as well as working via OpenWith in windows.
Added a new OpenAI Chat Completions compatible endpoint at /v1/chat/completions (credit: @teddybear082)
--onready processes are now started with subprocess.run instead of Popen (#462)
Both /check and /abort can now function together with multiuser mode, provided the correct genkey is used by the client (automatically handled in Lite).
Allow 64k --contextsize (for GGUF only, still 16k otherwise).
Minor UI fixes and enhancements.
Updated Lite, pulled fixes and improvements from upstream.

v1.46.1 hotfix: fixed an issue where blasthreads was used for values between 1 and 32 tokens.

To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
and then once loaded, you can connect like this (or use the full koboldai client):
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.

LostRuins/koboldcpp v1.46.1 koboldcpp-1.46.1 on GitHub

koboldcpp-1.46.1

Important: Deprecation Notice for KoboldCpp 1.46

LostRuins/koboldcpp v1.46.1
koboldcpp-1.46.1

on GitHub