koboldcpp-1.55.1

Added Dynamic Temperature (DynaTemp), which is specified by a Temperature Value and a Temperature Range (Credits: @kalomaze). When enabled, the effective temperature is adjusted automatically within Temperature ± DynaTempRange. For example, setting temperature=0.4 and dynatemp_range=0.1 gives a minimum temperature of 0.3 and a maximum of 0.5. For ease of use, Lite also provides a UI for selecting the DynaTemp minimum and maximum temperature directly. Either form of input works, and each automatically updates the other.
Now tries to reuse an existing cloudflared file when running a remote tunnel, and also handles the case where cloudflared fails to download correctly.
Added a field to show the most recently used seed in the perf endpoint.
Switched CUDA pool malloc back to the old implementation.
Updated Lite, added support for DynaTemp.
Merged new improvements and fixes from upstream llama.cpp.
Various minor fixes.
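
The DynaTemp range arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the example in these notes, not the actual koboldcpp or Lite code: the function names are hypothetical, and clamping the lower bound at 0 is an assumption.

```python
def dynatemp_bounds(temperature, dynatemp_range):
    """Return (min_temp, max_temp) for temperature ± dynatemp_range."""
    # Assumption: the lower bound is clamped at 0, since a negative
    # temperature is not meaningful for sampling.
    low = max(0.0, temperature - dynatemp_range)
    high = temperature + dynatemp_range
    return low, high


def dynatemp_from_bounds(min_temp, max_temp):
    """Inverse mapping: recover (temperature, dynatemp_range) from min/max."""
    temperature = (min_temp + max_temp) / 2
    dynatemp_range = (max_temp - min_temp) / 2
    return temperature, dynatemp_range


low, high = dynatemp_bounds(0.4, 0.1)
print(round(low, 3), round(high, 3))  # 0.3 0.5
```

The second function mirrors the idea that a min/max pair and a temperature/range pair carry the same information, which is presumably why either input can update the other in the Lite UI.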

v1.55.1 - Attempts to fix some CUDA issues on Pascal cards. As I don't have a Pascal card I cannot verify this, but try this version if 1.55 didn't work.

To use, download and run koboldcpp.exe, which is a one-file PyInstaller executable.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model is loaded, you can connect like this (or use the full KoboldAI client):
http://localhost:5001
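
Besides opening that address in a browser, a script could talk to the local server over HTTP. The sketch below is a rough illustration only: the /api/v1/generate path and the prompt and max_length fields are assumed to follow the KoboldAI API convention, the dynatemp_range field name is taken from the example in these notes, and the helper names are hypothetical.

```python
import json
import urllib.request


def build_generate_request(prompt, temperature=0.4, dynatemp_range=0.1,
                           base_url="http://localhost:5001"):
    """Build (but do not send) a POST request for a text generation call."""
    payload = {
        "prompt": prompt,
        "max_length": 80,                  # assumed field name
        "temperature": temperature,
        "dynatemp_range": dynatemp_range,  # name as used in the notes above
    }
    return urllib.request.Request(
        base_url + "/api/v1/generate",     # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a model loaded and the server running, `send(build_generate_request("Once upon a time"))` would return the parsed JSON reply. Check the server's own API documentation for the exact fields it accepts.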

For more information, be sure to run the program from command line with the --help flag.

LostRuins/koboldcpp v1.55.1 koboldcpp-1.55.1 on GitHub
