github LostRuins/koboldcpp v1.50.1
koboldcpp-1.50.1

  • Improved automatic GPU layer selection: In the GUI launcher with CuBLAS, it will now automatically select all layers to do a full GPU offload if it thinks you have enough VRAM to support it.
  • Added a short delay to the Abort function in Lite, hopefully fixes the glitches with retry and abort.
  • Fixed automatic RoPE values for Yi and Deepseek. If no --ropeconfig is set, the preconfigured rope values in the model now take priority over the automatic context rope scale.
  • The above fix should also allow YaRN RoPE scaled models to work correctly by default, assuming the model has been correctly converted. Note: Custom YaRN configuration flags are not yet available.
  • The OpenAI compatible /v1/completions endpoint has been enhanced with extra unofficial parameters that Aphrodite uses, such as Min-P, Top-A and Mirostat. However, the OpenAI API does not support separate memory fields or sampler order, so the native Kobold API will still give better results there.
  • SSE streaming support has been added for the OpenAI /v1/completions endpoint (tested and working in SillyTavern).
  • Custom DALL-E endpoints are now supported, for use with OAI proxies.
  • Pulled fixes and improvements from upstream, and updated Kobold Lite.
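As a rough illustration of the enhanced endpoint, the sketch below builds a request for the OpenAI-compatible /v1/completions route, including the unofficial sampler parameters mentioned above. The exact field names (min_p, top_a, mirostat) follow the Aphrodite convention and are assumptions here; confirm them against your build's API documentation.

```python
# Hypothetical sketch: assemble a /v1/completions payload for koboldcpp,
# including the unofficial Aphrodite-style sampler fields (assumed names).
import json

def build_completion_request(prompt, stream=False):
    """Build a completions payload with extended sampler parameters."""
    return {
        "prompt": prompt,
        "max_tokens": 80,
        "temperature": 0.7,
        "min_p": 0.05,      # unofficial: Min-P sampling (assumed field name)
        "top_a": 0.0,       # unofficial: Top-A sampling (assumed field name)
        "mirostat": 2,      # unofficial: Mirostat mode (assumed field name)
        "stream": stream,   # request SSE streaming when True
    }

payload = build_completion_request("Once upon a time", stream=True)
print(json.dumps(payload, indent=2))
# Send it to a running instance with any OpenAI-compatible client pointed
# at http://localhost:5001/v1/completions (default port).
```

Setting "stream": true makes the server deliver tokens as server-sent events, which is what SillyTavern consumes for live streaming.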

To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try the koboldcpp_rocm build from YellowRoseCx's fork.

Hotfix 1.50.1:

  • Fixed a regression with older RWKV/GPT-2/GPT-J/GPT-NeoX models that caused a segfault.
  • If --ropeconfig is not set, apply an automatic linear rope scaling multiplier for rope-tuned models such as Yi when they are used outside their original context limit.
  • Fixed another bug in Lite with the retry/abort button.
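The linear rope scaling behavior in the hotfix can be illustrated with a small sketch. This is not koboldcpp's actual code, just the general idea: the multiplier stays at 1.0 inside the model's trained context window and grows linearly once the requested context exceeds it.

```python
# Illustrative only (not koboldcpp's implementation): a linear RoPE scale
# multiplier applied when the requested context exceeds the trained context.
def auto_linear_rope_scale(requested_ctx, trained_ctx):
    """Return a linear scaling factor; 1.0 inside the trained window."""
    if requested_ctx <= trained_ctx:
        return 1.0
    return requested_ctx / trained_ctx

print(auto_linear_rope_scale(8192, 4096))  # 2.0: doubling the context
print(auto_linear_rope_scale(2048, 4096))  # 1.0: inside the trained limit
```

Rope-tuned models like Yi ship their own preconfigured rope values, which is why (per the fix above) those take priority and this multiplier only kicks in beyond the original context limit.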

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once loaded, you can connect at the address below (or use the full KoboldAI client):
http://localhost:5001
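For a programmatic connection, here is a minimal sketch using the native Kobold API on the default port. The /api/v1/generate route and field names follow the standard KoboldAI API convention, but treat them as assumptions and confirm against --help or the bundled API docs.

```python
# Minimal sketch: call a running koboldcpp instance via its native Kobold API.
# Assumes the default port (5001) and the standard /api/v1/generate route.
import json
import urllib.request

def build_generate_payload(prompt, max_length=64):
    """Assemble a minimal generate request body."""
    return {"prompt": prompt, "max_length": max_length}

def generate(prompt, url="http://localhost:5001/api/v1/generate"):
    """POST a prompt and return the first generated text result."""
    body = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]

# Requires a model loaded in koboldcpp:
# print(generate("The quick brown fox"))
```

The same server also exposes the OpenAI-compatible /v1/completions route described above, so either API style works against one running instance.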

For more information, be sure to run the program from command line with the --help flag.
