github LostRuins/koboldcpp v1.50.1
koboldcpp-1.50.1

  • Improved automatic GPU layer selection: In the GUI launcher with CuBLAS, it will now automatically select all layers to do a full GPU offload if it thinks you have enough VRAM to support it.
  • Added a short delay to the Abort function in Lite, hopefully fixes the glitches with retry and abort.
  • Fixed automatic RoPE values for Yi and Deepseek. If no --ropeconfig is set, the preconfigured rope values in the model now take priority over the automatic context rope scale.
  • The above fix should also allow YaRN RoPE scaled models to work correctly by default, assuming the model has been correctly converted. Note: Custom YaRN configuration flags are not yet available.
  • The OpenAI compatible /v1/completions endpoint has been enhanced with extra unofficial parameters that Aphrodite uses, such as Min-P, Top-A and Mirostat. However, the OpenAI API does not support separate memory fields or sampler order, so the native Kobold API will still give better results there.
  • SSE streaming support has been added for the OpenAI /v1/completions endpoint (tested and working in SillyTavern).
  • Custom DALL-E endpoints are now supported, for use with OAI proxies.
  • Pulled fixes and improvements from upstream, and updated Kobold Lite.
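As a rough illustration of the enhanced endpoint, the sketch below builds a request for the OpenAI-compatible /v1/completions route, including the unofficial sampler parameters mentioned above. The exact field names (min_p, top_a, mirostat) follow the Aphrodite convention and are assumptions here; confirm them against your build's API documentation.

```python
# Hypothetical sketch: assemble a /v1/completions payload for koboldcpp,
# including the unofficial Aphrodite-style sampler fields (assumed names).
import json

def build_completion_request(prompt, stream=False):
    """Build a completions payload with extended sampler parameters."""
    return {
        "prompt": prompt,
        "max_tokens": 80,
        "temperature": 0.7,
        "min_p": 0.05,      # unofficial: Min-P sampling (assumed field name)
        "top_a": 0.0,       # unofficial: Top-A sampling (assumed field name)
        "mirostat": 2,      # unofficial: Mirostat mode (assumed field name)
        "stream": stream,   # request SSE streaming when True
    }

payload = build_completion_request("Once upon a time", stream=True)
print(json.dumps(payload, indent=2))
# Send it to a running instance with any OpenAI-compatible client pointed
# at http://localhost:5001/v1/completions (default port).
```

Setting "stream": true makes the server deliver tokens as server-sent events, which is what SillyTavern consumes for live streaming.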

To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try the koboldcpp_rocm build from YellowRoseCx's fork.

Hotfix 1.50.1:

  • Fixed a regression with older RWKV/GPT-2/GPT-J/GPT-NeoX models that caused a segfault.
  • If --ropeconfig is not set, apply an automatic linear rope scaling multiplier for rope-tuned models such as Yi when they are used outside their original context limit.
  • Fixed another bug in Lite with the retry/abort button.
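The linear rope scaling behavior in the hotfix can be illustrated with a small sketch. This is not koboldcpp's actual code, just the general idea: the multiplier stays at 1.0 inside the model's trained context window and grows linearly once the requested context exceeds it.

```python
# Illustrative only (not koboldcpp's implementation): a linear RoPE scale
# multiplier applied when the requested context exceeds the trained context.
def auto_linear_rope_scale(requested_ctx, trained_ctx):
    """Return a linear scaling factor; 1.0 inside the trained window."""
    if requested_ctx <= trained_ctx:
        return 1.0
    return requested_ctx / trained_ctx

print(auto_linear_rope_scale(8192, 4096))  # 2.0: doubling the context
print(auto_linear_rope_scale(2048, 4096))  # 1.0: inside the trained limit
```

Rope-tuned models like Yi ship their own preconfigured rope values, which is why (per the fix above) those take priority and this multiplier only kicks in beyond the original context limit.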

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once loaded, you can connect at the address below (or use the full KoboldAI client):
http://localhost:5001
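For a programmatic connection, here is a minimal sketch using the native Kobold API on the default port. The /api/v1/generate route and field names follow the standard KoboldAI API convention, but treat them as assumptions and confirm against --help or the bundled API docs.

```python
# Minimal sketch: call a running koboldcpp instance via its native Kobold API.
# Assumes the default port (5001) and the standard /api/v1/generate route.
import json
import urllib.request

def build_generate_payload(prompt, max_length=64):
    """Assemble a minimal generate request body."""
    return {"prompt": prompt, "max_length": max_length}

def generate(prompt, url="http://localhost:5001/api/v1/generate"):
    """POST a prompt and return the first generated text result."""
    body = json.dumps(build_generate_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"][0]["text"]

# Requires a model loaded in koboldcpp:
# print(generate("The quick brown fox"))
```

The same server also exposes the OpenAI-compatible /v1/completions route described above, so either API style works against one running instance.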

For more information, be sure to run the program from command line with the --help flag.
