github LostRuins/koboldcpp v1.40.1a
koboldcpp-1.40.1
13 months ago

This release is mostly for bugfixes to the previous one, but enough small stuff has changed that I chose to make it a new version instead of a patch for the previous one.

  • Fixed a regression in format detection for LLAMA 70B.
  • Converted the embedded horde worker into daemon mode, which should hopefully resolve the occasional exceptions.
  • Fixed some OOMs for blasbatchsize 2048, adjusted buffer sizes
  • Slight modification to the look ahead (2 to 5%) for the cuda pool malloc.
  • Pulled some bugfixes from upstream
  • Added a new idle field to the /api/extra/perf endpoint, which lets you check whether a generation is in progress without submitting one.
  • Fixed cmake compilation for cudatoolkit 12.
  • Updated Lite, which now includes an option for an aesthetic instruct UI (early beta by @Lyrcaxis, please send them your feedback).
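
As a rough illustration of the new idle field, here is a minimal Python sketch that polls /api/extra/perf. It assumes the server is on the default address shown in these notes, that the endpoint returns a JSON object, and that idle is truthy when no generation is running. The exact response shape is not documented here, and the helper names fetch_perf and is_idle are made up for this example.

```python
import json
import urllib.request

# Default local address from these notes; change it if you launched on another port.
PERF_URL = "http://localhost:5001/api/extra/perf"


def is_idle(perf_payload: dict) -> bool:
    """Interpret the 'idle' field of a perf response.

    Assumption: the field is truthy (e.g. 1 or true) when nothing is generating.
    A missing field is treated as "not idle" to stay on the safe side.
    """
    return bool(perf_payload.get("idle", False))


def fetch_perf(url: str = PERF_URL, timeout: float = 5.0) -> dict:
    """Fetch the perf endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a koboldcpp instance running, is_idle(fetch_perf()) should report whether it is safe to submit a new generation. Keeping the parsing separate from the network call makes the check easy to test without a live server.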

hotfix 1.40.1:

  • handle stablecode-completion-alpha-3b

To use, download and run koboldcpp.exe, which is a one-file PyInstaller executable.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model is loaded, you can connect at the following address (or use the full KoboldAI client):
http://localhost:5001
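
If you are scripting against the server, it can help to confirm that it is reachable before connecting. The sketch below uses only the Python standard library; the helper name is_server_up is hypothetical, and the only thing taken from these notes is the default address.

```python
import urllib.error
import urllib.request

# Default address from these notes.
BASE_URL = "http://localhost:5001"


def is_server_up(url: str = BASE_URL, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP requests at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status, so it is running.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar errors.
        return False
```

Calling is_server_up() in a loop with a short sleep is one simple way to wait for the model to finish loading. Note that it only confirms that an HTTP server is listening on that port, not that the listener is koboldcpp.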

For more information, run the program from the command line with the --help flag.