github LostRuins/koboldcpp v1.65
koboldcpp-1.65

6 months ago

at least we have a shovel edition

  • NEW: Added a new standalone UI for Image Generation, thanks to @ayunami2000 for porting StableUI (original by @aqualxx) to KoboldCpp! You now have a powerful, dedicated, A1111-compatible GUI for generating images locally, with a look and feel similar to Automatic1111. It runs in your browser and launches straight from KoboldCpp: simply load a Stable Diffusion model and visit http://localhost:5001/sdui/
  • NEW: Added official CUDA 12 binaries. If you have a newer NVIDIA GPU and don't mind larger files, you may get increased speeds by using the CUDA 12 build koboldcpp_cu12.exe
  • Added a new API field bypass_eos to skip EOS tokens while still allowing them to be generated.
  • Hopefully fixed tk window resizing issues
  • Increased interrogate mode token amount by 30%, and increased default chat completions token amount by 250%
  • Merged improvements and fixes from upstream
  • Updated Kobold Lite:
    • Added option to insert Instruct System Prompt
    • Added option to bypass (skip) EOS
    • Added toggle to return special tokens
    • Added Chat Names insertion for instruct mode
    • Added button to launch StableUI
    • Various minor fixes, and added support for importing cards from CharacterHub URLs.
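
As a rough illustration of the new bypass_eos API field, here is a minimal Python sketch that builds a generate request without sending it. Only bypass_eos and the localhost:5001 address come from these notes; the /api/v1/generate path and the prompt and max_length field names are assumptions based on the usual KoboldAI-style API, so check them against your build before relying on them.

```python
import json
import urllib.request

# Default address, taken from the URLs quoted in these notes.
BASE_URL = "http://localhost:5001"


def build_generate_request(prompt, max_length=80, bypass_eos=False):
    """Build (but do not send) a POST request carrying the bypass_eos field."""
    payload = {
        "prompt": prompt,          # assumed field name
        "max_length": max_length,  # assumed field name for the token budget
        "bypass_eos": bypass_eos,  # field added in this release
    }
    return urllib.request.Request(
        BASE_URL + "/api/v1/generate",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("Once upon a time", bypass_eos=True)
print(req.full_url)
print(req.data.decode("utf-8"))
```

With a model loaded, passing req to urllib.request.urlopen would submit it; the sketch stops short of that so it can be inspected without a running server.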

Important Deprecation Notice: The flags --smartcontext, --hordeconfig and --sdconfig are being deprecated.

--smartcontext is no longer as useful nowadays with context shifting, and just adds clutter and confusion. With its removal, smartcontext is used only as a fallback: when contextshift is enabled but unavailable, such as with old models. --noshift can still be used to turn both behaviors off.
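
The fallback rule described above can be sketched as a small decision function. This is only an illustration of the stated behavior, not KoboldCpp's actual code; the function name, its parameters, and the returned labels are hypothetical.

```python
def context_strategy(noshift, shift_available):
    """Return which context-handling behavior applies (illustrative only).

    noshift:          True if the user passed --noshift.
    shift_available:  True if the loaded model supports context shifting.
    """
    if noshift:
        # --noshift turns both behaviors off.
        return "none"
    if shift_available:
        return "contextshift"
    # Context shifting is enabled but unavailable (e.g. an old model),
    # so smartcontext is used as the fallback.
    return "smartcontext"


for noshift in (False, True):
    for available in (True, False):
        print(noshift, available, context_strategy(noshift, available))
```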

--hordeconfig and --sdconfig are being replaced. As the number of configuration options for these arguments grows, the order of their positional values confuses people and makes it very difficult to add new flags and toggles, since a misplaced new parameter breaks existing ones. The positional format also prevented me from properly validating each input for data type and range.

As this is a large change, these deprecated flags will remain functional for now. However, you are strongly advised to switch over to the new replacement flags below:

Replacement Flags:

--hordemodelname  Sets your AI Horde display model name.
--hordeworkername Sets your AI Horde worker name.
--hordekey        Sets your AI Horde API key.
--hordemaxctx     Sets the maximum context length your worker will accept.
--hordegenlen     Sets the maximum number of tokens your worker will generate.

--sdmodel     Specify a stable diffusion model to enable image generation.
--sdthreads   Use a different number of threads for image generation if specified. 
--sdquant     If specified, loads the model quantized to save memory.
--sdclamped   If specified, limit generation steps and resolution settings for shared use.
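
To show how the replacement flags might be assembled into a launch command, here is a minimal Python sketch. The flag names are the ones listed above; the helper name, the example values, and the treatment of --sdquant and --sdclamped as value-less toggles (inferred from the "if specified" wording) are assumptions, so confirm the exact syntax with --help.

```python
import shlex

# Map of short, hypothetical setting names to the replacement Horde flags.
HORDE_FLAGS = {
    "modelname": "--hordemodelname",
    "workername": "--hordeworkername",
    "key": "--hordekey",
    "maxctx": "--hordemaxctx",
    "genlen": "--hordegenlen",
}


def horde_and_sd_args(horde=None, sd_model=None, sd_threads=None,
                      sd_quant=False, sd_clamped=False):
    """Build a list of named flags in place of the old positional configs."""
    args = []
    for name, value in (horde or {}).items():
        if name not in HORDE_FLAGS:
            raise ValueError(f"unknown horde setting: {name}")
        args += [HORDE_FLAGS[name], str(value)]
    if sd_model:
        args += ["--sdmodel", sd_model]
    if sd_threads is not None:
        args += ["--sdthreads", str(sd_threads)]
    if sd_quant:
        args.append("--sdquant")    # assumed to be a toggle with no value
    if sd_clamped:
        args.append("--sdclamped")  # assumed to be a toggle with no value
    return args


# Example values below are placeholders, not real names or paths.
args = horde_and_sd_args(
    horde={"workername": "my-worker", "maxctx": 2048},
    sd_model="model.safetensors",
    sd_quant=True,
)
print(shlex.join(["koboldcpp.exe"] + args))
```

Because each setting is named, adding or omitting one does not shift the meaning of the others, which is the problem the positional arguments had.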

To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.
If you have a newer NVIDIA GPU, you can use the CUDA 12 version koboldcpp_cu12.exe (much larger, slightly faster).
If you're using Linux, select the appropriate Linux binary file instead (not exe).
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model is loaded, you can connect at the address below (or use the full KoboldAI client):
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.
