github LostRuins/koboldcpp v1.60.1
koboldcpp-1.60.1



KoboldCpp is just a 'Dirty Fork' edition


  • KoboldCpp now natively supports Local Image Generation, thanks to the phenomenal work done by @leejet in stable-diffusion.cpp! It provides an A1111 compatible txt2img endpoint which you can use within the embedded Kobold Lite, or in many other compatible frontends such as SillyTavern.
    • Just select a compatible SD1.5 or SDXL .safetensors fp16 model to load, either through the GUI launcher or with --sdconfig
    • Enjoy zero-install, portable, lightweight, hassle-free image generation directly from KoboldCpp, without installing multiple gigabytes' worth of ComfyUI, A1111, Fooocus or others.
    • With just an 8GB VRAM GPU, you can run a 7B q4 GGUF (lowvram) alongside any SD1.5 image model at the same time, as a single instance, fully offloaded. If you run out of VRAM, select Compress Weights (quant) to quantize the image model so it takes less memory.
    • KoboldCpp allows you to run in text-gen-only, image-gen-only or hybrid mode; simply set the appropriate launcher configs.
    • Known not to work correctly in Vulkan (for now).
  • When running from the command line, --contextsize can now be set to any arbitrary number within range instead of being locked to fixed values. However, using a non-recommended value may result in incoherent output depending on your settings. The GUI launcher remains unchanged.
  • Added new quant types, pulled and merged improvements and fixes from upstream.
  • Fixed some issues loading older GGUFv1 models; they should work again.
  • Added Cloudflare tunnel support for macOS (via --remotetunnel; however, it probably won't work on M1, only amd64).
  • Updated API docs and Colab for image gen.
  • Updated Kobold Lite:
    • Integrated support for AllTalk TTS
    • Added "Auto Jailbreak" for instruct mode, useful to wrangle stubborn or censored models.
    • Auto-enable the image gen button if KCPP loads an image model
    • Improved autoscroll and layout; defaults to SSE streaming mode
    • Added option to import and export story via clipboard
    • Added option to set personal notes/comments in story
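The A1111-compatible txt2img endpoint mentioned above can be driven from any HTTP client. Here is a minimal Python sketch, assuming a local KoboldCpp instance with an image model loaded on the default port 5001; the payload fields follow the A1111 API convention, and the specific values are illustrative only:

```python
import base64
import json
import urllib.request


def build_txt2img_payload(prompt, steps=20, width=512, height=512):
    """Build a minimal A1111-style txt2img request body."""
    return {
        "prompt": prompt,
        "negative_prompt": "",
        "steps": steps,
        "width": width,
        "height": height,
    }


def txt2img(prompt, base_url="http://localhost:5001"):
    """POST to the A1111-compatible endpoint and return raw PNG bytes."""
    req = urllib.request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(build_txt2img_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # A1111-style responses carry base64-encoded images in "images"
    return base64.b64decode(result["images"][0])


if __name__ == "__main__":
    png = txt2img("a watercolor fox in a snowy forest")
    with open("out.png", "wb") as f:
        f.write(png)
```

The same request shape works from SillyTavern or any other frontend that speaks the A1111 API.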

Update v1.60.1: Ported a fix for CVE-2024-21836 for GGUFv1, enabled the LCM sampler, allowed loading GGUF SD models, and fixed SD on Metal.

To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once loaded, you can connect like this (or use the full KoboldAI client):
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.
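Once connected, the same port also serves the KoboldAI-compatible text API. A minimal sketch, assuming a text model is loaded on the default http://localhost:5001; the sampler values here are illustrative, not recommendations:

```python
import json
import urllib.request


def build_generate_payload(prompt, max_length=80, temperature=0.7):
    """Build a minimal KoboldAI-style /api/v1/generate request body."""
    return {
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
    }


def generate(prompt, base_url="http://localhost:5001"):
    """POST the prompt and return the generated continuation text."""
    req = urllib.request.Request(
        base_url + "/api/v1/generate",
        data=json.dumps(build_generate_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    return result["results"][0]["text"]


if __name__ == "__main__":
    print(generate("Once upon a time"))
```

The full set of endpoints and parameters is listed in the API docs referenced above.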
