koboldcpp-1.60.1
KoboldCpp is just a 'Dirty Fork' edition
- KoboldCpp now natively supports Local Image Generation, thanks to the phenomenal work done by @leejet in stable-diffusion.cpp! It provides an A1111-compatible `txt2img` endpoint which you can use within the embedded Kobold Lite, or in many other compatible frontends such as SillyTavern.
- Just select a compatible SD1.5 or SDXL `.safetensors` fp16 model to load, either through the GUI launcher or with `--sdconfig`.
- Enjoy zero-install, portable, lightweight and hassle-free image generation directly from KoboldCpp, without installing multiple GBs' worth of ComfyUI, A1111, Fooocus or others.
- With just an 8GB VRAM GPU, you can run a 7B q4 GGUF (lowvram) alongside any SD1.5 image model at the same time, as a single fully offloaded instance. If you run out of VRAM, select `Compress Weights (quant)` to quantize the image model so it takes less memory.
- KoboldCpp allows you to run in text-gen-only, image-gen-only or hybrid modes; simply set the appropriate launcher configs.
- Known to not work correctly in Vulkan (for now).
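Since the endpoint is A1111-compatible, it can be driven like any other A1111 backend. The sketch below builds a minimal request; the route `/sdapi/v1/txt2img` and the payload fields follow standard A1111 API conventions and are assumptions here, not quoted from these notes:

```python
import base64
import json
from urllib import request

# Assumed defaults: KoboldCpp on port 5001, standard A1111 txt2img route.
SD_URL = "http://localhost:5001/sdapi/v1/txt2img"

def build_txt2img_request(prompt: str, steps: int = 20) -> request.Request:
    """Build a POST request carrying a minimal A1111-style payload."""
    payload = {
        "prompt": prompt,
        "negative_prompt": "",
        "width": 512,    # SD1.5-native resolution
        "height": 512,
        "steps": steps,
        "cfg_scale": 7.0,
    }
    return request.Request(
        SD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_txt2img_request("a watercolor lighthouse at dusk")
# With a running instance, the response holds base64-encoded images:
# with request.urlopen(req) as resp:
#     img_b64 = json.loads(resp.read())["images"][0]
#     open("out.png", "wb").write(base64.b64decode(img_b64))
```

The same payload works from SillyTavern or any frontend that speaks the A1111 protocol.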
- When running from the command line, `--contextsize` can now be set to any arbitrary number within range instead of being locked to fixed values. However, using a non-recommended value may result in incoherent output depending on your settings. The GUI launcher for this remains unchanged.
- Added new quant types; pulled and merged improvements and fixes from upstream.
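For example, a launch line using an in-between context size might look like the following; the model filename is a placeholder and 3072 is just an illustrative in-range value (check `--help` for exact flag spellings):

```shell
# Hypothetical example: 3072 is accepted even though it is not one of the
# fixed GUI presets; unusual values may produce incoherent output.
./koboldcpp --model mymodel.gguf --contextsize 3072
```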
- Fixed some issues loading older GGUFv1 models, they should be working again.
- Added Cloudflare tunnel support for macOS (via `--remotetunnel`; however, it probably won't work on M1, only amd64).
- Updated API docs and Colab for image gen.
- Updated Kobold Lite:
  - Integrated support for AllTalk TTS
  - Added "Auto Jailbreak" for instruct mode, useful to wrangle stubborn or censored models.
  - Auto-enable the image gen button if KCPP loads an image model
  - Improved autoscroll and layout; defaults to SSE streaming mode
  - Added option to import and export the story via clipboard
  - Added option to set personal notes/comments in the story
Update v1.60.1: Ported the fix for CVE-2024-21836 for GGUFv1, enabled the LCM sampler, allowed loading gguf SD models, and fixed SD for Metal.
To use, download and run koboldcpp.exe, which is a one-file pyinstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Once loaded, you can connect like this (or use the full koboldai client):
http://localhost:5001
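The same port also serves the KoboldAI-style text API. The sketch below builds a request for the usual `/api/v1/generate` route; the route and the sampler fields follow standard Kobold API conventions and are assumptions here, not taken from these notes:

```python
import json
from urllib import request

# Assumed default endpoint for a local KoboldCpp instance.
API_URL = "http://localhost:5001/api/v1/generate"

def build_generate_request(prompt: str, max_length: int = 64) -> request.Request:
    """Build a POST request for the KoboldAI-style generate endpoint."""
    payload = {"prompt": prompt, "max_length": max_length, "temperature": 0.7}
    return request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_generate_request("Once upon a time,")
# With a running instance, the generated text comes back as:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read())["results"][0]["text"])
```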
For more information, be sure to run the program from the command line with the `--help` flag.