koboldcpp-1.108
- Try to fix broken pipe errors due to timeouts during long tool calls
- Updated SDUI, added toggle to send img2img as a reference.
- Added ollama /api/show endpoint emulation (see the example after this list)
- Try to fix autofit on ROCm going OOM
- Improved MCP behavior with multipart content
- Prevent swapping config at runtime from changing the download directory
- Adjust GUI for fractional scaling
- Fixed incorrect output filename paths in some cases
- Improved llama.cpp UI handling of common think tags
- --autofitmode now hides the GUI layers selector
- Fixed extra spam from autofit mode
- Autofit toggle is now in the Quick Launch menu
- Autofit is now triggered if -1 gpulayers (the default) is selected and tensor splits or tensor overrides are not set. Setting your own GPU layers overrides this behavior
- Now allow the Image Gen soft limit to be overridden to 2048x2048 if the user chooses. Note that this may crash if you don't know what you're doing.
- Updated upstream stable-diffusion.cpp by @wbruna
- Updated Kobold Lite, multiple fixes and improvements
- Merged fixes, new model support, and improvements from upstream
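As a quick illustration of the new ollama /api/show endpoint emulation mentioned above, a request like the sketch below should return metadata for the loaded model. This is only a minimal example assuming the default port 5001 and that the emulation accepts the standard ollama-style JSON body; the model name shown is a placeholder.

```bash
# Query the emulated ollama /api/show endpoint on a running koboldcpp instance.
# Assumes koboldcpp is listening on the default port 5001; "my-loaded-model" is a placeholder name.
curl -s http://localhost:5001/api/show \
  -H "Content-Type: application/json" \
  -d '{"model": "my-loaded-model"}'
```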
Download and run the koboldcpp.exe (Windows) or koboldcpp-linux-x64 (Linux), which is a one-file pyinstaller for NVIDIA GPU users.
If you have an older CPU or older NVIDIA GPU and koboldcpp does not work, try the oldpc version instead (CUDA 11 + AVX1).
If you don't have an NVIDIA GPU, or do not need CUDA, you can use the nocuda version which is smaller.
If you're using AMD, we recommend trying the Vulkan option in the nocuda build first, for best support. Alternatively, you can download our rolling ROCm binary here if you use Linux.
If you're on a modern Mac (M-series), you can use the koboldcpp-mac-arm64 macOS binary.
Click here for .gguf conversion and quantization tools
Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
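For example, a command-line launch might look like the sketch below. The model path is a placeholder; --model, --contextsize, and --port are commonly used flags, and leaving --gpulayers at its default lets the new autofit behavior pick the GPU layer count.

```bash
# Example launch (Linux binary shown); the model path is a placeholder.
# --gpulayers is left at its default of -1, so autofit chooses the layer count.
./koboldcpp-linux-x64 --model ./MyModel-Q4_K_M.gguf --contextsize 4096 --port 5001
```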
Once the model is loaded, you can connect in your browser (or use the full KoboldAI client) at:
http://localhost:5001
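You can also test generation directly over the KoboldAI-compatible API. The sketch below assumes the default port and a loaded model; the prompt and max_length values are arbitrary.

```bash
# Minimal text generation request against the KoboldAI-compatible API served by koboldcpp.
curl -s http://localhost:5001/api/v1/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, my name is", "max_length": 50}'
```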
For more information, be sure to run the program from command line with the --help flag. You can also refer to the readme and the wiki.