koboldcpp-1.106
MCP for the masses edition
- NEW: MCP Server and Client Support Added to KoboldCpp - KoboldCpp now supports running an MCP bridge that serves as a direct drop-in replacement for Claude Desktop.
- KoboldCpp can connect to any HTTP or STDIO MCP server, using an `mcp.json` config format compatible with Claude Desktop.
- Multiple servers are supported; KoboldCpp will automatically combine their tools and dispatch requests appropriately.
- Recommended guide for MCP newbies: Here is a simple guide on running a Filesystem MCP Server to let your AI browse files locally on your PC and search the web - https://github.com/LostRuins/koboldcpp/wiki#mcp-tool-calling
- CAUTION: Running ANY MCP server gives it full access to your system. Its third-party scripts will be able to modify and make changes to your files. Be sure to only run servers you trust!
- The example music-playing MCP server used in the screenshot above was this audio-player-mcp
- Flash Attention is now enabled by default when using the GUI launcher.
- Improvements to tool parsing (thanks @AdamJ8)
- API field `continue_assistant_turn` is now enabled by default in all chat completions (assistant prefill)
- Interrogate image max length increased
- Various StableUI fixes by @Riztard
- Setting the environment variable `GGML_VK_VISIBLE_DEVICES` externally now always overrides whatever Vulkan device settings are configured in KoboldCpp.
- Updated Kobold Lite, multiple fixes and improvements
- Merged fixes, model support, and improvements from upstream
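The multi-server MCP setup described above can be sketched with a minimal `mcp.json`. The server package shown is the Filesystem MCP server from the linked guide; the directory path is a hypothetical example, and the format mirrors Claude Desktop's `mcpServers` layout:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/docs"]
    }
  }
}
```

Additional servers are added as sibling entries under `mcpServers`; KoboldCpp combines their tools into one set.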
Important Notice: The CLBlast backend may be removed soon, as it is very outdated and no longer receives any updates, fixes, or improvements. It can be considered superseded by the Vulkan backend. If you have concerns, please join the discussion here.
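As noted in the changelog above, setting `GGML_VK_VISIBLE_DEVICES` externally now overrides any Vulkan device selection made inside KoboldCpp. A minimal sketch (Linux shell; the launch line is illustrative and commented out):

```shell
# Pin the Vulkan backend to GPU index 0. Because this is set externally,
# it takes precedence over device settings chosen in KoboldCpp.
export GGML_VK_VISIBLE_DEVICES=0

# ./koboldcpp-linux-x64 --usevulkan   # then launch as usual
echo "$GGML_VK_VISIBLE_DEVICES"
```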
Download and run the koboldcpp.exe (Windows) or koboldcpp-linux-x64 (Linux), which is a one-file pyinstaller for NVIDIA GPU users.
If you have an older CPU or older NVIDIA GPU and koboldcpp does not work, try the oldpc version instead (CUDA 11 + AVX1).
If you don't have an NVIDIA GPU, or do not need CUDA, you can use the nocuda version which is smaller.
If you're using AMD, we recommend trying the Vulkan option in the nocuda build for best support.
If you're on a modern macOS device (M-series), you can use the koboldcpp-mac-arm64 macOS binary.
Click here for .gguf conversion and quantization tools
Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model is loaded, you can connect in your browser (or use the full KoboldAI client) at:
http://localhost:5001
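Besides the browser UI, the server also answers API requests on the same port. The sketch below builds a request body for the OpenAI-compatible chat completions endpoint and shows assistant prefill: with `continue_assistant_turn` now enabled by default (see the changelog above), a trailing assistant message is continued rather than restarted. The prompt text and token limit are illustrative:

```python
import json

# Example body for http://localhost:5001/v1/chat/completions.
# The final assistant message is a prefill: the model continues
# "Crisp leaves drift and" instead of starting a fresh reply.
payload = {
    "max_tokens": 64,
    "messages": [
        {"role": "user", "content": "Write a haiku about autumn."},
        {"role": "assistant", "content": "Crisp leaves drift and"},
    ],
}
body = json.dumps(payload)
```

Send `body` with any HTTP client, e.g. `curl -X POST -H "Content-Type: application/json" -d @- http://localhost:5001/v1/chat/completions`.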
For more information, be sure to run the program from command line with the --help flag. You can also refer to the readme and the wiki.