koboldcpp-1.56
- NEW: Added early support for new Vulkan GPU backend by @0cc4m. You can try it out with the command --usevulkan (gpu id) or via the GUI launcher. Now included with the Windows and Linux prebuilt binaries. (Note: Mixtral on Vulkan not fully supported)
- Updated and merged the new GGML backend rework from upstream. This update includes many extensive fixes, improvements and changes across over a hundred commits. Support for earlier non-gguf models has been preserved via a fossilized earlier version of the library. Please open an issue if you encounter problems. The Wiki and Readme have been updated too.
- Added support for setting dynatemp_exponent, which previously defaulted to 1.0. Support added over the API and in Lite.
- Fixed issues with Linux CUDA on Pascal, added more flags to handle conda and colab builds correctly.
- Added support for Old CPU fallbacks (NoAVX2 and Failsafe modes) in build targets in the Linux prebuilt binary (and koboldcpp.sh)
- Added missing 48k context option, fixed clearing file selection, better abort handling support, fixed aarch64 termux builds, various other fixes.
- Updated Kobold Lite with many improvements and new features:
  - NEW: Added XTTS API Server support (Local AI powered text-to-speech).
  - Added option to let AI impersonate you for a turn in a chat.
  - HD image generation options.
  - Added popup-on-complete browser notification options.
  - Improved DynaTemp wizard, added options to set exponent.
  - Bugfixes, padding adjustments, A1111 parameter fixes, image color fixes for invert color mode.
To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork.
Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once loaded, you can connect like this (or use the full koboldai client):
http://localhost:5001
For more information, be sure to run the program from command line with the --help flag.
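
As a rough illustration of setting dynatemp_exponent over the API, here is a minimal Python sketch that talks to a server at the default address above. The notes only confirm the dynatemp_exponent field and the http://localhost:5001 address; the /api/v1/generate path, the other payload fields (max_length, temperature, dynatemp_range), and the response shape are assumptions based on the KoboldAI-style API, and the helper names build_payload and generate are made up for this example. Check the API documentation of your build before relying on any of it.

```python
import json
import urllib.request

# Default address mentioned in the notes above.
BASE_URL = "http://localhost:5001"


def build_payload(prompt, temperature=0.7, dynatemp_range=0.3,
                  dynatemp_exponent=1.0, max_length=80):
    """Build a generation request body (hypothetical helper).

    Only dynatemp_exponent is confirmed by these release notes; the
    remaining field names are assumed from the KoboldAI-style API.
    """
    return {
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
        "dynatemp_range": dynatemp_range,
        # Previously fixed at 1.0; now settable per request.
        "dynatemp_exponent": dynatemp_exponent,
    }


def generate(prompt, base_url=BASE_URL, **sampler_settings):
    """POST the payload to a running server and return the generated text.

    The endpoint path and the {"results": [{"text": ...}]} response
    shape are assumptions, not taken from these notes.
    """
    payload = build_payload(prompt, **sampler_settings)
    request = urllib.request.Request(
        base_url + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["results"][0]["text"]
```

With a model loaded and the server running, something like generate("Once upon a time", dynatemp_exponent=1.5) would send the request; build_payload can be used on its own to inspect the body without contacting the server.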