koboldcpp-1.56

NEW: Added early support for the new Vulkan GPU backend by @0cc4m. You can try it out with the flag --usevulkan (gpu id) or via the GUI launcher. It is now included with the Windows and Linux prebuilt binaries. (Note: Mixtral on Vulkan is not fully supported.)
Updated and merged the new GGML backend rework from upstream. This update includes many extensive fixes, improvements and changes across over a hundred commits. Support for earlier non-gguf models has been preserved via a fossilized earlier version of the library. Please open an issue if you encounter problems. The Wiki and Readme have been updated too.
Added support for setting dynatemp_exponent, which previously defaulted to 1.0. It can be set over the API and in Lite.
Fixed issues with Linux CUDA on Pascal, added more flags to handle conda and colab builds correctly.
Added support for old CPU fallbacks (NoAVX2 and Failsafe modes) as build targets in the Linux prebuilt binary (and koboldcpp.sh).
Added missing 48k context option, fixed clearing file selection, better abort handling support, fixed aarch64 termux builds, various other fixes.
Updated Kobold Lite with many improvements and new features:
- NEW: Added XTTS API Server support (Local AI powered text-to-speech).
- Added option to let AI impersonate you for a turn in a chat.
- HD image generation options.
- Added popup-on-complete browser notification options.
- Improved DynaTemp wizard, added options to set the exponent.
- Bugfixes, padding adjustments, A1111 parameter fixes, image color fixes for invert color mode.
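
As a rough illustration of setting dynatemp_exponent over the API, here is a minimal Python sketch that builds a generation request. The endpoint path /api/v1/generate and the dynatemp_range, temperature and max_length fields are assumptions not stated in these notes, so check them against the API documentation of your build. The request is only constructed, not sent.

```python
import json
import urllib.request

# Default address taken from these notes; adjust if you launched on another port.
BASE_URL = "http://localhost:5001"


def build_generate_request(prompt, dynatemp_range=0.5, dynatemp_exponent=1.0,
                           temperature=1.0, max_length=80, base_url=BASE_URL):
    """Build (but do not send) a POST request for text generation.

    The endpoint path and every field except dynatemp_exponent are
    assumptions made for this sketch.
    """
    payload = {
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
        "dynatemp_range": dynatemp_range,        # assumed field name
        "dynatemp_exponent": dynatemp_exponent,  # named in the release notes
    }
    return urllib.request.Request(
        base_url + "/api/v1/generate",           # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_generate_request("Once upon a time", dynatemp_exponent=2.0)
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it (requires a running server):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.loads(resp.read()))
```

Leaving dynatemp_exponent at 1.0 should match the behavior of earlier versions, since that was the fixed default.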

To use, download and run koboldcpp.exe, which is a single-file PyInstaller executable.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model is loaded, you can connect at the following address (or use the full KoboldAI client):
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.
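
The command-line launch described above could be scripted roughly as follows. This is only a sketch: --usevulkan comes from these notes, while the --model and --port flag names, the executable name used as the default, and the model.gguf filename are assumptions for illustration, so confirm them with --help before relying on them.

```python
import shlex


def build_launch_command(model_path, vulkan_gpu_id=None, port=5001,
                         executable="koboldcpp.exe"):
    """Return the argument list for launching koboldcpp.

    --model and --port are assumed flag names; --usevulkan is the flag
    mentioned in the release notes.
    """
    cmd = [executable, "--model", model_path, "--port", str(port)]
    if vulkan_gpu_id is not None:
        # Opt in to the Vulkan backend on the given GPU id.
        cmd += ["--usevulkan", str(vulkan_gpu_id)]
    return cmd


if __name__ == "__main__":
    # "model.gguf" is a hypothetical model file.
    cmd = build_launch_command("model.gguf", vulkan_gpu_id=0)
    print(shlex.join(cmd))
    # To actually start the server:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Building the arguments as a list rather than a single string avoids quoting problems when the model path contains spaces.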

LostRuins/koboldcpp v1.56 koboldcpp-1.56 on GitHub
