github LostRuins/koboldcpp v1.57.1
koboldcpp-1.57.1

  • Added a benchmarking feature via --benchmark: it automatically runs a benchmark with your provided settings, outputs run parameters, timing and speed information, tests for coherence, and exits on completion. You can provide a filename, e.g. --benchmark result.csv, and CSV-formatted data will be appended to that file.
  • Added temperature Quad-Sampling (set via the API with the smoothing_factor parameter). PR from @AAbushady (credits to @kalomaze).
  • Improved timing displays: the seed used is now shown, and llama.cpp-style timings are displayed when running in --debugmode. These timings will appear faster because they exclude overheads, measuring only the specific eval functions.
  • Improved abort-generation behavior (a second user can now abort while queued).
  • Merged Vulkan enhancements from @0cc4m: APU memory handling and multi-GPU support. To use multiple GPUs, specify additional IDs, e.g. --usevulkan 0 2 3 to use the GPUs with IDs 0, 2, and 3. Allocation is determined by --tensor_split. Vulkan multi-GPU is currently configurable via the command line only; the GUI launcher does not allow selecting multiple Vulkan devices.
  • Various improvements and bugfixes merged from upstream.
  • Updated Kobold Lite with many improvements and new features:
    • NEW: The Aesthetic UI is now available for Story and Adventure modes as well!
    • Added "AI Impersonate" feature for Instruct mode.
    • Added smoothing factor, configurable in the dynamic temperature panel.
    • Added a toggle to enable printable view (unlock vertical scrolling).
    • Added a toggle to inject timestamps, allowing the AI to be aware of time passing.
    • API info for A1111 and XTTS now persists; custom negative prompts can be specified for image generation, and custom Horde keys can be specified in KCPP mode.
    • Fixes for XTTS to handle devices with over 100 voices, plus an option to narrate dialogue only.
    • Added a toggle to request that the A1111 backend save generated images to disk.
    • Fix for chub.ai card fetching.
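The new smoothing_factor is passed in the body of an ordinary generation request. A minimal sketch of such a request against a locally running koboldcpp instance (the /api/v1/generate endpoint and default port 5001 come from the KoboldAI API that koboldcpp serves; the prompt and sampler values here are illustrative, not recommendations):

```python
import json
from urllib import request

# Build a generation payload; smoothing_factor enables the new
# temperature quad-sampling (a value of 0 leaves it disabled).
payload = {
    "prompt": "Once upon a time",
    "max_length": 64,
    "temperature": 0.8,
    "smoothing_factor": 0.3,  # new in this release
}

def generate(payload, base_url="http://localhost:5001"):
    """POST the payload to a running koboldcpp server and return its reply."""
    req = request.Request(
        base_url + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Requires a running server, so the call is left commented out:
# print(generate(payload))
print(json.dumps(payload))
```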

Hotfix 1.57.1: Fixed some crashes and fixed Vulkan multi-GPU.

To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork here

Run it from the command line with your desired launch parameters (see --help), or select the model manually in the GUI.
Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
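Once the server reports it is listening, you can verify the connection programmatically before pointing a client at it. A small sketch, assuming the default port (the /api/v1/model endpoint is part of the KoboldAI API that koboldcpp serves):

```python
import json
from urllib import request

BASE_URL = "http://localhost:5001"

def current_model(base_url=BASE_URL):
    """Ask a running koboldcpp server which model it has loaded."""
    with request.urlopen(base_url + "/api/v1/model") as resp:
        return json.load(resp).get("result")

# Requires a running server, so the call is left commented out:
# print(current_model())
url = BASE_URL + "/api/v1/model"
print(url)
```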

For more information, be sure to run the program from command line with the --help flag.
