github LostRuins/koboldcpp v1.57.1
koboldcpp-1.57.1

  • Added a benchmarking feature via --benchmark: it automatically runs a benchmark with your provided settings, outputs run parameters, timing and speed information, tests for coherence, and exits on completion. You can provide a filename, e.g. --benchmark result.csv, and CSV-formatted data will be appended to that file.
  • Added temperature Quad-Sampling (set via the API with the smoothing_factor parameter). PR from @AAbushady (credits to @kalomaze).
  • Improved timing displays: the seed used is now shown, and llama.cpp-style timings are displayed when running in --debugmode. These timings will appear faster because they exclude overheads, measuring only the specific eval functions.
  • Improved abort-generation behavior (a second user can now abort while queued).
  • Merged Vulkan enhancements from @0cc4m: APU memory handling and multi-GPU support. To use multiple GPUs, specify additional IDs, e.g. --usevulkan 0 2 3 to use the GPUs with IDs 0, 2, and 3. Allocation is determined by --tensor_split. Vulkan multi-GPU is currently configurable via the command line only; the GUI launcher does not allow selecting multiple Vulkan devices.
  • Various improvements and bugfixes merged from upstream.
  • Updated Kobold Lite with many improvements and new features:
    • NEW: The Aesthetic UI is now available for Story and Adventure modes as well!
    • Added "AI Impersonate" feature for Instruct mode.
    • Added smoothing factor, configurable in the dynamic temperature panel.
    • Added a toggle to enable printable view (unlock vertical scrolling).
    • Added a toggle to inject timestamps, allowing the AI to be aware of time passing.
    • API info for A1111 and XTTS now persists; custom negative prompts can be specified for image generation, and custom Horde keys can be specified in KCPP mode.
    • Fixes for XTTS to handle devices with over 100 voices, plus an option to narrate dialogue only.
    • Added a toggle to request that the A1111 backend save generated images to disk.
    • Fix for chub.ai card fetching.
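The new smoothing_factor is passed in the body of an ordinary generation request. A minimal sketch of such a request against a locally running koboldcpp instance (the /api/v1/generate endpoint and default port 5001 come from the KoboldAI API that koboldcpp serves; the prompt and sampler values here are illustrative, not recommendations):

```python
import json
from urllib import request

# Build a generation payload; smoothing_factor enables the new
# temperature quad-sampling (a value of 0 leaves it disabled).
payload = {
    "prompt": "Once upon a time",
    "max_length": 64,
    "temperature": 0.8,
    "smoothing_factor": 0.3,  # new in this release
}

def generate(payload, base_url="http://localhost:5001"):
    """POST the payload to a running koboldcpp server and return its reply."""
    req = request.Request(
        base_url + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Requires a running server, so the call is left commented out:
# print(generate(payload))
print(json.dumps(payload))
```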

Hotfix 1.57.1: Fixed some crashes and fixed Vulkan multi-GPU.

To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork here

Run it from the command line with your desired launch parameters (see --help), or select the model manually in the GUI.
Once the model is loaded, you can connect at http://localhost:5001 (or use the full KoboldAI client).
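Once the server reports it is listening, you can verify the connection programmatically before pointing a client at it. A small sketch, assuming the default port (the /api/v1/model endpoint is part of the KoboldAI API that koboldcpp serves):

```python
import json
from urllib import request

BASE_URL = "http://localhost:5001"

def current_model(base_url=BASE_URL):
    """Ask a running koboldcpp server which model it has loaded."""
    with request.urlopen(base_url + "/api/v1/model") as resp:
        return json.load(resp).get("result")

# Requires a running server, so the call is left commented out:
# print(current_model())
url = BASE_URL + "/api/v1/model"
print(url)
```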

For more information, be sure to run the program from command line with the --help flag.
