github LostRuins/koboldcpp v1.61.2
koboldcpp-1.61.2

Finally multimodal edition


  • NEW: KoboldCpp now supports Vision via Multimodal Projectors (aka LLaVA), allowing it to perceive and react to images! Load a suitable --mmproj file or select it in the GUI launcher to use vision capabilities. (Not working on Vulkan)
    • Note: This is NOT limited to only LLaVA models, any compatible model of the same size and architecture can gain vision capabilities!
    • Simply grab a ~200MB mmproj file for your architecture here, load it with --mmproj alongside your favorite compatible model, and it will be able to see images as well!
    • KoboldCpp supports passing up to 4 images; each one consumes about 600 tokens of context (LLaVA 1.5). Additionally, KoboldCpp's token fast-forwarding and context shifting work seamlessly with images, so you only need to process each image once!
    • A compatible OpenAI GPT-4V API endpoint is emulated, so GPT-4-Vision applications should work out of the box (e.g. for SillyTavern in Chat Completions mode, just enable it). For Kobold API and OpenAI Text-Completions API, passing an array of base64 encoded images in the submit payload will work as well (planned Aphrodite compatible format).
    • An A1111 compatible /sdapi/v1/interrogate endpoint is also emulated, allowing easy captioning for other image-interrogation frontends.
    • In Kobold Lite, click any image to select from available AI Vision options.
  • NEW: Support for authentication via API keys has been added; set one with --password. The key must then be supplied as a Bearer token in the Authorization header for all text generation endpoints. Image endpoints are not secured.
  • Proper support for generating non-square images, scaling correctly based on aspect ratio
  • --benchmark limit increased to 16k context
  • Added aliases for the image sampler names for txt2img generation.
  • Added the clamped option for --sdconfig which prevents generating too large resolutions and potentially crashing due to OOM.
  • Pulled and merged improvements and fixes from upstream
    • Includes support for mamba models (CPU only). Note: mamba does not support context shifting
  • Updated Kobold Lite:
    • Added better support for displaying larger images, added support for generating portrait and landscape aspect ratios
    • Increased max image resolution in HD mode, allow downloading non-square images properly
    • Added ability to choose image samplers for image generation
    • Added ability to upload images to KoboldCpp for LLaVA usage, with 4 selectable "AI Vision" modes
    • Allow inserting images from files even when no image generation backend is selected
    • Added support for password input and using API keys over KoboldAI API
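The emulated GPT-4-Vision endpoint mentioned above accepts the standard OpenAI chat format, where images are embedded as base64 data URLs inside a message's content list. A minimal sketch of building such a payload (the structure follows the OpenAI Chat Completions convention; the endpoint path and port in the comment assume a default local KoboldCpp launch):

```python
import base64

def build_vision_payload(image_path: str, prompt: str) -> dict:
    """Build an OpenAI-style Chat Completions payload with one inline image."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "max_tokens": 200,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                # Image is passed inline as a base64 data URL
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

# With a server running, POST the payload as JSON, e.g.:
#   import json, urllib.request
#   req = urllib.request.Request(
#       "http://localhost:5001/v1/chat/completions",
#       data=json.dumps(build_vision_payload("cat.png", "Describe this.")).encode(),
#       headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
```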

Fix 1.61.1 - Fixed mamba (removed broken context shifting), merged other fixes from upstream, support uploading non-square images.
Fix 1.61.2 - Added new launch flag --ignoremissing, which deliberately ignores any missing optional files passed in (e.g. --lora, --mmproj), skipping them instead of exiting. Also, pasting images from the clipboard has been added to Lite.

To use, download and run koboldcpp.exe, which is a one-file pyinstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once loaded, you can connect (or use the full KoboldAI client) at:
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.
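If you launched with --password, clients must attach the key as a Bearer token on text generation requests. A minimal sketch of building an authenticated Kobold API request (the payload fields assume the standard /api/v1/generate format, and "mysecret" is an example key; the actual call is left commented so it only runs against a live server):

```python
import json
import urllib.request

API_KEY = "mysecret"  # example value: whatever was passed to --password

def make_request(prompt: str) -> urllib.request.Request:
    """Build an authenticated request for the Kobold text-generation API."""
    payload = {"prompt": prompt, "max_length": 80}
    return urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Required on text generation endpoints once --password is set
            "Authorization": f"Bearer {API_KEY}",
        },
    )

# With a server running:
#   with urllib.request.urlopen(make_request("Hello")) as r:
#       print(json.load(r)["results"][0]["text"])
```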
