github LostRuins/koboldcpp v1.61.2
koboldcpp-1.61.2

Finally multimodal edition


  • NEW: KoboldCpp now supports Vision via Multimodal Projectors (aka LLaVA), allowing it to perceive and react to images! Load a suitable --mmproj file or select it in the GUI launcher to use vision capabilities. (Not working on Vulkan)
    • Note: This is NOT limited to only LLaVA models, any compatible model of the same size and architecture can gain vision capabilities!
    • Simply grab a ~200MB mmproj file for your architecture here, load it with --mmproj alongside your favorite compatible model, and it will be able to see images as well!
    • KoboldCpp supports passing up to 4 images; each one consumes about 600 tokens of context (LLaVA 1.5). Additionally, KoboldCpp's token fast-forwarding and context shifting work seamlessly with images, so you only need to process each image once!
    • A compatible OpenAI GPT-4V API endpoint is emulated, so GPT-4-Vision applications should work out of the box (e.g. for SillyTavern in Chat Completions mode, just enable it). For Kobold API and OpenAI Text-Completions API, passing an array of base64 encoded images in the submit payload will work as well (planned Aphrodite compatible format).
    • An A1111 compatible /sdapi/v1/interrogate endpoint is also emulated, allowing easy captioning for other image-interrogation frontends.
    • In Kobold Lite, click any image to select from available AI Vision options.
  • NEW: Support for authentication via API keys has been added; set one with --password. The key must then be supplied as a Bearer token in the Authorization header for all text generation endpoints. Image endpoints are not secured.
  • Proper support for generating non-square images, scaling correctly based on aspect ratio
  • --benchmark limit increased to 16k context
  • Added aliases for the image sampler names for txt2img generation.
  • Added the clamped option for --sdconfig which prevents generating too large resolutions and potentially crashing due to OOM.
  • Pulled and merged improvements and fixes from upstream
    • Includes support for mamba models (CPU only). Note: mamba does not support context shifting
  • Updated Kobold Lite:
    • Added better support for displaying larger images, added support for generating portrait and landscape aspect ratios
    • Increased max image resolution in HD mode, allow downloading non-square images properly
    • Added ability to choose image samplers for image generation
    • Added ability to upload images to KoboldCpp for LLaVA usage, with 4 selectable "AI Vision" modes
    • Allow inserting images from files even when no image generation backend is selected
    • Added support for password input and using API keys over KoboldAI API
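The emulated GPT-4-Vision endpoint mentioned above accepts the standard OpenAI chat format, where images are embedded as base64 data URLs inside a message's content list. A minimal sketch of building such a payload (the structure follows the OpenAI Chat Completions convention; the endpoint path and port in the comment assume a default local KoboldCpp launch):

```python
import base64

def build_vision_payload(image_path: str, prompt: str) -> dict:
    """Build an OpenAI-style Chat Completions payload with one inline image."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "max_tokens": 200,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                # Image is passed inline as a base64 data URL
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

# With a server running, POST the payload as JSON, e.g.:
#   import json, urllib.request
#   req = urllib.request.Request(
#       "http://localhost:5001/v1/chat/completions",
#       data=json.dumps(build_vision_payload("cat.png", "Describe this.")).encode(),
#       headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
```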

Fix 1.61.1 - Fixed mamba (removed broken context shifting), merged other fixes from upstream, support uploading non-square images.
Fix 1.61.2 - Added new launch flag --ignoremissing, which deliberately ignores any missing optional files passed in (e.g. --lora, --mmproj), skipping them instead of exiting. Also, pasting images from the clipboard has been added to Lite.

To use, download and run koboldcpp.exe, which is a one-file pyinstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once loaded, you can connect (or use the full KoboldAI client) at:
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.
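If you launched with --password, clients must attach the key as a Bearer token on text generation requests. A minimal sketch of building an authenticated Kobold API request (the payload fields assume the standard /api/v1/generate format, and "mysecret" is an example key; the actual call is left commented so it only runs against a live server):

```python
import json
import urllib.request

API_KEY = "mysecret"  # example value: whatever was passed to --password

def make_request(prompt: str) -> urllib.request.Request:
    """Build an authenticated request for the Kobold text-generation API."""
    payload = {"prompt": prompt, "max_length": 80}
    return urllib.request.Request(
        "http://localhost:5001/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Required on text generation endpoints once --password is set
            "Authorization": f"Bearer {API_KEY}",
        },
    )

# With a server running:
#   with urllib.request.urlopen(make_request("Hello")) as r:
#       print(json.load(r)["results"][0]["text"])
```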
