koboldcpp-1.61.2
Finally multimodal edition
- NEW: KoboldCpp now supports Vision via Multimodal Projectors (aka LLaVA), allowing it to perceive and react to images! Load a suitable `--mmproj` file or select it in the GUI launcher to use vision capabilities. (Not working on Vulkan)
  - Note: This is NOT limited to only LLaVA models; any compatible model of the same size and architecture can gain vision capabilities!
  - Simply grab a 200mb mmproj file for your architecture here, load it with `--mmproj` and stick it into your favorite compatible model, and it will be able to see images as well!
- KoboldCpp supports passing up to 4 images; each one will consume about 600 tokens of context (LLaVA 1.5). Additionally, KoboldCpp token fast-forwarding and context-shifting work with images seamlessly, so you only need to process each image once!
- A compatible OpenAI GPT-4V API endpoint is emulated, so GPT-4-Vision applications should work out of the box (e.g. for SillyTavern in Chat Completions mode, just enable it). For the Kobold API and OpenAI Text-Completions API, passing an array of base64 encoded `images` in the submit payload will work as well (planned Aphrodite compatible format).
- An A1111 compatible `/sdapi/v1/interrogate` endpoint is also emulated, allowing easy captioning for other image-interrogation frontends.
- In Kobold Lite, click any image to select from available AI Vision options.
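As a sketch of the base64 `images` payload mentioned above, the snippet below builds a Kobold API submit body. The `/api/v1/generate` path and the `prompt`/`max_length` fields follow the standard KoboldAI API; treat the exact shape as an assumption to verify against your setup.

```python
import base64

# Raw bytes of an image file; a tiny fake PNG header stands in here.
image_bytes = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(image_bytes).decode("utf-8")

payload = {
    "prompt": "Describe the attached image.",
    "max_length": 120,
    # Up to 4 base64-encoded images; each costs roughly 600 context tokens.
    "images": [encoded],
}

# To submit against a running instance (not executed here):
# import json, urllib.request
# req = urllib.request.Request(
#     "http://localhost:5001/api/v1/generate",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# print(urllib.request.urlopen(req).read().decode())
```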
- NEW: Support for authentication via API Keys has been added; set it with `--password`. This key will be required for all text generation endpoints, using `Bearer` Authorization. Image endpoints are not secured.
- Proper support for generating non-square images, scaling correctly based on aspect ratio.
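A minimal sketch of authenticating a text generation request once `--password` is set, assuming the standard KoboldAI `/api/v1/generate` path; the key value is illustrative.

```python
import json
import urllib.request

API_KEY = "my-secret"  # the value you passed to --password at launch

# Text generation endpoints require the key via Bearer Authorization;
# image endpoints are not secured.
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_KEY}",
}

req = urllib.request.Request(
    "http://localhost:5001/api/v1/generate",  # assumed KoboldAI endpoint path
    data=json.dumps({"prompt": "Hello", "max_length": 32}).encode("utf-8"),
    headers=headers,
)
# urllib.request.urlopen(req)  # uncomment with a running server
```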
- `--benchmark` limit increased to 16k context.
- Added aliases for the image sampler names for txt2img generation.
- Added the `clamped` option for `--sdconfig`, which prevents generating too large resolutions and potentially crashing due to OOM.
- Pulled and merged improvements and fixes from upstream.
  - Includes support for mamba models (CPU only). Note: mamba does not support context shifting.
- Updated Kobold Lite:
  - Added better support for displaying larger images, added support for generating portrait and landscape aspect ratios
  - Increased max image resolution in HD mode, allow downloading non-square images properly
  - Added ability to choose image samplers for image generation
  - Added ability to upload images to KoboldCpp for LLaVA usage, with 4 selectable "AI Vision" modes
  - Allow inserting images from files even when no image generation backend is selected
  - Added support for password input and using API keys over the KoboldAI API
Fix 1.61.1 - Fixed mamba (removed broken context shifting), merged other fixes from upstream, support uploading non-square images.
Fix 1.61.2 - Added new launch flag `--ignoremissing`, which deliberately ignores any optional missing files that were passed in (e.g. `--lora`, `--mmproj`), skipping them instead of exiting. Also, paste image from clipboard has been added to Lite.
To use, download and run the koboldcpp.exe, which is a one-file pyinstaller.
If you don't need CUDA, you can use koboldcpp_nocuda.exe which is much smaller.
If you're using AMD, you can try koboldcpp_rocm at YellowRoseCx's fork here.
Run it from the command line with the desired launch parameters (see `--help`), or manually select the model in the GUI. Once loaded, you can connect like this (or use the full koboldai client):
http://localhost:5001
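As a quick connectivity check, the helper below asks a running instance which model it has loaded. The `/api/v1/model` path is the standard KoboldAI API route; treat it as an assumption and confirm against the API docs for your version.

```python
import json
import urllib.request


def loaded_model(base="http://localhost:5001"):
    """Return the model name reported by a running KoboldCpp instance.

    Assumes the standard KoboldAI API path /api/v1/model.
    """
    with urllib.request.urlopen(f"{base}/api/v1/model") as resp:
        return json.load(resp)["result"]


# With a server running:
# print(loaded_model())
```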
For more information, be sure to run the program from the command line with the `--help` flag.