koboldcpp-1.2
This is a checkpoint version which should be relatively stable and includes more release variants.
- Support for new versions of GPT2 models, for example the Cerebras models on HF.
- Prevented the TK GUI window from staying open and being annoying.
To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
Alternatively, drag and drop a compatible ggml model on top of the .exe, or run it and manually select the model in the popup dialog.
Once the model is loaded, you can connect at the address below (or use the full KoboldAI client):
http://localhost:5001
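If the page does not open, it can help to check whether anything is listening on that port yet. The following is a minimal sketch and not part of koboldcpp: the helper name `server_is_up` is hypothetical, and it only tests that a TCP connection to the host and port succeeds, not that the model has finished loading.

```python
import socket
from urllib.parse import urlparse


def server_is_up(url="http://localhost:5001", timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds.

    Hypothetical helper for illustration; it does not talk to any
    koboldcpp endpoint, it only checks that the port accepts connections.
    """
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 80  # fall back to the default HTTP port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    print("reachable" if server_is_up() else "not reachable yet")
```

A result of "not reachable yet" usually just means the model is still loading or the program was not started.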
Alternative Options:
If your CPU is very old and doesn't support AVX2 instructions, you can try running the noavx2 version. It will be slower.
If you prefer, you can download the zip file, extract it, and run the Python script manually, e.g. koboldcpp.py [ggml_model.bin]
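For anyone who wants to start the script from another Python program instead of a terminal, here is a rough sketch. It assumes only what is stated above, namely that koboldcpp.py takes the model file as its first argument. The helper names `build_command` and `launch` are hypothetical, and no other options are passed.

```python
import subprocess
import sys
from pathlib import Path


def build_command(model_path, script="koboldcpp.py"):
    """Build the argument list: <current python> koboldcpp.py <model>."""
    return [sys.executable, str(script), str(model_path)]


def launch(model_path, script="koboldcpp.py"):
    """Start the script as a child process and return the Popen handle.

    Raises FileNotFoundError early if the model file is missing, so the
    mistake is reported before the script itself is started.
    """
    if not Path(model_path).is_file():
        raise FileNotFoundError(f"model file not found: {model_path}")
    return subprocess.Popen(build_command(model_path, script))
```

Calling `launch("ggml_model.bin")` from the extracted folder would then be equivalent to running the script by hand with that model.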
To quantize an fp16 model, you can use quantize.exe from tools.zip.