koboldcpp-1.74

@LostRuins LostRuins released this 31 Aug 03:41

Kobo's all grown up now

  • NEW: Added the XTC (Exclude Top Choices) sampler, a new creative writing sampler designed by the author of DRY (@p-e-w). To enable it, set xtc_probability above 0 (recommended values to try: xtc_threshold=0.15, xtc_probability=0.5).
  • Added automatic image resizing and letterboxing for llava/minicpm images, which should improve handling of oddly sized images.
  • Added a new flag, --nomodel, which launches the Lite WebUI without loading any model at all. You can then select an external API provider such as Horde, Gemini or OpenAI.
  • MacOS now defaults to full offload when gpulayers is set to -1.
  • Minor tweaks to context shifting thresholds
  • Horde Worker now has a 5 minute timeout for each request, which should reduce the likelihood of getting stuck (e.g. on internet issues). Horde Worker also now supports connecting to SSL-secured KoboldCpp instances (remember to enable --nocertify if using self-signed certs).
  • Updated Kobold Lite, multiple fixes and improvements
  • Merged fixes and improvements from upstream (plus Llama-3.1-Minitron-4B-Width support)
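
The XTC bullet above can be illustrated with a small sketch. This is not KoboldCpp's implementation; it follows the sampler's general published idea as understood here: on a fraction of sampling steps (xtc_probability), every token at or above xtc_threshold is dropped except the least likely of them. The function name xtc_filter and the dictionary-based token distribution are hypothetical, chosen only for illustration.

```python
import random

def xtc_filter(probs, xtc_threshold=0.15, xtc_probability=0.5, rng=random):
    """Apply an XTC-style filter to a {token: probability} mapping.

    Illustrative sketch only, not KoboldCpp's actual code.
    """
    # The filter only fires on a fraction of sampling steps.
    if rng.random() >= xtc_probability:
        return dict(probs)

    # "Top choices" are the tokens at or above the threshold.
    top = [tok for tok, p in probs.items() if p >= xtc_threshold]
    if len(top) < 2:
        # With fewer than two top choices there is nothing to exclude.
        return dict(probs)

    # Keep only the least likely top choice, plus every token below the threshold.
    keep = min(top, key=lambda tok: probs[tok])
    kept = {tok: p for tok, p in probs.items()
            if tok == keep or p < xtc_threshold}

    # Renormalize so the remaining probabilities sum to 1.
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}
```

With xtc_probability=0 the function always returns the distribution unchanged, which matches the note that the sampler is only active once xtc_probability is raised above 0.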

To use, download and run koboldcpp.exe, which is a one-file PyInstaller build.
If you don't need CUDA, you can use koboldcpp_nocuda.exe, which is much smaller.
If you have an Nvidia GPU but use an old CPU and koboldcpp.exe does not work, try koboldcpp_oldcpu.exe.
If you have a newer Nvidia GPU, you can use the CUDA 12 version, koboldcpp_cu12.exe (much larger, slightly faster).
If you're using Linux, select the appropriate Linux binary file instead (not exe).
If you're on a modern Mac (M1, M2, M3), you can try the koboldcpp-mac-arm64 MacOS binary.
If you're using AMD, you can try koboldcpp_rocm from YellowRoseCx's fork.

Run it from the command line with the desired launch parameters (see --help), or manually select the model in the GUI.
Once the model has loaded, you can connect like this (or use the full KoboldAI client):
http://localhost:5001

For more information, be sure to run the program from command line with the --help flag.
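
Besides opening the address in a browser, a running instance can be queried from a script. The sketch below assumes the server is listening at http://localhost:5001 and exposes the KoboldAI-style /api/v1/generate endpoint, with xtc_threshold and xtc_probability accepted as request fields under the names used in these notes. The helper names build_request and generate are hypothetical, and the response shape is an assumption to check against your instance.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5001"  # default address given in these notes

def build_request(prompt, xtc_threshold=0.15, xtc_probability=0.5, max_length=80):
    """Build a POST request for the (assumed) /api/v1/generate endpoint."""
    payload = {
        "prompt": prompt,
        "max_length": max_length,
        # Sampler settings suggested as starting values in the notes.
        "xtc_threshold": xtc_threshold,
        "xtc_probability": xtc_probability,
    }
    return urllib.request.Request(
        BASE_URL + "/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, **kwargs), timeout=300) as resp:
        body = json.load(resp)
    # Assumed response shape: {"results": [{"text": "..."}]}
    return body["results"][0]["text"]

# Example, with a model loaded and the server running:
#   print(generate("Once upon a time"))
```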