
Releases: guinmoon/LLMFarm

v0.6.0

26 Sep 19:02

Changes

  • Added grammar sampling for llama models; you can place .gbnf files in the grammars directory
  • llama.cpp updated to b1256
  • rwkv updated to 8db73b1
  • gpt-2 updated
  • rwkv_eval_sequence: ~20% speed increase
  • Handle GGML_ASSERT failures
  • Fixed many errors
  • New LLaMA 2 and Saiga templates

** Due to a ScrollViewReader error, autoscroll is disabled on iOS <16.4
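As a sketch of the new grammar-sampling workflow, a minimal .gbnf file can be placed in the grammars directory like this (the file name and grammar content are illustrative examples, not taken from the release):

```shell
# Create the grammars directory and drop in a minimal GBNF grammar.
# This example grammar constrains model output to "yes" or "no".
mkdir -p grammars
cat > grammars/yes_no.gbnf <<'EOF'
root ::= "yes" | "no"
EOF
```

GBNF is the grammar format used by llama.cpp; any grammar valid there should work here.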

v0.5.2

11 Sep 16:28

Changes

  • Added mmap and mlock options
  • Added prompt format text editor with multiline support
  • Added tfs_z and typical_p sampling parameters
  • Fixed many errors on model loading.
  • Fixed scrolling issue
  • UI Improvements
  • Template improvements

** Due to a ScrollViewReader error, autoscroll is disabled on iOS <16.4

v0.5.0

02 Sep 21:07

Changes

  • llama.cpp updated to b1132: GGUF format support and a speed increase. The old file format is still supported but uses llama.cpp dadbed9.
  • Added Falcon model support (GGUF only)
  • Added a template for RWKV-4
  • Fixed model renaming
  • Fixed some UI bugs that could cause the app to crash
  • Fixed llama and replit token_to_str

** In order to use llama.cpp b1132, the model file must have the .gguf extension.
*** Unfortunately, due to a bug in recent versions of llama.cpp, Metal is not supported on Intel Macs at this time.

v0.4.5

26 Aug 16:02

Changes

  • llama.cpp updated to dadbed9: a noticeable increase in Metal speed on iOS. A 7B Q3_K model now works fine on an iPhone 12
  • Add models management
  • Added a template for running LLaMA 2 on iPhone
  • Fixed context size setting in templates

Now you can install LLMFarm on iOS devices with TestFlight

** llmfarm_core has been moved to a separate repository. To build LLMFarm, you need to clone this repository recursively:

git clone --recurse-submodules https://github.com/guinmoon/LLMFarm

v0.4.2

11 Aug 18:09

Changes

  • Mirostat and greedy sampling are now available for non-llama inferences

v0.4.0

07 Aug 08:51

Changes

  • Added RWKV inference support (currently only the 20B tokenizer). Tested on these models.

v0.3.2

26 Jul 11:24

Changes

  • llama.cpp updated to 84e09a7d8bc4ab6d658b5cd81295ac0add60be78
    Noticeable speed increase for 3B models on iOS with Metal
  • The QKK_64 build can be used to quantize 3B models with k-quants
    See more details here
  • Added a reverse prompt option to stop prediction
  • Added prediction time to messages

v0.3.0

18 Jul 15:15

Changes

  • Added StarCoder (SantaCoder) inference, tested on this model
  • Added model settings templates for quick setup of prompt format and model parameters
  • llama.cpp and ggml updated
  • UI Improvements

v0.2.2

04 Jul 17:03

Changes

  • Fixed Metal K_quants
  • UI Improvements

v0.2.1

02 Jul 17:38

Changes

  • Added Metal support for llama inference (not for K-quants); Metal must be enabled in the model settings.