v1.3.0

@guinmoon guinmoon released this 24 Jun 15:52

Changes:

  • llama.cpp updated to b3190
  • Added support for DeepseekV2, GPTNeoX (Pythia and others)
  • Added support for Markdown formatting
  • Added support for using history in Shortcuts
  • Added Flash Attention support
  • Added NPredict option
  • Metal and CPU inference improvements
  • Sampling and eval improvements
  • Some fixes for phi-3 and MiniCPM
  • Fixed some errors
  • Added Qwen template