GitHub - DifferentialityDevelopment/Anima: Moved to here: https://github.com/lyogavin/airllm

AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card without quantization, distillation, or pruning. It can now also run the 405B Llama 3.1 model on 8GB of VRAM.
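To give a sense of scale for the 4GB claim, the rough arithmetic below compares holding all weights of a 70B model in memory with holding one transformer layer at a time. This is a back-of-the-envelope sketch, not AirLLM's code: it assumes fp16 weights (2 bytes per parameter), an 80-layer model (the layer count of Llama 2 70B), and that weights are processed layer by layer. The function names are hypothetical, and the estimate ignores embeddings, activations, and the KV cache.

```python
# Rough memory estimate; every figure here is an assumption, not a measurement.
BYTES_PER_PARAM_FP16 = 2  # assumed: unquantized half-precision weights
GIB = 2**30


def full_model_gib(num_params: float) -> float:
    """Memory needed to hold every weight at once, in GiB."""
    return num_params * BYTES_PER_PARAM_FP16 / GIB


def per_layer_gib(num_params: float, num_layers: int) -> float:
    """Approximate memory for a single layer, assuming weights are
    spread evenly across layers (embeddings and activations ignored)."""
    return full_model_gib(num_params) / num_layers


if __name__ == "__main__":
    params = 70e9  # 70B parameters
    layers = 80    # assumed layer count (Llama 2 70B)
    print(f"whole model: {full_model_gib(params):.1f} GiB")
    print(f"one layer:   {per_layer_gib(params, layers):.2f} GiB")
```

Under these assumptions the whole model needs well over 100 GiB, while a single layer stays under 4 GiB, which is consistent with the single-4GB-GPU figure quoted above.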

This repository has moved to https://github.com/lyogavin/airllm

Latest commit: lyogavin, "Update README.md", bc7123f, Aug 1, 2024 (5 commits)

Name	Last commit message	Last commit date
air_llm	Update README.md	Aug 1, 2024
README.md	Update README.md	Aug 1, 2024
