Clarify requirements #12

Open
vid opened this issue Mar 11, 2023 · 8 comments

Comments

vid commented Mar 11, 2023

Hi, I am ordering some RAM to work with LLaMA when I take a break in a few weeks. The README for this repo says "64 or better 128 Gb of RAM (192 or 256 would be perfect)". Is this alongside a CUDA card? I have a 3090. I can order up to 192GB of RAM if it makes a big difference. Will it?

Thanks!

@randaller (Owner)

Hi @vid!
The 30B model uses around 70 Gb of RAM. The 7B model fits into 18 Gb, and the 13B model uses 48 Gb. While the models are loading, they briefly need double these values (a swap file handles it well, then releases). I'm on 128 Gb and it's not quite enough to hold the 65B model, which uses about 140 Gb of RAM.

In total, 128 Gb of RAM should be fine, I assume. Moreover, you cannot install more than 128 Gb in a typical desktop; even the i9-13900K supports only 128 Gb max. Systems that allow more RAM immediately cost twice as much or more.

A CUDA card is not so important for this repo; it just runs LLaMA layer by layer, so maybe even a 1080 Ti could handle this.

If you have a 3090, it may be better to find other repos that would run faster with your card and don't require so much RAM.
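
For a rough sanity check of these figures, here is a back-of-the-envelope sketch in Python. It assumes fp16 weights at 2 bytes per parameter plus an assumed ~1.3x loader overhead; the exact numbers above depend on checkpoint precision and the loader, so treat this as an estimate, not the repo's own accounting.

```python
# Back-of-the-envelope RAM estimate (not taken from this repo): assumes the
# checkpoint is held as fp16 (2 bytes/parameter) with an assumed ~1.3x
# overhead for the tokenizer, buffers and runtime bookkeeping.
BYTES_PER_PARAM = 2
OVERHEAD = 1.3  # assumption; real overhead varies by loader and precision

def est_ram_gb(params_billion: float) -> float:
    """Estimated resident RAM in GB for a model of the given parameter count."""
    return params_billion * 1e9 * BYTES_PER_PARAM * OVERHEAD / 1024**3

for size in (7, 13, 30, 65):
    resident = est_ram_gb(size)
    # Loading briefly needs roughly double, as described above.
    print(f"{size}B: ~{resident:.0f} GB resident, ~{2 * resident:.0f} GB peak while loading")
```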

vid (Author) commented Mar 11, 2023

Thank you for that response!
Some companies, like Gigabyte, now support 48GB DDR5 modules on their LGA1700 boards, and Crucial currently has 192GB of DDR5-7000 for $700, so it becomes a lot more practical.
I don't mind spending the money if it makes it possible/easier/faster to explore in different directions. And of course, RAM on CUDA cards is very expensive.

@breadbrowser

What about 64 GB of ECC server RAM?

vid (Author) commented Mar 12, 2023

If you really need ECC RAM, you could buy a server- or workstation-class system, maybe used. DDR5 RAM already has on-die error checking. Normally ECC isn't that important; worst case, you can run a program twice to see whether you get the same result, though even that check is questionable with deep learning.

@tallesairan

How many gigabytes of VRAM would it take to run this model on a GPU? I'm considering buying an A100 with the company's resources 😈 :trollface:

randaller (Owner) commented Mar 14, 2023

@tallesairan 1 Gb on a GeForce 710 should be enough :trollface:, it will just be a little bit slow. This repo feeds the layers through one by one, and the largest layer is the tokenizer, which is 500 Mb :)
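
A minimal sketch of that layer-by-layer idea, assuming a list of PyTorch blocks held in CPU RAM; this is not the repo's actual code, and real LLaMA blocks take extra arguments (attention mask, rotary frequencies), but the memory pattern is the same: only one block's weights sit in VRAM at a time.

```python
import torch

def forward_layer_by_layer(layers, hidden, device="cuda"):
    """Run a stack of transformer blocks keeping only one block in VRAM at a time.

    `layers`: list of nn.Module blocks resident in CPU RAM (assumed structure).
    `hidden`: the activation tensor to push through the stack.
    """
    hidden = hidden.to(device)
    with torch.no_grad():
        for layer in layers:
            layer.to(device)          # copy this block's weights into VRAM
            hidden = layer(hidden)    # run just this block
            layer.to("cpu")           # evict the weights so the next block fits
            torch.cuda.empty_cache()  # release the cached VRAM
    return hidden.cpu()
```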

nopium commented Mar 15, 2023

Is it possible to trade off a lack of system RAM in favor of GPU RAM?
Meaning, if I have 32 Gb of RAM and 24 Gb of GPU memory, what model size can I run?

@randaller (Owner)

@nopium The HF version allows GPU offloading, but we still need a lot of RAM.
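
For reference, the usual way to do that with the HF version is accelerate's device_map, which splits the layers between the 24 Gb GPU and CPU RAM. A sketch under assumptions: the model path is a placeholder for a local HF-format LLaMA checkpoint, and the memory caps are illustrative; the combined GPU + CPU budget still has to hold the fp16 weights, which is why plenty of RAM is needed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llama-13b-hf"  # placeholder: a local HF-format LLaMA checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",                        # let accelerate place layers on GPU/CPU
    max_memory={0: "22GiB", "cpu": "30GiB"},  # illustrative caps; leave some headroom
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```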
