Clarify requirements #12

Open
vid opened this issue Mar 11, 2023 · 8 comments

Comments

vid commented Mar 11, 2023

Hi, I am ordering some RAM to work with LLaMA when I take a break in a few weeks. The README for this repo says "64 or better 128 Gb of RAM (192 or 256 would be perfect)". Is this alongside a CUDA card? I have a 3090. I can order up to 192GB of RAM if it makes a big difference. Will it?

Thanks!

@randaller (Owner)

Hi @vid!
The 30B model uses around 70 Gb of RAM. The 7B model fits into 18 Gb, and the 13B model uses 48 Gb. While the models are loading, they briefly need double these values (a swap file handles it well, then releases). I'm on 128 Gb and it's not quite enough to hold the 65B model, which uses about 140 Gb of RAM.

In total, 128 Gb of RAM should be fine, I assume. Moreover, you cannot install more than 128 Gb in a typical desktop; even the i9-13900K supports only 128 Gb max. Systems that allow more RAM immediately cost twice as much or more.

A CUDA card is not so important for this repo; it just runs LLaMA layer by layer, so maybe even a 1080 Ti could handle this.

If you have a 3090, it may be better to find other repos that would run faster with your card and don't require so much RAM.
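
For a rough sanity check of these figures, here is a back-of-the-envelope sketch in Python. It assumes fp16 weights at 2 bytes per parameter plus an assumed ~1.3x loader overhead; the exact numbers above depend on checkpoint precision and the loader, so treat this as an estimate, not the repo's own accounting.

```python
# Back-of-the-envelope RAM estimate (not taken from this repo): assumes the
# checkpoint is held as fp16 (2 bytes/parameter) with an assumed ~1.3x
# overhead for the tokenizer, buffers and runtime bookkeeping.
BYTES_PER_PARAM = 2
OVERHEAD = 1.3  # assumption; real overhead varies by loader and precision

def est_ram_gb(params_billion: float) -> float:
    """Estimated resident RAM in GB for a model of the given parameter count."""
    return params_billion * 1e9 * BYTES_PER_PARAM * OVERHEAD / 1024**3

for size in (7, 13, 30, 65):
    resident = est_ram_gb(size)
    # Loading briefly needs roughly double, as described above.
    print(f"{size}B: ~{resident:.0f} GB resident, ~{2 * resident:.0f} GB peak while loading")
```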

vid (Author) commented Mar 11, 2023

Thank you for that response!
Some companies, like Gigabyte, now support 48GB DDR5 modules on their LGA1700 boards, and Crucial currently has 192GB of DDR5-7000 for $700, so it becomes a lot more practical.
I don't mind spending the money if it makes it possible/easier/faster to explore in different directions. And of course, RAM on CUDA cards is very expensive.

@breadbrowser

What about 64 GB of ECC server RAM?

vid (Author) commented Mar 12, 2023

If you really need ECC RAM, you could buy a server- or workstation-class system, maybe used. DDR5 RAM already has on-die error checking. Normally ECC isn't that important; worst case, you can run a program twice to see whether you get the same result, though even that check is questionable with deep learning.

@tallesairan

How many gigabytes of VRAM would it take to run this model on a GPU? I'm considering buying an A100 with the company's resources 😈 :trollface:

randaller (Owner) commented Mar 14, 2023

@tallesairan 1 Gb on a GeForce 710 should be enough :trollface:, it will just be a little bit slow. This repo feeds the layers through one by one, and the largest layer is the tokenizer, which is 500 Mb :)
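
A minimal sketch of that layer-by-layer idea, assuming a list of PyTorch blocks held in CPU RAM; this is not the repo's actual code, and real LLaMA blocks take extra arguments (attention mask, rotary frequencies), but the memory pattern is the same: only one block's weights sit in VRAM at a time.

```python
import torch

def forward_layer_by_layer(layers, hidden, device="cuda"):
    """Run a stack of transformer blocks keeping only one block in VRAM at a time.

    `layers`: list of nn.Module blocks resident in CPU RAM (assumed structure).
    `hidden`: the activation tensor to push through the stack.
    """
    hidden = hidden.to(device)
    with torch.no_grad():
        for layer in layers:
            layer.to(device)          # copy this block's weights into VRAM
            hidden = layer(hidden)    # run just this block
            layer.to("cpu")           # evict the weights so the next block fits
            torch.cuda.empty_cache()  # release the cached VRAM
    return hidden.cpu()
```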

nopium commented Mar 15, 2023

Is it possible to trade off a lack of system RAM in favor of GPU RAM?
Meaning, if I have 32 Gb of RAM and 24 Gb of GPU memory, what model size can I run?

@randaller (Owner)

@nopium The HF version allows GPU offloading, but we still need a lot of RAM.
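
For reference, the usual way to do that with the HF version is accelerate's device_map, which splits the layers between the 24 Gb GPU and CPU RAM. A sketch under assumptions: the model path is a placeholder for a local HF-format LLaMA checkpoint, and the memory caps are illustrative; the combined GPU + CPU budget still has to hold the fp16 weights, which is why plenty of RAM is needed.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llama-13b-hf"  # placeholder: a local HF-format LLaMA checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",                        # let accelerate place layers on GPU/CPU
    max_memory={0: "22GiB", "cpu": "30GiB"},  # illustrative caps; leave some headroom
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```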
