Few-shot variant prediction: MSA sequence length and number of sequences #124
-
Hi, I'm trying to use the recently added few-shot variant prediction function. I was able to run the example code, but I have not been able to run it on my own sequence + MSA input. I keep getting a CUDA out-of-memory error: "CUDA out of memory. Tried to allocate 3.77 GiB (GPU 0; 11.17 GiB total capacity; 7.23 GiB already allocated; 3.17 GiB free; 7.58 GiB reserved in total by PyTorch)". My sequence length is 536 and my MSA contains 1024 sequences. I got it running when I shortened the sequence length from 536 to 263 (like the given example), but I would like to run my entire protein sequence through. Is there a simple method to determine the maximum sequence length I can use? Alternatively, if I split it into two separate runs, would there be information loss? Thanks!
Replies: 2 comments
-
I worked with the ESM-1b model, following esm/README.md#Usage. It worked for me when I added these lines of code after getting
Maybe it is enough to add only the first line.
-
I don't think there is an easy way to find the maximum MSA size other than trial and error; I expect memory use to be almost perfectly linear in MSA size.
Cutting your MSA size will unfortunately hurt performance.
If a bigger GPU with more memory is not feasible, you can run on CPU, which is slower but typically has more memory (RAM) available.
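To narrow down the trial and error, here is a minimal sketch that takes the "almost perfectly linear with MSA size" expectation at face value. It assumes you have measured peak memory yourself at two MSA depths for the same sequence length (for example with `torch.cuda.max_memory_allocated()`), then extrapolates a starting guess for the largest depth that fits. The function name `estimate_max_msa_depth` and its parameters are hypothetical and not part of the esm package, and the result is only a first guess to be confirmed by an actual run.

```python
def estimate_max_msa_depth(depth_a, mem_a_gib, depth_b, mem_b_gib, budget_gib):
    """Linearly extrapolate the largest MSA depth that fits in budget_gib.

    ASSUMPTION: peak memory grows linearly with the number of MSA sequences
    at a fixed sequence length. (depth_a, mem_a_gib) and (depth_b, mem_b_gib)
    are two peak-memory measurements you took yourself.
    """
    if depth_a == depth_b:
        raise ValueError("need measurements at two different MSA depths")

    # Memory cost of each additional sequence in the MSA, in GiB.
    slope = (mem_b_gib - mem_a_gib) / (depth_b - depth_a)
    if slope <= 0:
        raise ValueError("memory did not grow with depth; a linear fit is not meaningful")

    # Fixed cost that does not depend on MSA depth (model weights etc.).
    intercept = mem_a_gib - slope * depth_a
    if budget_gib <= intercept:
        return 0  # even an empty MSA would not fit under this model

    return int((budget_gib - intercept) // slope)


# Placeholder numbers for illustration only, not real measurements.
print(estimate_max_msa_depth(64, 2.0, 128, 3.0, budget_gib=9.0))  # prints 512
```

Leave some headroom below the card's total capacity when choosing `budget_gib`, since the error message above shows PyTorch reserving more memory than it has allocated.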