GroupReadsByUmi long runtime #944
Comments
It may be that you have extremely high coverage of each template and/or genomic coordinate. Can you check whether you have provided enough memory by looking at the memory usage of the process?
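One rough way to check for this kind of pileup is to count reads and distinct UMIs per mapped start position. The sketch below is only an illustration and not part of fgbio: the function name `position_pileup` is hypothetical, it assumes SAM-formatted text lines with the UMI stored in the `RX` tag, and it uses the leftmost aligned `POS` as a simple proxy for whatever coordinates the tool actually groups on.

```python
from collections import Counter, defaultdict


def position_pileup(sam_lines, top=10):
    """Return the `top` most crowded (contig, POS) sites from SAM text lines.

    Each result is (contig, pos, read_count, distinct_umi_count).
    Assumption: UMIs are stored in the RX:Z tag; reads without it count as 0 UMIs.
    """
    reads = Counter()
    umis = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        key = (fields[2], int(fields[3]))
        reads[key] += 1
        # Optional tags start at column 12.
        for tag in fields[11:]:
            if tag.startswith("RX:Z:"):
                umis[key].add(tag[5:])
    return [
        (contig, pos, count, len(umis[(contig, pos)]))
        for (contig, pos), count in reads.most_common(top)
    ]
```

To use it, feed it the text output of `samtools view` for the slow sample, for example by piping that output into a small script that calls `position_pileup(sys.stdin)`. A handful of positions holding a very large share of the reads would be consistent with the high-coverage explanation.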
I have tried using a large amount of memory (up to 100 GB). Would adding multithreading to this step be an option for future development, similar to what is available in the CallMolecularConsensus step?
It definitely looks like you have high coverage in that region, which makes it tough. I don't know your UMI length(s), but you may also have very high per-molecule coverage. It's not too much code, so I think both porting this to Rust (like we have for other tools) and incorporating other advances made since we originally wrote the tool could dramatically speed things up and perhaps reduce memory usage. We would be glad for folks to sponsor that work.
I understand. Thank you for the reply.
Running this command with v2.1.0:
Why did the first 2M reads take ~8 hours to group?
I have several samples all around 100M reads. Some process quickly as expected and others hang as this one does. I have no idea why this is happening.