Speed up alignment 2-5x(?) by using mm2-plus #33
Comments
Thanks for letting us know! We will look into it!
Did you use the source: https://github.com/lh3/minimap2?tab=readme-ov-file#map-long-mrnacdna-reads
Thank you for the suggestion, @jodjo86! We'll have a look at that. I should also say that, after looking closer at mm2-plus, I realize the speedup might not be as great, since a big part of it seems to come from utilizing multiple CPU cores, which EMU already does by running multiple minimap2 jobs in parallel. Any remaining speedup would then probably come from the SIMD optimizations, I guess. I'm still interested in trying that out but haven't gotten to it yet. Will report back if and when I do!
As far as I understand, EMU does not run multiple minimap2 jobs in parallel. A cautionary note: minimap2 (mm2) is a robust and heavily documented tool, so I would wait to see what happens with mm2-plus before implementing it. I'm just an EMU user, but I hope this helps.
Enhancement suggestion
It seems it might be possible to speed up the most resource-intensive part of EMU (the alignment step done by minimap2) by switching from minimap2 to this new drop-in replacement: https://github.com/at-cg/mm2-plus
They report speedups of around 2-5x depending on the dataset, from what I can see in the graphs. That figure includes spreading the workload across multiple CPU cores, though, which might not make such a big difference in EMU, since EMU can already leverage multiple CPU cores by running multiple minimap2 jobs in parallel, if I understand correctly(?)
There is a preprint about the tool here: https://www.biorxiv.org/content/10.1101/2024.11.25.625328v1
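To check whether the reported speedup carries over to an EMU-like workload, one option is to time both binaries on the same reads and database. The following is only a rough sketch, not how EMU invokes minimap2 internally: the helper names (`build_align_cmd`, `time_alignment`, `compare`) are made up for illustration, the `map-ont` preset and thread count are assumptions, and the path to the mm2-plus binary depends on how it was built locally. It relies only on the standard minimap2 options `-a`, `-x`, and `-t`, which a drop-in replacement should accept.

```python
import subprocess
import time


def build_align_cmd(binary, reference, reads, threads=4, preset="map-ont"):
    """Build a minimap2-style command line (SAM output, given preset and threads).

    `binary` is a path to either minimap2 or a drop-in replacement.
    The default preset is an assumption; use whatever your EMU run uses.
    """
    return [binary, "-ax", preset, "-t", str(threads), reference, reads]


def time_alignment(binary, reference, reads, threads=4, preset="map-ont"):
    """Run one alignment, discard the output, and return wall-clock seconds."""
    cmd = build_align_cmd(binary, reference, reads, threads, preset)
    start = time.perf_counter()
    subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,  # alignments are not needed for timing
        stderr=subprocess.DEVNULL,
        check=True,  # raise if the aligner exits with an error
    )
    return time.perf_counter() - start


def compare(binaries, reference, reads, threads=4):
    """Time each binary on the same input; `binaries` maps a label to a path."""
    return {
        label: time_alignment(path, reference, reads, threads)
        for label, path in binaries.items()
    }
```

A call such as `compare({"minimap2": "minimap2", "mm2-plus": "/path/to/mm2-plus/binary"}, "database.fasta", "sample.fastq", threads=8)` would return the elapsed seconds per binary (all paths here are placeholders). Running both with the same thread count separates the SIMD-related gain from the multi-core gain discussed above.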
Motivation
The resource requirements for EMU right now are somewhat demanding, which might hinder fast response times depending on the number of samples and the available compute power.
In our rough tests we have seen resource requirements in the ballpark of: