Add rapidgzip support? #164
Comments
This is rather interesting. Given the overhead of creating subprocesses, I do wonder whether this will be faster than the python-isal backend in most use cases. Since rapidgzip uses ISA-L, it shouldn't be that much slower. Maybe some of the algorithms could be ported to python-isal? Thanks for posting the link.
Oh yes, I forgot to mention, but there are some things to be aware of. I am a bioinformatician, so I can really only work on things that I can justify putting the time into. For bioinformatics workloads, decompression is usually not the bottleneck if it can be offloaded to another thread using ISA-L. This option is already available. If more speed is needed, most of our gzip files are actually bgzip files, which can be decompressed using multithreading in a much simpler way. I haven't got around to doing that because the priority is very low: I simply haven't run into cases where decompression speed is the bottleneck. For rapidgzip inclusion the priority would be even lower. If somebody else is willing to do it, however, I am willing to do the code review etc. It shouldn't be too hard. Pbzip2 is also integrated, and that integration can be taken as a template.
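For anyone wondering why bgzip files are simpler to decompress in parallel, here is a minimal sketch, not code from this project. A bgzip (BGZF) file is a series of small, independent gzip members, and each member stores its own total size (BSIZE) in a "BC" extra subfield of the gzip header, so the file can be split into blocks without decompressing anything. The function names `make_bgzf_block`, `split_bgzf_blocks` and `decompress_bgzf` are hypothetical, and the thread pool only helps to the extent that CPython's `zlib` releases the GIL while inflating, which I am assuming here rather than measuring.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_bgzf_block(payload: bytes) -> bytes:
    """Build one BGZF block (demo helper; the block must stay under 64 KiB)."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate stream
    deflated = compressor.compress(payload) + compressor.flush()
    block_size = 18 + len(deflated) + 8  # header + data + CRC32/ISIZE trailer
    header = struct.pack(
        "<BBBBIBBHBBHH",
        0x1F, 0x8B, 8, 4,  # gzip magic, CM=deflate, FLG=FEXTRA
        0, 0, 0xFF,        # MTIME, XFL, OS=unknown
        6,                 # XLEN: one 6-byte extra subfield
        ord("B"), ord("C"), 2, block_size - 1,  # "BC" subfield holding BSIZE
    )
    trailer = struct.pack("<II", zlib.crc32(payload), len(payload))
    return header + deflated + trailer


def split_bgzf_blocks(data: bytes) -> list:
    """Cut the data into complete gzip members using BSIZE from each header."""
    blocks = []
    pos = 0
    while pos < len(data):
        # 12-byte fixed gzip header; skip MTIME, XFL and OS (6 bytes).
        magic, method, flags, xlen = struct.unpack_from("<HBB6xH", data, pos)
        if magic != 0x8B1F or method != 8 or not flags & 4:
            raise ValueError("not a BGZF block at offset %d" % pos)
        sub = pos + 12
        extra_end = sub + xlen
        block_size = None
        # Walk the extra subfields until the "BC" one is found.
        while sub + 4 <= extra_end:
            si1, si2, slen = struct.unpack_from("<BBH", data, sub)
            if (si1, si2, slen) == (66, 67, 2):
                block_size = struct.unpack_from("<H", data, sub + 4)[0] + 1
                break
            sub += 4 + slen
        if block_size is None:
            raise ValueError("gzip member without BGZF size field")
        blocks.append(data[pos:pos + block_size])
        pos += block_size
    return blocks


def decompress_bgzf(data: bytes, threads: int = 4) -> bytes:
    """Inflate all blocks on a thread pool and join them in file order."""
    blocks = split_bgzf_blocks(data)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # wbits=31 makes zlib parse the gzip header and check the CRC32.
        parts = pool.map(lambda block: zlib.decompress(block, wbits=31), blocks)
        return b"".join(parts)
```

A real implementation would read the file in chunks instead of holding everything in memory, but the block splitting stays this simple. Plain gzip files have no such size field, which is why rapidgzip has to do much more work to find block boundaries.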
Thank you very much for the information. I will try using the bgzip format instead to speed things up.
https://github.com/mxmlnkn/rapidgzip