Add rapidgzip support? #164

Open
y9c opened this issue Nov 18, 2024 · 3 comments

Comments

@y9c

y9c commented Nov 18, 2024

https://github.com/mxmlnkn/rapidgzip

@rhpvorderman
Collaborator

This is rather interesting. Due to the overhead of creating subprocesses, I do wonder whether this will be faster than the python-isal backend in most use cases. Since rapidgzip uses ISA-L it shouldn't be that much slower. Maybe some of the algorithms could be ported to python-isal? Thanks for posting the link.

@rhpvorderman
Collaborator

rhpvorderman commented Nov 18, 2024

Oh yes, I forgot to mention that there are some things to be aware of. I am a bioinformatician, so I can really only work on things that I can justify putting time into. For bioinformatics workloads, decompression is usually not the bottleneck if it can be offloaded to another thread using ISA-L. This option is already available. If more speed is needed, most of our gzip files are actually bgzip files, which can be decompressed using multithreading in a much simpler way. The reason I haven't gotten around to doing that is that the priority is very low: I simply haven't run into cases where decompression speed is the bottleneck. For rapidgzip inclusion the priority would be even lower.
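
To illustrate why bgzip files are simpler to decompress in parallel: a bgzip (BGZF) file is a series of small, independent gzip members, and each member records its own total size in a `BC` extra subfield, so block boundaries can be found without inflating anything. The sketch below is not code from this project. It uses only the Python standard library, the helper names (`make_block`, `split_blocks`, `inflate_block`, `decompress_parallel`) are hypothetical, and it assumes the `BC` subfield is the only extra subfield in each header, which the format does not strictly guarantee.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

# 18-byte BGZF header: the fixed gzip header (ID1, ID2, CM, FLG, MTIME, XFL,
# OS, XLEN) followed by the "BC" extra subfield (SI1, SI2, SLEN, BSIZE).
HEADER = struct.Struct("<4BI2BH2BHH")
TRAILER = struct.Struct("<II")  # CRC32 and uncompressed size


def make_block(payload: bytes) -> bytes:
    """Build one BGZF-style block (hypothetical helper, used for the demo)."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate
    cdata = compressor.compress(payload) + compressor.flush()
    bsize = HEADER.size + len(cdata) + TRAILER.size - 1  # total size minus 1
    header = HEADER.pack(31, 139, 8, 4, 0, 0, 255, 6, 66, 67, 2, bsize)
    return header + cdata + TRAILER.pack(zlib.crc32(payload), len(payload))


def split_blocks(data: bytes) -> list[bytes]:
    """Cut the file into blocks by reading BSIZE from each header.

    Assumption: the BC subfield is the first and only extra subfield.
    """
    blocks = []
    offset = 0
    while offset < len(data):
        bsize = HEADER.unpack_from(data, offset)[-1]
        blocks.append(data[offset:offset + bsize + 1])
        offset += bsize + 1
    return blocks


def inflate_block(block: bytes) -> bytes:
    """Inflate a single block and check it against its trailer."""
    payload = zlib.decompress(block[HEADER.size:-TRAILER.size], wbits=-15)
    crc, isize = TRAILER.unpack(block[-TRAILER.size:])
    if zlib.crc32(payload) != crc or len(payload) != isize:
        raise ValueError("corrupt block")
    return payload


def decompress_parallel(data: bytes, threads: int = 4) -> bytes:
    """Inflate all blocks on a thread pool; map() preserves block order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return b"".join(pool.map(inflate_block, split_blocks(data)))


if __name__ == "__main__":
    original = b"ACGTTGCA" * 20000
    chunks = [original[i:i + 16384] for i in range(0, len(original), 16384)]
    bgzf_like = b"".join(make_block(chunk) for chunk in chunks)
    print(decompress_parallel(bgzf_like) == original)
```

Because every block is an ordinary gzip member, the same bytes can still be read sequentially by any gzip reader. A plain gzip file has no such per-block size field, which is why finding safe split points in it is the harder problem.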

If somebody else is willing to do it, however, I am willing to do the code review etc. It shouldn't be too hard. Pbzip2 is already integrated and can be used as a template.

@y9c
Author

y9c commented Nov 18, 2024

Thank you very much for the information. I will try using the bgzip format instead to speed things up.
