ceda-icompress python implementations #232
Comments
Hi @milankl Here are some timing results. Not quite as fast as the Julia version, but also not dreadful.
Results:
If I don't convert the exponent:
Thanks @nmassey001 !!! @observingClouds could you, at some point, compare this to xbitinfo performance?
Thanks @nmassey001 for posting these numbers and reaching out to us. I hope I find time soon to provide these numbers as well.
A belated update to this: I've finally got around to implementing the …
I believe this memory issue is something general we have to work on. Technically the algorithm should be almost allocation-free as mentioned above (only a small counter array has to be allocated), but Python seems to do something else. I don't know enough about Python to easily identify where it allocates and why, but this problem seems to become prohibitive for larger datasets. @nmassey001 could you measure your memory allocations too? Maybe this helps @observingClouds to understand why we also seem to copy the array, which we really shouldn't.
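As a starting point for that measurement, here is a minimal sketch using the standard-library `tracemalloc` module; `count_bitinformation` is a hypothetical placeholder for whichever ceda-icompress/xbitinfo routine is being profiled, not an actual API:

```python
# Sketch: report the allocations traced by Python's allocator around one call.
# `count_bitinformation` is a hypothetical placeholder for the function under test.
import tracemalloc

import numpy as np


def measure_peak_allocations(func, *args, **kwargs):
    """Run func and print current/peak memory seen by the Python allocator."""
    tracemalloc.start()
    result = func(*args, **kwargs)
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
    return result


# Allocate the test data before tracing so only the call itself is measured.
data = np.random.default_rng(0).random(100_000_000, dtype=np.float32)  # ~400 MB
# measure_peak_allocations(count_bitinformation, data)
```

NumPy registers its array buffers with `tracemalloc`, so large temporaries should show up here; for the process-level picture (resident set size), tools like `memory_profiler` or GNU `time -v` are an alternative.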
@nmassey001 just reached out to point towards ceda-icompress, another Python implementation of the bitinformation algorithm and bit rounding.
The package is xarray-free, and I'm curious about the differences in performance, as we've been suffering from allocations.
Bit rounding of a 400 MB Float32 array reaches 16 GB/s on my MacBook Air and is (afaik) memory-bound at that point (the maximum bandwidth of reading from RAM).
The bitinformation algorithm here is essentially allocation-free, as only the counter array has to be allocated while counting all 00, 01, 10, 11 bit-pair combinations in the data set. It reaches about 70 MB/s, which at that stage is on a similar order of magnitude as lossless codecs at higher compression levels.
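For readers unfamiliar with that counting step, here is a rough NumPy sketch (my own illustration, not the xbitinfo or ceda-icompress code) of tallying the 00/01/10/11 combinations of one bit position between adjacent elements. Only the tiny 2x2 counter is strictly needed, but every intermediate expression below allocates a full-size temporary, which is exactly the kind of hidden allocation discussed in this thread:

```python
# Sketch: count 00/01/10/11 combinations of one bit position between adjacent
# Float32 values. Only the 2x2 counter is strictly required, but `bits` and
# `idx` are full-size NumPy temporaries.
import numpy as np


def bitpair_counts(a: np.ndarray, bit: int) -> np.ndarray:
    """Return a 2x2 matrix C where C[i, j] counts adjacent pairs with the
    given bit equal to i in one element and j in the next."""
    bits = (a.view(np.uint32) >> np.uint32(bit)) & np.uint32(1)
    idx = (bits[:-1] * 2 + bits[1:]).astype(np.int64)   # 0..3 encodes 00,01,10,11
    return np.bincount(idx, minlength=4).reshape(2, 2)  # the small counter array


data = np.random.default_rng(0).random(1_000_000, dtype=np.float32)
print(bitpair_counts(data, bit=22))  # most significant mantissa bit of Float32
```

Roughly speaking, the real algorithm repeats this for every bit position and turns the joint counts into the mutual information of each bit with its neighbour; an allocation-free version would stream through the array once instead of materialising `bits` and `idx`.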
Neil, would you mind throwing in a similar quick benchmark of ceda-icompress?
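To make the request concrete, something along these lines would already be comparable (a sketch only; `process_array` stands in for the ceda-icompress rounding or bitinformation call, whose actual API I haven't checked here):

```python
# Sketch: time a routine on a ~400 MB Float32 array and report throughput.
# `process_array` is a hypothetical placeholder for the function under test.
import time

import numpy as np

data = np.random.default_rng(0).random(100_000_000, dtype=np.float32)  # 4 B * 1e8 ≈ 400 MB


def throughput_gbs(func, a: np.ndarray, repeats: int = 3) -> float:
    """Best observed throughput in GB/s over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        func(a)
        best = min(best, time.perf_counter() - t0)
    return a.nbytes / best / 1e9


# print(f"{throughput_gbs(process_array, data):.2f} GB/s")
```

For reference, 400 MB at 16 GB/s is about 25 ms per pass, while 70 MB/s corresponds to roughly 6 seconds, so even coarse timings show which regime an implementation is in.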