Add support for compressed and shuffled datasets on local storage #118

Closed
markgoddard opened this issue Jul 6, 2023 · 1 comment · Fixed by #119

@markgoddard

markgoddard commented Jul 6, 2023

Currently, if a dataset is compressed or has any filters applied, PyActiveStorage raises NotImplementedError in storage.py.

netCDF/HDF5 supports many compression and filter algorithms, but this task covers supporting and testing only the Zlib compression algorithm and the HDF5 Shuffle filter. It is also limited to local storage; S3 will be addressed separately.

markgoddard added the enhancement (New feature or request) label Jul 6, 2023
markgoddard self-assigned this Jul 6, 2023
markgoddard added a commit that referenced this issue Jul 6, 2023
This change adds support for compressed and filtered data for local
storage. Data in S3 will be addressed separately.

The compression and filters arguments passed to reduce_chunk are
actually numcodecs.abc.Codec instances, so we can use them as a black
box to decode the compression or filter.

Currently we are testing the Zlib compression algorithm as well as the
HDF5 byte shuffle filter. It's possible that other compression algorithms
and filters will "just work" due to using the numcodecs.abc.Codec
interface to decode the data, but they have not been tested.

Closes: #118
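
The "black box" decoding described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PyActiveStorage implementation: `decode_chunk`, `ZlibStandIn` and `ShuffleStandIn` are hypothetical names, and the two stand-in classes use only the standard library to mimic the `encode`/`decode` methods that `numcodecs.abc.Codec` subclasses provide. It assumes the shuffle filter was applied before compression on write (so decoding decompresses first and then undoes the filters in reverse order) and that the chunk length is a multiple of the element size.

```python
import struct
import zlib


class ZlibStandIn:
    """Hypothetical stand-in for a Zlib codec exposing encode/decode."""

    def __init__(self, level=1):
        self.level = level

    def encode(self, buf):
        return zlib.compress(bytes(buf), self.level)

    def decode(self, buf):
        return zlib.decompress(bytes(buf))


class ShuffleStandIn:
    """Hypothetical stand-in for a byte shuffle filter exposing encode/decode."""

    def __init__(self, elementsize):
        self.elementsize = elementsize

    def encode(self, buf):
        buf = bytes(buf)
        # Byte 0 of every element, then byte 1 of every element, and so on.
        return b"".join(buf[i::self.elementsize] for i in range(self.elementsize))

    def decode(self, buf):
        buf = bytes(buf)
        count = len(buf) // self.elementsize
        # Element j is rebuilt from the j-th byte of each byte plane.
        return b"".join(buf[j::count] for j in range(count))


def decode_chunk(chunk, compression=None, filters=None):
    """Decode a stored chunk using only each codec's decode() method."""
    if compression is not None:
        chunk = compression.decode(chunk)
    # Assumption: filters are undone in the reverse of their write order.
    for codec in reversed(filters or []):
        chunk = codec.decode(chunk)
    return chunk


values = [1.0, 2.5, -3.25, 4.0]
raw = struct.pack("<4f", *values)  # four little-endian float32 values
shuffle, compressor = ShuffleStandIn(elementsize=4), ZlibStandIn()

# Write order assumed here: shuffle first, then compress.
stored = compressor.encode(shuffle.encode(raw))

decoded = decode_chunk(stored, compression=compressor, filters=[shuffle])
restored = list(struct.unpack("<4f", decoded))
```

Because `decode_chunk` only calls `decode()`, any object with that method could in principle be passed in, which is why other codecs might "just work"; as the commit message says, only Zlib and byte shuffle have been tested.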
@valeriupredoi
Collaborator

valeriupredoi commented Jul 20, 2023

This is indeed now done and tested (cheers very much @markgoddard); testing was done on real CMIP6 and obs4MIPs data (compressed, shuffle=False) too. Worth noting that there was some discussion about a possible follow-up in #119 (comment)
