-
Notifications
You must be signed in to change notification settings - Fork 40
Zarr reader #271
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zarr reader #271
Conversation
Bit of an update. With the help from @sharkinsspatial @abarciauskas-bgse and @maxrjones I got a Zarr loaded as a virtual dataset. <xarray.Dataset> Size: 3kB
Dimensions: (time: 10, lat: 9, lon: 18)
Coordinates:
lat (lat) float32 36B ManifestArray<shape=(9,), dtype=float32, chunk...
lon (lon) float32 72B ManifestArray<shape=(18,), dtype=float32, chun...
* time (time) datetime64[ns] 80B 2013-01-01 ... 2013-01-03T06:00:00
Data variables:
air (time, lat, lon) int16 3kB ManifestArray<shape=(10, 9, 18), dtyp...
Attributes:
Conventions: COARDS
title: 4x daily NMC reanalysis (1948)
description: Data is from NMC initialized reanalysis\n(4x/day). These a...
platform: Model
references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly... Next up is how to deal with When I try to write it to Kerchunk JSON, I’m running into some
Where Wondering how There seems to be some conversion logic in @sharkinsspatial's HDF reader for converting fill_values . It also looks like there is some fill_value handling in zarr.py. |
Amazing!
This seems like an issue that should actually be orthogonal to this PR (if it weren't for the ever-present difficulty of testing). Either the problem is in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a a great start! I think the main thing here is that we don't actually need kerchunk in order to test this reader.
I think we should just do this in this PR. We can point to Tom's PR for now in the CI, but I expect that will get merged before this does anyway. If you look at Tom's implementation it's basically what we're doing here. |
Co-authored-by: Tom Nicholas <[email protected]>
|
for more information, see https://pre-commit.ci
* Use ManifestStore in Zarr reader * Update virtualizarr/readers/zarr.py Co-authored-by: Raphael Hagen <[email protected]> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Raphael Hagen <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience @norlandrhagen , and for your help @maxrjones .
I believe I spotted one small bug, but otherwise this looks great.
The implementation is very neat now!
@@ -77,7 +77,7 @@ vds.virtualize.to_icechunk(icechunkstore) | |||
|
|||
### I already have some data in Zarr, do I have to resave it? | |||
|
|||
No! VirtualiZarr can (well, [soon will be able to](https://github.com/zarr-developers/VirtualiZarr/issues/262)) create virtual references pointing to existing Zarr stores in the same way as for other file formats. | |||
No! VirtualiZarr can create virtual references pointing to existing Zarr stores in the same way as for other file formats. Note: Currently only reading Zarr V3 is supported. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an issue to track learning to read zarr v2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Co-authored-by: Tom Nicholas <[email protected]>
Co-authored-by: Tom Nicholas <[email protected]>
Co-authored-by: Tom Nicholas <[email protected]>
Co-authored-by: Tom Nicholas <[email protected]>
for more information, see https://pre-commit.ci
WIP PR to add a Zarr reader. Thanks to @TomNicholas for the how to write a reader guide.
docs/releases.rst
Future PR(s):
To Do:
For each virtual variable:
Replace VirtualiZarr.ZArray with zarr ArrayMetadata #175)
Get the loadable_variables by just using xr.open_zarr on the same store (should use drop_variables to avoid handling the virtual variables that we already have).