Connection to xarray-assets STAC extension pilot? #16

Open
weiji14 opened this issue Mar 20, 2023 · 3 comments

Comments

@weiji14
Contributor

weiji14 commented Mar 20, 2023

Hi there, I've started using xpystac recently after the v0.0.1 release and really like the simple xr.open_dataset(stac_asset, engine="stac") interface that's just a single entrypoint to any STAC Asset!

Question though on what are the next steps for this? I've been reading the discussion at stac-utils/pystac#846 (comment), and also noticed https://github.com/stac-extensions/xarray-assets (cc @TomAugspurger). At the moment, xpystac=0.0.1 seems to only handle the engine parameter in xr.open_dataset:

https://github.com/jsignell/xpystac/blob/051d0ac15b42a60dfc330b6cacf5e8653d4edbd9/xpystac/core.py#L55-L63

Is the idea that other xr.open_dataset parameters like chunks, decode_cf, backend_kwargs, etc would need to be passed in by the user, or read automatically from the xarray-assets STAC extension (xref stac-extensions/xarray-assets#3) if available? This is starting to sound very similar to https://github.com/intake/intake-xarray in some ways, but I'd like to hear more thoughts on how things should 'ideally' work.

@jsignell
Member

Wow! You are the first user!

> Is the idea that other xr.open_dataset parameters like chunks, decode_cf, backend_kwargs, etc would need to be passed in by the user, or read automatically from the xarray-assets STAC extension (xref stac-extensions/xarray-assets#3) if available?

Yes that is exactly right. There is already a little of this handling for "xarray:open_kwargs" and "xarray:storage_options", but as xarray-assets expands I would expect the list to keep growing. If the user provides any kwargs to open_dataset directly, then those win over any that are defined in the STAC entry.
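The precedence described above can be sketched in plain Python. This is a minimal illustration, not xpystac's actual code: `resolve_open_kwargs` and the sample `extra_fields` dict are hypothetical, and forwarding the storage options as a `storage_options` keyword is an assumption. Only the field names "xarray:open_kwargs" and "xarray:storage_options" come from this thread.

```python
def resolve_open_kwargs(extra_fields, **user_kwargs):
    """Combine kwargs stored on a STAC asset with kwargs passed by the user.

    `extra_fields` stands in for the asset's extra fields (a plain dict here).
    User-provided values win over values defined in the STAC entry.
    """
    stac_kwargs = dict(extra_fields.get("xarray:open_kwargs", {}))
    storage_options = extra_fields.get("xarray:storage_options")
    if storage_options is not None:
        # Assumption: storage options are forwarded as a regular keyword.
        stac_kwargs.setdefault("storage_options", storage_options)
    # Later entries override earlier ones, so user kwargs take priority.
    return {**stac_kwargs, **user_kwargs}


# Hypothetical asset metadata, for illustration only.
extra_fields = {
    "xarray:open_kwargs": {"chunks": {}, "decode_cf": True},
    "xarray:storage_options": {"anon": True},
}
resolved = resolve_open_kwargs(extra_fields, decode_cf=False)
print(resolved)
```

Here `decode_cf=False` from the caller replaces the `decode_cf=True` stored in the STAC entry, while `chunks` and the storage options pass through unchanged.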

> This is starting to sound very similar to https://github.com/intake/intake-xarray in some ways, but I'd like to hear more thoughts on how things should 'ideally' work.

Yes! This is heavily inspired by intake. I was originally going to work on intake-stac, but instead I was directed to the original issue where Tom wrote up this idea. My long-term goal is to have intake-stac depend on this library so that the effort of parsing STAC isn't duplicated. I'm not 100% sure how it relates to intake-xarray but yeah it is a very similar concept just coming at it from a slightly different angle.

@weiji14
Contributor Author

weiji14 commented Mar 21, 2023

> > Is the idea that other xr.open_dataset parameters like chunks, decode_cf, backend_kwargs, etc would need to be passed in by the user, or read automatically from the xarray-assets STAC extension (xref stac-extensions/xarray-assets#3) if available?
>
> Yes that is exactly right. There is already a little of this handling for "xarray:open_kwargs" and "xarray:storage_options" but as xarray-assets expands I would expect to keep growing the list. If the user provides any kwargs to open_dataset directly then those would win over any that are defined in the STAC entry.

Cool, that sounds about right - having user-provided kwargs override what is defined in the xarray-assets STAC extension entry.

> > This is starting to sound very similar to https://github.com/intake/intake-xarray in some ways, but I'd like to hear more thoughts on how things should 'ideally' work.
>
> Yes! This is heavily inspired by intake. I was originally going to work on intake-stac, but instead I was directed to the original issue where Tom wrote up this idea. My long-term goal is to have intake-stac depend on this library so that the effort of parsing STAC isn't duplicated. I'm not 100% sure how it relates to intake-xarray but yeah it is a very similar concept just coming at it from a slightly different angle.

Nice! My take on it is that xpystac would provide the function-then-data interface xr.open_dataset(stac_asset, ...), whereas intake-stac would provide the data-then-function interface stac_asset.to_xarray_dataset(), xref stac-utils/pystac#846 (comment)?

@jsignell
Member

> My take on it is that xpystac would provide the function-then-data interface xr.open_dataset(stac_asset, ...), whereas intake-stac would provide the data-then-function interface stac_asset.to_xarray_dataset(), xref stac-utils/pystac#846 (comment)?

Yes! And I haven't really documented this, but to_xarray is exposed as part of this library so ideally intake-stac can use that in its implementation.
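The relationship between the two interfaces can be sketched as follows. Everything here is hypothetical and only illustrates the delegation idea: `FakeAsset`, `CatalogEntry`, and the signature given to `to_xarray` are assumptions, not the real xpystac or intake-stac APIs, and the opener returns a plain dict instead of an xarray object so the sketch runs without any dependencies.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAsset:
    """Hypothetical stand-in for a STAC asset."""
    href: str
    extra_fields: dict = field(default_factory=dict)


def to_xarray(asset, **kwargs):
    """Function-then-data: the opener takes the asset as an argument.

    A real implementation would return an xarray object; this sketch
    returns a description of the call instead.
    """
    return {"href": asset.href, "kwargs": kwargs}


class CatalogEntry:
    """Data-then-function: the entry wraps an asset and exposes a method."""

    def __init__(self, asset):
        self.asset = asset

    def to_xarray_dataset(self, **kwargs):
        # Delegate so the STAC parsing logic lives in one place.
        return to_xarray(self.asset, **kwargs)


asset = FakeAsset(href="https://example.com/data.zarr")
via_function = to_xarray(asset, chunks={})
via_method = CatalogEntry(asset).to_xarray_dataset(chunks={})
print(via_function == via_method)
```

Because the method only forwards to the function, both entry points behave identically and the parsing effort is not duplicated.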

jsignell pushed a commit that referenced this issue Mar 29, 2023
Update the kwargs merging logic to have cascading priority where
`default_kwargs` is overridden by `open_kwargs` which is overridden
by user provided `kwargs`.
Includes a regression unit test that extends the existing simple_zarr
test to ensure that the fix for duplicate keys works.

Note that the `xarray:open_kwargs` field is a part of the
[`xarray-assets`](https://github.com/stac-extensions/xarray-assets/tree/v1.0.0)
STAC extension (in Proposal stage), xref #16

Fixes #17
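
The cascading priority described in the commit message can be sketched with plain dict merging. This is an illustration under stated assumptions rather than the code from the commit: `merge_kwargs`, `fake_open`, and the sample values are hypothetical, and whether the original duplicate-key problem took exactly this form is also an assumption. What is certain is that unpacking two dicts that share a key into a single Python call raises `TypeError`, whereas merging them first does not.

```python
def merge_kwargs(default_kwargs, open_kwargs, user_kwargs):
    """Merge three layers of keyword arguments, lowest priority first."""
    return {**default_kwargs, **open_kwargs, **user_kwargs}


def fake_open(href, **kwargs):
    """Hypothetical opener that just echoes the kwargs it received."""
    return kwargs


# Sample values chosen for illustration only.
default_kwargs = {"engine": "zarr", "consolidated": True}
open_kwargs = {"consolidated": False, "chunks": {}}
user_kwargs = {"chunks": {"time": 10}}

# Unpacking two dicts that share a key into one call raises TypeError.
try:
    fake_open("data.zarr", **default_kwargs, **open_kwargs)
    duplicate_error = None
except TypeError as err:
    duplicate_error = err

# Merging first resolves duplicates according to the cascade.
merged = merge_kwargs(default_kwargs, open_kwargs, user_kwargs)
result = fake_open("data.zarr", **merged)
print(result)
```

In the merged result, `consolidated` comes from `open_kwargs` and `chunks` comes from the user, matching the order default, then STAC entry, then user.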