Open PID with Xarray #139
Comments
Could you please explain what a PID is, and how you map it to the actual asset/file?
A PID is a persistent identifier, a long-lasting reference to a digital object (https://en.wikipedia.org/wiki/Persistent_identifier). PIDs can be resolved via a handle server. In the example above, the request would look like this: The JSON response has an entry with the file location:
This file can be downloaded or opened directly with xarray. An example of this workflow can be found in the following notebook: https://gitlab.dkrz.de/data-infrastructure-services/fdo/-/blob/master/automated_data_access_improved.ipynb?ref_type=heads PIDs for this kind of climate data (CMIP6, https://en.wikipedia.org/wiki/Coupled_Model_Intercomparison_Project) are standardized and always use the same keywords. My question is: would it be possible to implement a function that allows xarray to open a file by simply passing its PID?
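The resolve step described in this comment can be sketched roughly as follows. This is only an illustration, not the code from the linked notebook: it assumes the public Handle.Net proxy REST endpoint (`https://hdl.handle.net/api/handles/<handle>`) and its usual JSON layout (a `values` list whose items carry `type` and `data.value`). The helper names `handle_api_url`, `handle_values`, and `fetch_record` are hypothetical, and which key actually holds the file location depends on the handle profile.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the public Handle.Net proxy; a project may run its own server.
HANDLE_API = "https://hdl.handle.net/api/handles/"


def handle_api_url(pid: str) -> str:
    """Turn a PID such as 'hdl:21.14100/...' into a REST API URL."""
    handle = pid[4:] if pid.lower().startswith("hdl:") else pid
    return HANDLE_API + quote(handle, safe="/")


def handle_values(record: dict) -> dict:
    """Map each entry's type to its data value in a handle record."""
    return {v["type"]: v["data"]["value"] for v in record.get("values", [])}


def fetch_record(pid: str) -> dict:
    """Fetch the JSON handle record over HTTP (needs network access)."""
    with urllib.request.urlopen(handle_api_url(pid)) as resp:
        return json.load(resp)
```

With network access, `handle_values(fetch_record(pid))` would give a plain dict of the record's keywords. The entry holding the file location could then be passed to `xarray.open_dataset`, provided it is a URL form that xarray can read.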
Interesting! I see that the HDL server also knows about the "dataset" that this is part of (which links, in turn, to a DOI).
Certainly. It would be easy to add to intake-xarray, but I would like to add it to Intake Take2, as this process, "transform URL of known form to other URL of known type", is just the kind of thing it's designed for. Question:
Was this closed in error? With scratch code in Intake 2, I have
Yes, it has been closed in error.
Perfect, this looks like the workflow I imagined.
That is actually a good question. The example provided was a single file. However, we also have dataset PIDs, e.g.
Using the HAS_PARTS value?
What exactly do you mean?
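One possible reading of the HAS_PARTS suggestion, as a rough sketch: a dataset-level record lists the PIDs of its member files under `HAS_PARTS`, so a dataset PID could be expanded into file PIDs and each of those resolved in turn. The helper name `child_pids` is hypothetical, and the semicolon separator is an assumption about how the value is encoded, not something confirmed in this thread.

```python
def child_pids(values: dict, key: str = "HAS_PARTS", sep: str = ";") -> list[str]:
    """Split the HAS_PARTS entry of a handle record into member PIDs.

    `values` is a dict mapping handle value types to their string values.
    The separator is an assumption and may differ between handle profiles.
    """
    raw = values.get(key, "")
    # Drop empty fragments so a trailing separator does not yield "" entries.
    return [part.strip() for part in raw.split(sep) if part.strip()]
```

A record without a `HAS_PARTS` entry (for example a single-file PID) yields an empty list, so the caller can fall back to treating the PID as a file.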
OK, this class implements it for V2, although some questions remain. It could also be included in this repo for V1.
Thanks a lot for implementing it. :-)
There are some comments in the code. It's a little awkward to return data instances, which you then have to do something with; so maybe it would be better to return Xarray readers or even the final xarray instances.
That is a valid point. Anyway, it is a start!
Is it possible to implement a feature that enables intake-xarray to open a file based on its PID?
For example:
In this example
hdl:21.14100/02c6b729-fff6-4f31-a8da-2cf590b544df
is a PID handle of a CMIP6 precipitation data set.