Commit
Merge pull request #15 from xcube-dev/konstntokas-013-change_data_id_url_part
Change data IDs and make the STAC data store searchable
Showing 29 changed files with 76,086 additions and 5,896 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,339 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# Non-seachable STAC catalog\n", | ||
"\n", | ||
"This notebook shows an example how to access items in a non-searchable STAC catalog, which does not implement the [STAC API - Item Search](https://github.com/radiantearth/stac-api-spec/tree/release/v1.0.0/item-search) conformance class. When searching in such type of catalog, the catalog needs to be crawled through and the items properties needs to be matched to the search parameters. This process will there be slow, especially for large catalogs.\n", | ||
"\n", | ||
"### Setup\n", | ||
"In order to run this notebook you need to install [`xcube`](https://xcube.readthedocs.io/en/latest/) and the [`xcube_stac`](https://github.com/xcube-dev/xcube-stac) plugin. You may install [`xcube_stac`](https://github.com/xcube-dev/xcube-stac) directly from the git repository by cloning the repository, directing into `xcube-stac`, and following the steps below:\n", | ||
"\n", | ||
"```bash\n", | ||
"conda env create -f environment.yml\n", | ||
"conda activate xcube-stac\n", | ||
"pip install .\n", | ||
"```\n", | ||
"\n", | ||
"Note that [`xcube`](https://xcube.readthedocs.io/en/latest/) is included in the `environment.yml`. \n", | ||
"\n", | ||
"Now, we first import everything we need:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 1, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from xcube.core.store import new_data_store, get_data_store_params_schema\n", | ||
"import itertools" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"First, we get the store parameters needed to initialize a STAC [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework). " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We determine the url of the [EcoDataCube.eu](https://stac.ecodatacube.eu/) STAC catalog and initiate a STAC [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework) where the `xcube-stac` plugin is recognized by setting the first argument to `\"stac\"` in the `new_data_store` function." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stderr", | ||
"output_type": "stream", | ||
"text": [ | ||
"/home/konstantin/micromamba/envs/xcube-stac/lib/python3.12/site-packages/pystac_client/client.py:190: NoConformsTo: Server does not advertise any conformance classes.\n", | ||
" warnings.warn(NoConformsTo())\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"url = \"https://s3.eu-central-1.wasabisys.com/stac/odse/catalog.json\"\n", | ||
"store = new_data_store(\"stac\", url=url)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"The data IDs point to a [STAC item's JSON](https://github.com/radiantearth/stac-spec/blob/master/item-spec/item-spec.md) and are specified by the segment of the URL that follows the catalog's URL. The data IDs can be streamed using the following code where we show the first 10 data IDs as an example." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 3, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"['lcv_land.mask_eumap/lcv_land.mask_eumap_2014.01.01..2016.12.31/lcv_land.mask_eumap_2014.01.01..2016.12.31.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.06.25..2000.09.12/lcv_blue_landsat.glad.ard_2000.06.25..2000.09.12.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.09.13..2000.12.01/lcv_blue_landsat.glad.ard_2000.09.13..2000.12.01.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.12.02..2001.03.20/lcv_blue_landsat.glad.ard_2000.12.02..2001.03.20.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2001.03.21..2001.06.24/lcv_blue_landsat.glad.ard_2001.03.21..2001.06.24.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2001.06.25..2001.09.12/lcv_blue_landsat.glad.ard_2001.06.25..2001.09.12.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2001.09.13..2001.12.01/lcv_blue_landsat.glad.ard_2001.09.13..2001.12.01.json',\n", | ||
" 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2001.12.02..2002.03.20/lcv_blue_landsat.glad.ard_2001.12.02..2002.03.20.json']" | ||
] | ||
}, | ||
"execution_count": 3, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"data_ids = store.get_data_ids()\n", | ||
"list(itertools.islice(data_ids, 10))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In the next step, we can search for items using search parameters. The following code shows which search parameters are available." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 4, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"application/json": { | ||
"additionalProperties": false, | ||
"properties": { | ||
"bbox": { | ||
"items": [ | ||
{ | ||
"type": "number" | ||
}, | ||
{ | ||
"type": "number" | ||
}, | ||
{ | ||
"type": "number" | ||
}, | ||
{ | ||
"type": "number" | ||
} | ||
], | ||
"title": "Bounding box [x1,y1,x2,y2] in geographical coordinates", | ||
"type": "array" | ||
}, | ||
"collections": { | ||
"description": "Collection IDs to be included in the search request.", | ||
"items": { | ||
"minLength": 0, | ||
"type": "string" | ||
}, | ||
"title": "Collection IDs", | ||
"type": "array", | ||
"uniqueItems": true | ||
}, | ||
"time_range": { | ||
"description": "Time range given as pair of start and stop dates. Dates must be given using format 'YYYY-MM-DD'. Start and stop are inclusive.", | ||
"items": [ | ||
{ | ||
"format": "date", | ||
"type": [ | ||
"string", | ||
"null" | ||
] | ||
}, | ||
{ | ||
"format": "date", | ||
"type": [ | ||
"string", | ||
"null" | ||
] | ||
} | ||
], | ||
"title": "Time Range", | ||
"type": "array" | ||
} | ||
}, | ||
"type": "object" | ||
}, | ||
"text/plain": [ | ||
"<xcube.util.jsonschema.JsonObjectSchema at 0x71d3deb4b2f0>" | ||
] | ||
}, | ||
"execution_count": 4, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"search_params = store.get_search_params_schema()\n", | ||
"search_params" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Now, let's search for Landsat Thematic Mapper data for the European region during the first quarter of 2000." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 5, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"[{'data_id': 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20/lcv_blue_landsat.glad.ard_1999.12.02..2000.03.20.json',\n", | ||
" 'data_type': 'dataset',\n", | ||
" 'bbox': [-23.550818268711048,\n", | ||
" 24.399543432891665,\n", | ||
" 63.352379098951936,\n", | ||
" 77.69295185585888],\n", | ||
" 'time_range': ['1999-12-02', '2000-03-20']},\n", | ||
" {'data_id': 'lcv_blue_landsat.glad.ard/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24/lcv_blue_landsat.glad.ard_2000.03.21..2000.06.24.json',\n", | ||
" 'data_type': 'dataset',\n", | ||
" 'bbox': [-23.550818268711048,\n", | ||
" 24.399543432891665,\n", | ||
" 63.352379098951936,\n", | ||
" 77.69295185585888],\n", | ||
" 'time_range': ['2000-03-21', '2000-06-24']}]" | ||
] | ||
}, | ||
"execution_count": 5, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"descriptors = list(store.search_data(\n", | ||
" collections=[\"lcv_blue_landsat.glad.ard\"],\n", | ||
" bbox=[-10, 40, 40, 70],\n", | ||
" time_range=[\"2000-01-01\", \"2000-04-01\"]\n", | ||
"))\n", | ||
"[d.to_dict() for d in descriptors]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"In the next step, we can open the data for each data ID. (Note that this is not fully implemented yet. So far we can access assets which will give the href to the data resource). The following code shows which parameters are available for opening the data." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 6, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"application/json": { | ||
"additionalProperties": false, | ||
"properties": { | ||
"asset_names": { | ||
"description": "Names of assets which will be included in the data cube.", | ||
"items": { | ||
"minLength": 0, | ||
"type": "string" | ||
}, | ||
"title": "Names of assets", | ||
"type": "array", | ||
"uniqueItems": true | ||
} | ||
}, | ||
"type": "object" | ||
}, | ||
"text/plain": [ | ||
"<xcube.util.jsonschema.JsonObjectSchema at 0x71d31f6732f0>" | ||
] | ||
}, | ||
"execution_count": 6, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"open_params = store.get_open_data_params_schema()\n", | ||
"open_params" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"We select the Band 1 (blue) and get the corresponding assets and the corresponding hrefs pointing to the data resources by running the following code." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"data": { | ||
"text/plain": [ | ||
"['https://s3.eu-central-1.wasabisys.com/eumap/lcv/lcv_blue_landsat.glad.ard_p50_30m_0..0cm_1999.12.02..2000.03.20_eumap_epsg3035_v1.1.tif',\n", | ||
" 'https://s3.eu-central-1.wasabisys.com/eumap/lcv/lcv_blue_landsat.glad.ard_p50_30m_0..0cm_2000.03.21..2000.06.24_eumap_epsg3035_v1.1.tif']" | ||
] | ||
}, | ||
"execution_count": 7, | ||
"metadata": {}, | ||
"output_type": "execute_result" | ||
} | ||
], | ||
"source": [ | ||
"asset_collection = []\n", | ||
"for descriptor in descriptors:\n", | ||
" assets = store.open_data(descriptor.data_id, asset_names=[\"blue_p50\"])\n", | ||
" assert len(assets) == 1\n", | ||
" asset_collection.append(assets[0])\n", | ||
"[asset.href for asset in asset_collection]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"This notebook will be continued once the data access is implemented." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.12.3" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
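
The crawl-and-match search described in the notebook's introduction can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the `xcube-stac` implementation: items are simplified dictionaries rather than STAC item objects, the helper names (`bboxes_intersect`, `time_ranges_overlap`, `item_matches`, `search_items`) are hypothetical, and the sample records reuse the bounding box and time ranges shown in the notebook output (the land-mask record's bbox and the third Landsat record's bbox are assumed for illustration).

```python
from datetime import date


def bboxes_intersect(a, b):
    """Return True if two [x1, y1, x2, y2] boxes overlap (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def _parse(day):
    """Parse 'YYYY-MM-DD'; None stands for an open-ended bound."""
    return date.fromisoformat(day) if day is not None else None


def time_ranges_overlap(a, b):
    """Return True if two inclusive (start, stop) date pairs overlap."""
    a_start, a_stop = _parse(a[0]), _parse(a[1])
    b_start, b_stop = _parse(b[0]), _parse(b[1])
    starts_in_time = a_start is None or b_stop is None or a_start <= b_stop
    ends_in_time = a_stop is None or b_start is None or b_start <= a_stop
    return starts_in_time and ends_in_time


def item_matches(item, collections=None, bbox=None, time_range=None):
    """Check one simplified item record against the search parameters."""
    if collections is not None and item["collection"] not in collections:
        return False
    if bbox is not None and not bboxes_intersect(item["bbox"], bbox):
        return False
    if time_range is not None and not time_ranges_overlap(
        item["time_range"], time_range
    ):
        return False
    return True


def search_items(items, **search_params):
    """Lazily yield every crawled item that matches the search parameters."""
    return (item for item in items if item_matches(item, **search_params))


# Illustrative records; the bbox is the one from the notebook output.
EUROPE = [-23.550818268711048, 24.399543432891665,
          63.352379098951936, 77.69295185585888]
BLUE = "lcv_blue_landsat.glad.ard"
ITEMS = [
    {"collection": "lcv_land.mask_eumap", "bbox": EUROPE,
     "time_range": ("2014-01-01", "2016-12-31")},
    {"collection": BLUE, "bbox": EUROPE,
     "time_range": ("1999-12-02", "2000-03-20")},
    {"collection": BLUE, "bbox": EUROPE,
     "time_range": ("2000-03-21", "2000-06-24")},
    {"collection": BLUE, "bbox": EUROPE,
     "time_range": ("2000-06-25", "2000-09-12")},
]

hits = list(search_items(
    ITEMS,
    collections=[BLUE],
    bbox=[-10, 40, 40, 70],
    time_range=["2000-01-01", "2000-04-01"],
))
print([hit["time_range"] for hit in hits])
```

Because every item has to be visited and tested, the cost grows with the size of the catalog, which is why searching a non-searchable catalog is slow. Returning a generator lets a caller stop early, for example with `itertools.islice`.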