| 1 | +# Get started with the openEO Python Client: Client-Side Processing |
| 2 | + |
| 3 | +## Background |
| 4 | + |
| 5 | +The client-side processing functionality lets you test and use openEO and its processes locally, i.e. without any connection to an openEO back-end. |
| 6 | +It relies on the projects [openeo-pg-parser-networkx](https://github.com/Open-EO/openeo-pg-parser-networkx), which provides an openEO process graph parsing tool, and [openeo-processes-dask](https://github.com/Open-EO/openeo-processes-dask), which provides an Xarray and Dask implementation of most openEO processes. |
| 7 | + |
| 8 | +## Installation |
| 9 | + |
| 10 | +::: danger Important |
| 11 | + |
| 12 | +This feature requires ``Python>=3.9``. |
| 13 | + |
| 14 | +::: |
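Before installing, it can help to confirm that the active interpreter meets the version requirement stated above. The following is a minimal sketch using only the standard library; the helper name `python_is_supported` is hypothetical and not part of the openEO client.

```python
import sys

# Minimum version stated in this guide for client-side processing.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the stated minimum.

    Hypothetical helper for illustration; not part of the openEO client.
    """
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version is sufficient for client-side processing.")
else:
    print("Python >= 3.9 is required for client-side processing.")
```

Running this in the environment where you plan to install the client shows whether the `localprocessing` extra can be used there.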
| 15 | + |
| 16 | +The openEO Python client library can easily be installed with a tool like `pip`, for example: |
| 17 | + |
| 18 | +```shell script |
| 19 | +pip install "openeo[localprocessing]" |
| 20 | +``` |
| 21 | + |
| 22 | + |
| 23 | +## Usage |
| 24 | + |
| 25 | +Every openEO process graph relies on data which is typically provided by a cloud infrastructure (the openEO back-end). |
| 26 | +Client-side processing adds the possibility to read and use local netCDF, GeoTIFF, and Zarr files, as well as remote STAC Collections or Items, for your experiments. |
| 27 | + |
| 28 | +### STAC Collections and Items |
| 29 | +::: danger Important |
| 30 | + |
| 31 | +The provided examples using STAC rely on third-party STAC Catalogs, so we can't guarantee that the URLs will remain valid. |
| 32 | + |
| 33 | +::: |
| 34 | + |
| 35 | +With the `load_stac` process it's possible to load and use data provided by remote or local STAC Collections or Items. |
| 36 | +The following code snippet loads Sentinel-2 L2A data from a public STAC Catalog, restricted to a given spatial and temporal extent, a single band, and items with less than 50% cloud cover. |
| 37 | + |
| 38 | +```python |
| 39 | +from openeo.local import LocalConnection |
| 40 | +local_conn = LocalConnection("./") |
| 41 | + |
| 42 | +url = "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a" |
| 43 | +spatial_extent = {"west": 11, "east": 12, "south": 46, "north": 47} |
| 44 | +temporal_extent = ["2019-01-01", "2019-06-15"] |
| 45 | +bands = ["red"] |
| 46 | +properties = {"eo:cloud_cover": dict(lt=50)} |
| 47 | +s2_cube = local_conn.load_stac(url=url, |
| 48 | + spatial_extent=spatial_extent, |
| 49 | + temporal_extent=temporal_extent, |
| 50 | + bands=bands, |
| 51 | + properties=properties, |
| 52 | +) |
| 53 | +s2_cube.execute() |
| 54 | +``` |
| 55 | + |
| 56 | +When calling the `.execute()` method on a `DataCube` created from a `LocalConnection`, an `xarray.DataArray` object containing Dask arrays is returned: |
| 57 | + |
| 58 | +``` |
| 59 | + >>> s2_cube.execute() |
| 60 | + <xarray.DataArray 'stackstac-08730b1b5458a4ed34edeee60ac79254' (time: 177, |
| 61 | + band: 1, |
| 62 | + y: 11354, |
| 63 | + x: 8025)> |
| 64 | + dask.array<getitem, shape=(177, 1, 11354, 8025), dtype=float64, chunksize=(1, 1, 1024, 1024), chunktype=numpy.ndarray> |
| 65 | + Coordinates: (12/53) |
| 66 | + * time (time) datetime64[ns] 2019-01-02... |
| 67 | + id (time) <U24 'S2B_32TPR_20190102_... |
| 68 | + * band (band) <U3 'red' |
| 69 | + * x (x) float64 6.52e+05 ... 7.323e+05 |
| 70 | + * y (y) float64 5.21e+06 ... 5.096e+06 |
| 71 | + s2:product_uri (time) <U65 'S2B_MSIL2A_20190102... |
| 72 | + ... ... |
| 73 | + raster:bands object {'nodata': 0, 'data_type'... |
| 74 | + gsd int32 10 |
| 75 | + common_name <U3 'red' |
| 76 | + center_wavelength float64 0.665 |
| 77 | + full_width_half_max float64 0.038 |
| 78 | + epsg int32 32632 |
| 79 | + Attributes: |
| 80 | + spec: RasterSpec(epsg=32632, bounds=(600000.0, 4990200.0, 809760.0... |
| 81 | + crs: epsg:32632 |
| 82 | + transform: | 10.00, 0.00, 600000.00|\n| 0.00,-10.00, 5300040.00|\n| 0.0... |
| 83 | + resolution: 10.0 |
| 84 | +``` |
| 85 | + |
| 86 | +### Local Collections |
| 87 | + |
| 88 | +If you want to use our sample data, please clone this repository: |
| 89 | + |
| 90 | +```bash |
| 91 | +git clone https://github.com/Open-EO/openeo-localprocessing-data.git |
| 92 | +``` |
| 93 | + |
| 94 | +With the sample data in place, we can inspect the STAC metadata generated for the local files: |
| 95 | + |
| 96 | +```python |
| 97 | +from openeo.local import LocalConnection |
| 98 | +local_data_folders = [ |
| 99 | + "./openeo-localprocessing-data/sample_netcdf", |
| 100 | + "./openeo-localprocessing-data/sample_geotiff", |
| 101 | +] |
| 102 | +local_conn = LocalConnection(local_data_folders) |
| 103 | +local_conn.list_collections() |
| 104 | +``` |
| 105 | +This code will parse the metadata content of each netCDF, geoTIFF or ZARR file in the provided folders and return a JSON object containing the STAC representation of the metadata. |
| 106 | +If this code is run in a Jupyter Notebook, the metadata will be rendered nicely. |
| 107 | + |
| 108 | +::: tip |
| 109 | +The code expects local files to have a structure similar to the [sample files](https://github.com/Open-EO/openeo-localprocessing-data.git). |
| 110 | +If the code cannot handle your particular netCDF, you can modify the [function that reads the metadata](https://github.com/Open-EO/openeo-python-client/blob/master/openeo/local/collections.py) and the [function that reads the data](https://github.com/Open-EO/openeo-python-client/blob/master/openeo/local/processing.py). |
| 111 | +::: |
| 112 | +### Local Processing |
| 113 | + |
| 114 | +Let's start with the provided sample netCDF of Sentinel-2 data: |
| 115 | +```python |
| 116 | +local_collection = "openeo-localprocessing-data/sample_netcdf/S2_L2A_sample.nc" |
| 117 | +s2_datacube = local_conn.load_collection(local_collection) |
| 118 | +``` |
| 119 | +``` |
| 120 | +>>> # Check if the data is loaded correctly |
| 121 | +>>> s2_datacube.execute() |
| 122 | +<xarray.DataArray (bands: 5, t: 12, y: 705, x: 935)> |
| 123 | +dask.array<stack, shape=(5, 12, 705, 935), dtype=float32, chunksize=(1, 12, 705, 935), chunktype=numpy.ndarray> |
| 124 | +Coordinates: |
| 125 | + * t (t) datetime64[ns] 2022-06-02 2022-06-05 ... 2022-06-27 2022-06-30 |
| 126 | + * x (x) float64 6.75e+05 6.75e+05 6.75e+05 ... 6.843e+05 6.843e+05 |
| 127 | + * y (y) float64 5.155e+06 5.155e+06 5.155e+06 ... 5.148e+06 5.148e+06 |
| 128 | + crs |S1 ... |
| 129 | + * bands (bands) object 'B04' 'B03' 'B02' 'B08' 'SCL' |
| 130 | +Attributes: |
| 131 | + Conventions: CF-1.9 |
| 132 | + institution: openEO platform - Geotrellis backend: 0.9.5a1 |
| 133 | + description: |
| 134 | + title: |
| 135 | +``` |
| 136 | + |
| 137 | +As you can see in the previous example, we call `.execute()`, which executes the generated openEO process graph locally. |
| 138 | +In this case, the process graph consists of only a single `load_collection`, which loads the data lazily. This first step lets you check whether openEO reads the data correctly. |
| 139 | + |
| 140 | +Looking at the metadata of this netCDF sample, we can see that it contains the bands B04, B03, B02, B08 and SCL. |
| 141 | +We can also see that it contains more than one time step and that it covers the month of June 2022. |
| 142 | + |
| 143 | +We can now run a simple processing step for demo purposes: let's compute the median NDVI over time and visualize the result. |
| 144 | + |
| 145 | +```python |
| 146 | +b04 = s2_datacube.band("B04") |
| 147 | +b08 = s2_datacube.band("B08") |
| 148 | +ndvi = (b08 - b04) / (b08 + b04) |
| 149 | +ndvi_median = ndvi.reduce_dimension(dimension="t", reducer="median") |
| 150 | +result_ndvi = ndvi_median.execute() |
| 151 | +result_ndvi.plot.imshow(cmap="Greens") |
| 152 | +``` |
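To make the computation above concrete, here is a minimal pure-Python sketch of what the NDVI formula and the median reduction over `t` amount to for a single pixel. It is an illustration only, not how openeo-processes-dask implements these processes; the reflectance values are made up, the helper names `ndvi` and `median_ndvi` are hypothetical, and skipping undefined values is an assumption of this sketch.

```python
from statistics import median


def ndvi(nir, red):
    """Normalized difference (nir - red) / (nir + red); None if the sum is zero."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def median_ndvi(nir_series, red_series):
    """Median NDVI over time for one pixel.

    Assumption of this sketch: undefined NDVI values are skipped.
    """
    values = [ndvi(n, r) for n, r in zip(nir_series, red_series)]
    valid = [v for v in values if v is not None]
    return median(valid) if valid else None


# Made-up reflectance values for one pixel at three dates (illustration only).
red_series = [0.25, 0.5, 0.125]   # stands in for band B04
nir_series = [0.75, 0.5, 0.375]   # stands in for band B08

print(median_ndvi(nir_series, red_series))
```

In the datacube version, the same arithmetic is applied to every pixel at once, and `reduce_dimension` collapses the `t` dimension so that only the spatial dimensions remain.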
| 153 | + |
| 154 | +We can run the same example using data provided by a STAC Collection: |
| 155 | + |
| 156 | + |
| 157 | +```python |
| 158 | +from openeo.local import LocalConnection |
| 159 | +local_conn = LocalConnection("./") |
| 160 | + |
| 161 | +url = "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a" |
| 162 | +spatial_extent = {"east": 11.40, "north": 46.52, "south": 46.46, "west": 11.25} |
| 163 | +temporal_extent = ["2022-06-01", "2022-06-30"] |
| 164 | +bands = ["red", "nir"] |
| 165 | +properties = {"eo:cloud_cover": dict(lt=80)} |
| 166 | +s2_datacube = local_conn.load_stac( |
| 167 | + url=url, |
| 168 | + spatial_extent=spatial_extent, |
| 169 | + temporal_extent=temporal_extent, |
| 170 | + bands=bands, |
| 171 | + properties=properties, |
| 172 | +) |
| 173 | + |
| 174 | +b04 = s2_datacube.band("red") |
| 175 | +b08 = s2_datacube.band("nir") |
| 176 | +ndvi = (b08 - b04) / (b08 + b04) |
| 177 | +ndvi_median = ndvi.reduce_dimension(dimension="time", reducer="median") |
| 178 | +result_ndvi = ndvi_median.execute() |
| 179 | +``` |
| 180 | + |
| 181 | +## Client-Side Processing Example Notebooks |
| 182 | +* [From the openEO Python Client repo](https://github.com/Open-EO/openeo-python-client/tree/master/examples/notebooks/Client_Side_Processing) |
| 183 | +* [From the Cubes and Clouds repo](https://github.com/EO-College/cubes-and-clouds/blob/main/lectures/3.1_data_processing/exercises/_alternatives/31_data_processing_stac.ipynb) |
| 184 | + |
| 185 | +## Additional Information |
| 186 | + |
| 187 | +Additional information and resources about the openEO Python Client Library: |
| 188 | + |
| 189 | +* [Official openEO Python Client Library Documentation](https://open-eo.github.io/openeo-python-client/) |
| 190 | +* [Official openeo.cloud sample notebooks](https://github.com/openEOPlatform/sample-notebooks) |
| 191 | +* [Example Python scripts](https://github.com/Open-EO/openeo-python-client/tree/master/examples) |
| 192 | +* [Example Jupyter Notebooks](https://github.com/Open-EO/openeo-python-client/tree/master/examples/notebooks) |
| 193 | +* [Repository on GitHub](https://github.com/Open-EO/openeo-python-client) |
| 194 | +* [Run openEO processes in a Python Shiny App](./shiny.md) |