-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement open_data and support tiff, netcdf and zarr for s3 and https protocol; Implement stack-mode which stacks items containing tiffs; #19
Conversation
start pr draft for documentation
work in progress notebooks work again backup unittest adjusted error in dependencies test re-recorded cassettes for test_open_data_abfs test adjusted xcube server STAC example to open geotiff and levels mldataset now supported data_type implemented simplified data opener
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good! I have actually no serious complaints. Almost all my comments and suggestions regard style & design, only the one regarding the store ctor is mandatory.
You should add a coverage badge.
environment_workflow.yml
Outdated
dependencies: | ||
# Python | ||
- python >=3.9 | ||
# Required xcube |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Required xcube | |
# Required by xcube |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
environment_workflow.yml
Outdated
- urllib3 >=1.26 | ||
- xarray >=2022.6 | ||
- zarr >=2.11 | ||
# Required xcube-stac |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Required xcube-stac | |
# Required by xcube-stac |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
xcube_stac/utils.py
Outdated
from .constants import DATA_OPENER_IDS, FloatInt, MAP_MIME_TYP_FORMAT | ||
|
||
|
||
def _get_assets_from_item( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand you are using underscore prefixes to indicate that the functions in this module are implementation details and not API.
However, underscored variables, functions, and classes should not be imported. Even by own modules (exceptions exists, e.g. testing). You can underscore, by renaming utils.py
into _utils.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
xcube_stac/mldataset.py
Outdated
self._datasets.append(_apply_scaling_nodata(ds, self._items)) | ||
|
||
def _get_num_levels_lazily(self) -> int: | ||
return len(self._datasets) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return len(self._datasets) | |
return len(self._resolutions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
return len(self._datasets) | ||
|
||
def _get_dataset_lazily(self, index: int, parameters) -> xr.Dataset: | ||
return self._datasets[index] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return self._datasets[index] | |
ds = odc.stac.load( | |
self._items, | |
resolution=self._resolutions[index], | |
**self._open_params, | |
) | |
return _apply_scaling_nodata(ds, self._items) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see below.
xcube_stac/impl.py
Outdated
} | ||
|
||
|
||
class SingleImplementation: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single Implementation of what? I recommend finding a better name of what is implemented here, for example, the Store Mode. Then StoreMode
would describe the purely abstract interface, and SingleStoreMode
represents its implementation for the mode Single.
class SingleImplementation: | |
class SingleStoreMode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the name to SingleStoreMode
. To be consistent I also changed to name from StackImplementation
to StackStoreMode
.
xcube_stac/impl.py
Outdated
f"Only 's3' and 'https' protocols are supported, not {protocol!r}. " | ||
f"The asset {asset.extra_fields['id']!r} has a href {asset.href!r}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"Only 's3' and 'https' protocols are supported, not {protocol!r}. " | |
f"The asset {asset.extra_fields['id']!r} has a href {asset.href!r}." | |
f"Only 's3' and 'https' protocols are supported, not {protocol!r}." | |
f" The asset {asset.extra_fields['id']!r} has a href {asset.href!r}." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
xcube_stac/impl.py
Outdated
else: | ||
if not self._s3_accessor.root == root: | ||
LOG.debug( | ||
f"The bucket {self._s3_accessor.root!r} of the " | ||
f"S3 object storage changed to {root!r}. " | ||
"A new s3 data opener will be initialized." | ||
) | ||
self._s3_accessor = S3DataAccessor( | ||
root, storage_options=storage_options | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else: | |
if not self._s3_accessor.root == root: | |
LOG.debug( | |
f"The bucket {self._s3_accessor.root!r} of the " | |
f"S3 object storage changed to {root!r}. " | |
"A new s3 data opener will be initialized." | |
) | |
self._s3_accessor = S3DataAccessor( | |
root, storage_options=storage_options | |
) | |
elif self._s3_accessor.root != root: | |
LOG.debug( | |
f"The bucket {self._s3_accessor.root!r} of the " | |
f"S3 object storage changed to {root!r}. " | |
"A new s3 data opener will be initialized." | |
) | |
self._s3_accessor = S3DataAccessor( | |
root, storage_options=storage_options | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
xcube_stac/impl.py
Outdated
else: | ||
if not self._https_accessor.root == root: | ||
LOG.debug( | ||
f"The root {self._https_accessor.root!r} of the " | ||
f"https data opener changed to {root!r}. " | ||
"A new https data opener will be initialized." | ||
) | ||
self._https_accessor = HttpsDataAccessor(root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else: | |
if not self._https_accessor.root == root: | |
LOG.debug( | |
f"The root {self._https_accessor.root!r} of the " | |
f"https data opener changed to {root!r}. " | |
"A new https data opener will be initialized." | |
) | |
self._https_accessor = HttpsDataAccessor(root) | |
elif self._https_accessor.root != root: | |
LOG.debug( | |
f"The root {self._https_accessor.root!r} of the " | |
f"https data opener changed to {root!r}. " | |
"A new https data opener will be initialized." | |
) | |
self._https_accessor = HttpsDataAccessor(root) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
elif stack_mode is True or stack_mode == "odc-stac": | ||
self._impl = StackImplementation( | ||
self._catalog, self._url_mod, self._searchable, self._storage_options_s3 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Else? Since it is possible to instantiate the store via ctor, you should raise a ValueError()
in the else-case. Otherwise _impl
will not be initialised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data store parameters are validated against the schema in the xcube data store framework. the schema is define in constants.py and this is tested in (test_store.py)[https://github.com/xcube-dev/xcube-stac/blob/konstntokas-xxx-implement_stac_mode/test/test_store.py#L90].
However, I agree, if a user would initialize a stac data store directly by calling the class StacDataStore
, then this is not covered. I therefore added a else case.
Let's address this separately. |
I understood the problem better, and it is on the xcube-stac side. I agree addressing it seperately. It is documented in #20. |
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️ |
Coverage badge is added. See https://app.codecov.io/gh/xcube-dev/xcube-stac/tree/konstntokas-xxx-implement_stac_mode. I need to double check the links once it is merged. |
Closes #3, #8
1. Implement open_data and support tiff, netcdf and zarr for s3 and https protocol
I suggest to start with the notebooks in the folder examples as an introduction.
Here, I encountered a problem with the levels file in the server demo in xcube. When I try to open the levels file, I get a mldataset instance back. Wen I then try to open one level, an error occurs in xcube_server_stac_s3.ipynb cell 12. I suspect that when starting the server, the
.levels
file with Identifierlocal
is used to calculatelocal_1w
in resample_in_time.py, which adds<dim>_bnds
data variables to thexr.Dataset
, which in turn brings problems in levels. Any idea to mitigate that problem without changing the xcube server demo example?Further I changes the github action workflow so that the the current xcube repo is checked out. Furthermore, I run the xcube server to get a higher test coverage. The unit tests work with decorators and are only run if the server is running. So for the later development, the server can be easily turned of, if needed.
Filtering the asset is not implemented yet. It is documented in Issue #18. I see this as a nice add-on for later. Also the stac-spec of an asset requires only href.
2. Implement stack-mode which stacks items containing tiffs
The data store parameter <stack_mode> is added. If True, the collection ID becomes the <data_id>. Items will selected with matches a bounding box and time range. These items will be stacked using odc-stac. So far only tiffs are supported. Reading of data in odc-stac relies on rasterio, which is based on GDAL. It should support formats, if a GDAL driver exists. Examples of other formats will be added in the future.