Plan the component #1

Closed
mdpiper opened this issue Mar 31, 2021 · 9 comments

@mdpiper
Member

mdpiper commented Mar 31, 2021

I'd like to make the planning and development of this project as open and as public as possible. GitHub provides a good place for this.

Goal

Make a CSDMS data component that allows a user to read data (and metadata) from a GeoTIFF file. I estimate this project will take about a week to complete.

Overview

  • Start with a GeoTIFF file on the local filesystem. This component won't fetch data from the internet. Update: No, a URL to a remote file can also be used.
  • Use xarray.open_rasterio to load the data and metadata from the file into memory. This implies raster data only.
  • No CLI, only API and BMI.

Outline

  • Package name: bmi_geotiff
  • Module names: io.py, bmi.py
  • Write the library and tests iteratively
  • Write the BMI and run bmi-tester iteratively
  • Make an example, in two parts:
    1. a script that can be run at the shell prompt (just API, save BMI for pymt implementation)
    2. a Jupyter Notebook based on the script
  • README
    • include install instructions
    • include the script example
  • Documentation
    • include README
    • include Jupyter Notebook example converted to rst
    • deploy to Read the Docs
  • CI with GitHub Actions
  • Stamp release using zest.releaser
  • TestPyPI
  • PyPI
  • Make a conda-forge package
  • DOI
  • Babelize
  • Make a pymt package
  • Add to CSDMS Model Repository (this will populate Data Components page)
  • Twitter announcement
@mdpiper mdpiper mentioned this issue Mar 31, 2021
20 tasks
@mdpiper
Member Author

mdpiper commented Mar 31, 2021

I want to try using [options.package_data] instead of a MANIFEST.in file. See the xarray setup.cfg for a working example.

Update: I was mistaken. What I really want to use is setuptools-scm.

@mdpiper
Member Author

mdpiper commented Mar 31, 2021

Call the library module io.py instead of geotiff.py.

@mdpiper
Member Author

mdpiper commented Apr 1, 2021

A GeoTIFF typically stores raster imagery. The data could be many things: elevations, reflectances, intensities, etc. What should the Standard Name for these data be? Candidates:

  • image__raster_data (image__aspect_ratio is in the registry)
  • gis_raster__data (could also have gis_vector__data)

It would also be good to have a Standard Name for the projection string, although this will be obsolete with csdms/bmi#80. Candidates:

  • map_projection__string
  • coordinate_reference_system__descriptor

@mdpiper
Member Author

mdpiper commented Apr 1, 2021

The xarray.open_rasterio function automatically returns rectilinear data, so the BMI grid type can be rectilinear.
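As a rough illustration of why a rectilinear grid fits, the sketch below builds the one-dimensional x and y coordinate vectors for a north-up raster from its origin and pixel size. The function name `pixel_center_coords` and the sample numbers are made up for this example. It assumes an unrotated affine transform and pixel-center coordinates, and it is not code from bmi_geotiff.

```python
def pixel_center_coords(origin, pixel_size, count):
    """Return pixel-center coordinates along one axis of an unrotated raster.

    origin     -- coordinate of the outer edge of the first pixel
    pixel_size -- signed pixel size (negative for a north-up y axis)
    count      -- number of pixels along the axis
    """
    return [origin + (i + 0.5) * pixel_size for i in range(count)]


# Hypothetical 4-column by 3-row raster with 30 m pixels.
x = pixel_center_coords(origin=500000.0, pixel_size=30.0, count=4)
y = pixel_center_coords(origin=4100000.0, pixel_size=-30.0, count=3)

# A rectilinear grid is described by its shape plus these two vectors.
grid_shape = (len(y), len(x))
```

Two short vectors describe the whole grid, so no per-node coordinate arrays are needed.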

@mdpiper
Member Author

mdpiper commented Apr 1, 2021

A file could hold a time series of images, although this isn't typical. I need to think about how to handle this. Is there a standard?

This is another use case for an is_steady_state function in the BMI (see csdms/bmi#79) because this would change the way we interpret the rank of the data and the way we use get_value.

Idea: a boolean is_time_series entry in the BMI config file and a _time attribute in BmiGeoTiff.

For now, neglect time information in the BMI.
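A minimal sketch of the idea above, assuming the config has already been parsed into a dict. The class name `BmiGeoTiffSketch`, the `path` key, and the choice to report a constant time of zero are assumptions for illustration only; just `is_time_series` and `_time` come from the note above.

```python
class BmiGeoTiffSketch:
    """Toy stand-in for BmiGeoTiff; not the real class."""

    def __init__(self, config):
        # `is_time_series` is the boolean config entry floated above.
        self._is_time_series = bool(config.get("is_time_series", False))
        # `_time` stays None while time information is neglected.
        self._time = 0.0 if self._is_time_series else None

    def get_current_time(self):
        # Assumption: with no time axis, report a constant time of zero.
        return 0.0 if self._time is None else self._time


static = BmiGeoTiffSketch({"path": "example.tif"})
series = BmiGeoTiffSketch({"path": "example.tif", "is_time_series": True})
```

Defaulting the flag to false keeps the "neglect time for now" behavior unless a config opts in.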

@mdpiper
Member Author

mdpiper commented Apr 23, 2021

When setting up the conda-forge recipe, try using PyPI for the source tarball instead of GitHub.

@mdpiper
Member Author

mdpiper commented Apr 23, 2021

Should I use rioxarray instead of xarray? rioxarray does reprojection. It wouldn't be hard to make the switch. See http://xarray.pydata.org/en/stable/io.html#rasterio

Should I use rasterio instead of xarray?

@mdpiper
Member Author

mdpiper commented Apr 23, 2021

Idea: The BMI should always address single bands. Never load multiband data (even RGB) all at once. Use time and the update and update_until methods.

On the other hand, xarray does a lazy load of data from disk, so there may be no harm in keeping all bands.

Can I update_until a negative time?
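One way the single-band idea could look, as a sketch only: the band index doubles as the BMI time, and `update` / `update_until` move between bands. The class name `BandStepper` and the nested-list "bands" are hypothetical, and rejecting negative times is one possible answer to the question above, not a decision.

```python
class BandStepper:
    """Expose one band at a time, using the band index as BMI time."""

    def __init__(self, bands):
        self._bands = bands  # sequence of 2D arrays, one per band
        self._time = 0       # index of the band currently exposed

    def get_current_time(self):
        return float(self._time)

    def get_end_time(self):
        return float(len(self._bands) - 1)

    def get_value(self):
        # Only the current band is ever handed to the caller.
        return self._bands[self._time]

    def update(self):
        self.update_until(self._time + 1)

    def update_until(self, time):
        index = int(time)
        # Assumption: out-of-range (including negative) times are rejected.
        if not 0 <= index < len(self._bands):
            raise ValueError(f"time {time} is outside 0..{len(self._bands) - 1}")
        self._time = index


# Three tiny 1x2 "bands" standing in for the R, G, B layers of an image.
rgb = BandStepper([[[255, 0]], [[0, 255]], [[7, 7]]])
rgb.update()           # now on band 1
rgb.update_until(2)    # now on band 2
```

As written, `update_until` can jump to any valid band, including an earlier one; whether stepping backward should be allowed is left open.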

@mdpiper mdpiper pinned this issue Nov 1, 2021
@mdpiper mdpiper changed the title from "Planning the component" to "Plan the component" Nov 1, 2021
@mdpiper mdpiper closed this as completed Jul 8, 2022