David LeBauer edited this page Jan 9, 2018 · 26 revisions

Data products

See notes on the status of data product reviews: https://docs.google.com/document/d/1LqIVYYlU47Q0H1R_F0hhAm3xlydRzlHpH8QjlfP-K24/edit

  • Need to purge deprecated datasets
  • Migrate Globus from ROGER to Nebula

PSII

need equations to compute traits

https://github.com/terraref/computing-pipeline/issues/216

Texture

  • region of interest
  • needs to work on raw images + scale to plot (like other image analyses)

Multispectral NDVI/PRI

aggregate points and compute plot level means
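A minimal sketch of the aggregation step, assuming each point reading already carries a plot name. The NDVI and PRI formulas are the common normalized-difference definitions; the function names, plot names, and reflectance values are hypothetical, not the pipeline's.

```python
from collections import defaultdict
from statistics import mean


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def pri(r531, r570):
    """Photochemical Reflectance Index from reflectance at 531 nm and 570 nm."""
    return (r531 - r570) / (r531 + r570)


def plot_means(points):
    """Group (plot_name, value) pairs by plot and return the mean per plot."""
    by_plot = defaultdict(list)
    for plot_name, value in points:
        by_plot[plot_name].append(value)
    return {plot: mean(values) for plot, values in by_plot.items()}


# Hypothetical point readings: (plot name, NIR reflectance, red reflectance)
readings = [
    ("plot_A", 0.50, 0.10),
    ("plot_A", 0.60, 0.20),
    ("plot_B", 0.40, 0.40),
]
means = plot_means((plot, ndvi(nir, red)) for plot, nir, red in readings)
```

The same `plot_means` call works for PRI by swapping in `pri` and the 531 nm / 570 nm readings.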

RGB stereotop

clip plot fragments and combine

https://github.com/terraref/computing-pipeline/issues/356

changes in position w/ temperature

need to add the temperature (T) correction defined here: https://github.com/terraref/reference-data/issues/161

Environmental Logger

https://github.com/terraref/reference-data/issues/179

Geostreams

  • exclude wavelength from geostreams dataset
  • improve irrigation (perhaps per-plot)

Laser Scanner

  • georeferenced or referenced to gantry xyz ply product
  • leaf angle high priority
  • height distribution: add skewness and kurtosis
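One possible way to add the two shape statistics to a height distribution, as a sketch only. It uses the population (biased) moment estimators and reports excess kurtosis (normal distribution = 0); the function name and sample heights are hypothetical, and the scanner pipeline may prefer a different estimator.

```python
from statistics import fmean


def height_shape(heights):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(heights)
    mu = fmean(heights)
    m2 = sum((h - mu) ** 2 for h in heights) / n  # variance
    m3 = sum((h - mu) ** 3 for h in heights) / n
    m4 = sum((h - mu) ** 4 for h in heights) / n
    if m2 == 0:
        raise ValueError("all heights are identical; shape is undefined")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


# Hypothetical per-plot heights: symmetric, then right-skewed
sym_skew, sym_kurt = height_shape([1, 2, 3, 4, 5])
right_skew, _ = height_shape([1, 1, 1, 5])
```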

netcdf

  • we may not need to keep variables in both geostreams and netcdf. netcdf can be used specifically for solar information
  • redundant variables: raw_ and sensor_ values (e.g. of par, co2, etc) are redundant with variables that have names like 'Atmospheric_CO2_concentration'.
    • Remove variables beginning in raw_ from the environmental logger netcdf files
    • variables beginning in sensor_ have no data
    • use consistent standard names
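A sketch of how the raw_ cleanup could be scripted: split the variable names by prefix and build a comma-separated list that could be passed to `ncks -x -v` to exclude them. Apart from 'Atmospheric_CO2_concentration', the variable names here are hypothetical, as is the helper name.

```python
def split_variables(names, drop_prefixes=("raw_",)):
    """Split variable names into (keep, drop) lists by prefix."""
    keep = [n for n in names if not n.startswith(drop_prefixes)]
    drop = [n for n in names if n.startswith(drop_prefixes)]
    return keep, drop


# Hypothetical variable names standing in for an environmental logger file
names = ["time", "raw_par", "raw_co2", "sensor_par", "Atmospheric_CO2_concentration"]
keep, drop = split_variables(names)

# Comma-separated list, e.g. for: ncks -x -v <exclude_arg> in.nc out.nc
exclude_arg = ",".join(drop)
```

Passing `drop_prefixes=("raw_", "sensor_")` would also drop the empty sensor_ variables, if that is the decision.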

to subset out just the spectral radiometer and other values needed for the hyperspectral workflow, something like

ncks -v time,flx_dwn,flx_sns,flx_spc_dwn,wvl_lgr,wvl_dlt \
          Level_1/EnvironmentLogger/2017-08-12/EnvironmentLogger_lv1_2017-08-12_uamac.nc  # outfile.nc?

Open question: which variables are needed for the hyperspectral workflow?

Lemnatec Metadata

https://github.com/terraref/reference-data/issues/176

  • use controlled vocabularies, e.g. as described in https://github.com/terraref/sensor-metadata/blob/master/README.md
    • provide dictionary for metadata fields
  • add a URL link to managements from the experiment_metadata field (??), similar to: select * from managements where treatment_id in (select id from treatments join traits join sites join experiments where experiments.name = ....)

RGB Masks

Might be very simple: export plot-level black/white masks from the canopy cover extractor.

https://github.com/terraref/computing-pipeline/issues/376
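A rough sketch of what a black/white mask export could look like. The canopy cover extractor's actual classification rule is not described here, so this substitutes a simple excess-green threshold (2G - R - B) as a stand-in; the threshold value, function names, and pixel values are all hypothetical.

```python
def excess_green(r, g, b):
    """Excess-green index, a stand-in for the extractor's plant/soil rule."""
    return 2 * g - r - b


def binary_mask(pixels, threshold=20):
    """Map rows of (R, G, B) pixels to 255 (plant, white) or 0 (other, black)."""
    return [
        [255 if excess_green(*px) > threshold else 0 for px in row]
        for row in pixels
    ]


def canopy_cover(mask):
    """Fraction of mask pixels that are white."""
    total = sum(len(row) for row in mask)
    white = sum(value == 255 for row in mask for value in row)
    return white / total


# Hypothetical 2x2 plot image as rows of (R, G, B) tuples
plot_pixels = [
    [(30, 120, 40), (120, 110, 100)],
    [(90, 80, 70), (40, 150, 30)],
]
mask = binary_mask(plot_pixels)
cover = canopy_cover(mask)
```

Exporting the mask alongside the cover fraction keeps the two consistent, since the fraction is computed from the mask itself.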

Clipping and analyzing pre-stitched level 1 products, aggregating to plot

https://github.com/terraref/computing-pipeline/issues/356

User Interfaces

Clowder

  • all sensor metadata should be publicly accessible (this was a bug?)
  • per-user, read-only API keys
  • enable search interface

QA/QC

Search

cross-platform search

Protocols

Testing

Wei Qin to develop protocol / testing framework with feedback from Zongyang, Rob Kooper, Craig, others

Submitting an extractor

  • write protocol for submitting an extractor that includes:
    • Pull requests
    • Tests
    • Protocol
    • quality statement

Revising existing extractors

  • need to make sure we have done code reviews, have tests, protocols, quality statements for extractors above

  • (Proposed) Format for READMEs

  • Extractor name:

  • Date: date documentation was created

  • Author: documentation and extractor author(s)

  • Extractor Description: One- or two-sentence English description of the extractor's purpose

  • Inputs and Outputs: Specific definition of the input and output of the extractor

  • Algorithm description: Long form description of the algorithm

  • Parameters: Any parameters that may need to be changed in the future

  • Failure Conditions: Known situations where the extractor might fail.
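If the proposed README format is adopted, a small check like the following could flag missing fields during review. This is only a sketch: it assumes each field appears as a "Label:" line, and the helper name and example README text are hypothetical.

```python
REQUIRED_FIELDS = [
    "Extractor name",
    "Date",
    "Author",
    "Extractor Description",
    "Inputs and Outputs",
    "Algorithm description",
    "Parameters",
    "Failure Conditions",
]


def missing_fields(readme_text):
    """Return the required field labels that have no 'Label:' line."""
    found = set()
    for line in readme_text.splitlines():
        # Drop leading bullet or heading characters, then take text before ':'
        label = line.lstrip(" •*-#").split(":", 1)[0].strip().lower()
        found.add(label)
    return [field for field in REQUIRED_FIELDS if field.lower() not in found]


# Hypothetical README that omits the last two fields
example_readme = """Extractor name: demo extractor
Date: 2018-01-09
Author: someone
Extractor Description: placeholder description
Inputs and Outputs: placeholder
Algorithm description: placeholder
"""
missing = missing_fields(example_readme)
```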

Field methods

  • how to use / store in BETYdb
  • how to archive / assign doi, share etc

Gantry

Uncertainty quantification:

Implement methods following https://docs.google.com/document/d/1hWqkowvopYqGkeckSWg-_JzN3-DIS36rCBCFI09Sqyk/edit and search for related issues.

Architecture

Latency issue

Computing on Nebula with data on ROGER is slow: https://github.com/terraref/computing-pipeline/issues/368

Extractor Changes

  • mount specific directories read-write (rw) on an as-needed basis