This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

TODO forman

Norman Fomferra edited this page Sep 28, 2016 · 18 revisions

This page lists ECT ideas, tasks, and issues that come to mind regarding the upcoming ECT release. Some of them can already be found in the ECT Issue Tracker, while others are temporary tasks that don't need a publicly visible issue.

Prioritization:

  • B - blocker: no release without this
  • H - high: severe loss of functionality if not considered
  • M - medium: if not now, maybe next release
  • L - low: nice to have

API requirements

Open

  • H: Write Op development guide including descriptions of the various properties to be used
  • H: milestone v2: Model image outputs for CLI (write image) and WebAPI (start image tile service)
  • M: Workflow: invoke() --> execute()
  • M: remove most of the unnecessary @op_input decorators
  • M: io: find_reader/find_writer: if no file format is given, extract the file extension and find all formats with that extension, then continue as if the format had been given.
  • M: Op: skip operation of type Class. Rather, allow for registration of bound methods (instance functions)
  • M: Op: derive additional operation header info: file, module, version, etc
  • L: Op: support positional arguments
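
The extension-based lookup proposed for find_reader/find_writer could look roughly like the sketch below. It is only an illustration: the FORMAT_EXTENSIONS registry, its entries, and the function name find_format_names are hypothetical and not part of the ECT API.

```python
import os
from typing import Dict, List, Optional

# Hypothetical registry (an assumption for this sketch):
# format name -> file extensions that format can handle.
FORMAT_EXTENSIONS: Dict[str, List[str]] = {
    "NETCDF4": [".nc"],
    "NETCDF3": [".nc"],
    "JSON": [".json"],
    "TEXT": [".txt"],
}


def find_format_names(file: str, format_name: Optional[str] = None) -> List[str]:
    """Return the candidate format names for `file`.

    If `format_name` is given, it is the only candidate (if registered).
    Otherwise every format registered for the file's extension is a candidate.
    """
    if format_name is not None:
        return [format_name] if format_name in FORMAT_EXTENSIONS else []
    # No format given: derive candidates from the file extension.
    ext = os.path.splitext(file)[1].lower()
    return [name for name, exts in FORMAT_EXTENSIONS.items() if ext in exts]
```

A caller would then pick a reader or writer from the returned candidates exactly as it would for an explicitly given format. When several formats share an extension, as in the made-up ".nc" entries above, the caller still has to decide which one to try first.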

Done

  • B: Use NodePort.has_value when converting workflow to JSON (blocker because executed workspaces may include non-serializable values)
  • H: Ops: make sure all ops use "simple" values which can be converted from/to text or JSON. The @op_input decorators could make appropriate checks.
  • H: Ops: tag them all!
  • H: Ops: provide a "var" function, which will be important for workflows/workspaces: def variable(ds:Dataset, name:str) -> DataArray AND/OR introduce a common op argument "variable: Union[None, str, List[str]]" which will be used to select variables
  • H: Workflow: add a __str__ method to all Nodes, so that the command "ect ws status" can nicely display the workspace's workflow state
  • M: extract magic constants from code and identify those which may become ECT configuration settings, mark them by # {{ect-config}}
  • M: Op: replace input property 'required' (bool) by 'position' (int)
  • M: Op: perform same input validation for ops and workflows
  • M: rename Monitor.NULL to Monitor.NONE
  • M: Op: @op(aliases=[name1, ...]) --> new solution: use short name "XXX" for "ect.ops.YYY.XXX"
  • M: util/io/cli: we should harmonize/revise time range usage, e.g. use of TimeRange tuple, use of date and datetime instances
  • L: So far we don't have our own exception types (partly DONE: we have WorkflowError, CommandError now)
  • L: WorkspaceManager: use base directory to resolve relative paths

CLI requirements

Open

  • B: "ect res open" should allow for accessing a local dataset comprising multiple files (#41)
  • H: "ect res plot" should be able to also plot multi-variables, plots shall include images (or "ect res imshow")
  • H: service logging shall not occur on stdout if it is called by other than ect-webapi
  • H: "ect ws list" should list all open workspaces & if they are modified/saved
  • H: "ect res rename N1 N2"
  • H: "ect ds sync ..." --> should be able to configure the ECT data root directory, by default it is '~/.ect/data_stores' (#41)
  • H: CLI needs auto-completion for many commands that require input of long, unhandy names (data sources, operations, variables)
  • M: "ect help" should list all the sub-commands at once
  • M: rename "ect ws clean" into "ect ws clear"
  • M: Adapt progress bar length to terminal size, use shutil.get_terminal_size()
  • M: "ect ds del DS" to get rid of locally cached datasets
  • M: "ect ds info DS" should print sync status (temporal coverage required) incl. local data allocation
  • M: "ect ds dashboard" --> ASCII art time coverage overview, cool!
  • L: "ect res read NAME FILE [FORMAT] ..." --> Support reader-specific arguments (...)
  • L: "ect op register [--global|-g] WORKFLOW"
  • L: "ect ds list -q variable:temperature" --> use Lucene/Solr-like query syntax
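
For the progress bar item above, a minimal sketch of sizing the bar with shutil.get_terminal_size() might look as follows. The function name render_progress_bar, its parameters, and the bar layout are assumptions made for illustration, not the existing ECT monitor code.

```python
import shutil
from typing import Optional


def render_progress_bar(fraction: float, label: str = "",
                        columns: Optional[int] = None) -> str:
    """Render a one-line text progress bar that fits the terminal width.

    `columns` overrides the detected width (handy for testing); if omitted,
    the width comes from shutil.get_terminal_size() with an 80x24 fallback.
    """
    if columns is None:
        columns = shutil.get_terminal_size(fallback=(80, 24)).columns
    # Clamp the progress fraction to the range [0, 1].
    fraction = min(max(fraction, 0.0), 1.0)
    percent = "%3d%%" % round(fraction * 100)
    # Reserve space for the label, the percentage, " [" and "] ",
    # plus one spare column so the line never wraps.
    bar_width = max(columns - len(label) - len(percent) - 5, 1)
    filled = int(bar_width * fraction)
    return "%s [%s%s] %s" % (label, "#" * filled, "-" * (bar_width - filled), percent)
```

Writing the result with a leading "\r" and no trailing newline would redraw the bar in place. Because the width is queried on every call, the bar would also follow terminal resizes between updates.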

Done

  • B: when an ect command starts the service and then fails to execute the actual command, the service is still up and running --> auto close required, must detect service inactivity
  • H: "ect res plot" blocks the caller --> service response timeout
  • H: "ect res del"
  • H: "ect run" --> if return type of op is NoneType, we should not write anything to terminal
  • H: "ect res set" shall validate OP arguments #25
  • H: "ect res set" may overwrite existing res, if possible
  • H: bug: "ect op list --tag" still expects wildcard pattern
  • H: Syncing ds should occur automatically when DS[,START[,END]] is used
  • H: Print available format names, ideally detect input and/or output format (reader and/or writer), #17
  • H: "ect ws status [WS]" must print nicely all workflow steps (SoW requirement!)
  • H: "ect op info OP" must print op parameters and return values
  • H: Catch exceptions when calling into ECT API functions, print user-friendly error messages, add -e option to print stack traces
  • H: Print data sources (from ODP) and a data source's variable names
  • H: "ect run --read p=2010_precipitation.nc --write ts.nc ect.ops.timeseries.timeseries ds=p lat=53 lon=10"
  • H: Workspaces: "ect init" "ect read p 2010_precipitation.nc" "ect set ts ect.ops.timeseries.timeseries ds=p lat=53 lon=10" "ect write ts ts.nc"
  • M: Harmonize all error messages and their formats
  • M: Use common query syntax to search and explore things that can be listed

WebAPI service requirements

Open

  • H: All "ect res" implementations in the WebAPI must execute asynchronously, therefore WebAPI requires a get_workspace_resource_state() which would be fed into a special Monitor

Done

  • H: service logging shall not occur on stdout if it is called by other than ect-webapi
  • H: WebAPI shall write log file, currently it prints to the console
  • B: when service process starts, its CWD remains the initial one, although the CLI's CWD changes which results in different interpretations of "."
  • H: service exceptions shall be reported by tracebacks

Other

Open

  • M: automated system testing approach (all)
  • H: giant ops clean-up (Janis)
    • filter --> select, also make sure aux-variables are included in the output dataset (option?)
    • make sure all ops use "simple" values which can be easily converted from/to text or JSON
    • tag all operations
    • use variable names 'ds', 'ds_' for xr.Dataset arguments
    • make all unit-tests fast

Done

  • H: Create software installer
  • H: gridtools geom arguments (Norman, Janis)
  • H: 'sphinx_autodoc_annotation', so that Python 3 type annotations go into docs (DONE)
  • H: cli with workspaces (Marco + Norman)