This repository has been archived by the owner on Dec 7, 2023. It is now read-only.

Minimal CLI #1

Closed
wants to merge 229 commits into from

@cisaacstern (Member) commented Oct 5, 2021

A first draft of CLI functionality related to pangeo-forge/roadmap#37. Local install can be made by running poetry install from the repo root, following the recommendation of https://typer.tiangolo.com/tutorial/package/.

The top-level entry point is

➜ pangeo-forge --help
Usage: pangeo-forge [OPTIONS] COMMAND [ARGS]...

Options:
  --install-completion  Install completion for the current shell.
  --show-completion     Show completion for the current shell, to copy it or
                        customize the installation.
  --help                Show this message and exit.

Commands:
  bakery
  catalog
  lint     Lint a recipe

where bakery and catalog are typer sub-applications and lint is a top-level function. (Design inspired by pangeo-forge/pangeo-forge-recipes#69; specifics subject to change.)
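The PR builds this layout with typer. As a rough illustration of the same command tree, here is a minimal sketch using the standard library's argparse instead (a deliberate stand-in, not the PR's code). The option names are taken from the commands shown in this description; the function name build_parser and the defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the command layout from the help output using argparse.

    This is a stdlib stand-in for the typer application in the PR;
    the real implementation may differ.
    """
    parser = argparse.ArgumentParser(prog="pangeo-forge")
    commands = parser.add_subparsers(dest="command", required=True)

    # `lint` is a plain top-level command.
    commands.add_parser("lint", help="Lint a recipe")

    # `catalog` is a sub-application; no commands are shown for it in this draft.
    commands.add_parser("catalog")

    # `bakery` is a sub-application with its own nested commands.
    bakery = commands.add_parser("bakery")
    bakery_commands = bakery.add_subparsers(dest="bakery_command", required=True)
    ls = bakery_commands.add_parser(
        "ls", help="List available bakeries and associated build-logs."
    )
    ls.add_argument("--bakery-id", default=None)
    ls.add_argument("--view", default=None)
    ls.add_argument("--feedstock-name", default=None)
    return parser


args = build_parser().parse_args(
    ["bakery", "ls", "--bakery-id", "great_bakery", "--view", "build-logs"]
)
print(args.command, args.bakery_command, args.bakery_id, args.view)
```

Nesting a second set of subparsers under bakery plays the role that a typer sub-application plays in the PR: the top-level parser only dispatches, and each sub-application owns its own commands.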

Currently the bakery sub-application is the only functional component

➜ pangeo-forge bakery --help
Usage: pangeo-forge bakery [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  ls  List available bakeries and associated build-logs.

Calling ls from this sub-application without arguments returns the list of bakeries from pangeo-forge/bakery-database

➜ pangeo-forge bakery ls
['devseed.bakery.development.aws.us-west-2', 'devseed.bakery.development.azure.westeurope', 'great_bakery']

I've patched in great_bakery at runtime as a stand-in for the forthcoming pangeo-forge/pangeo-forge-gcs-bakery#19

Calling ls with one of the above-listed --bakery-ids returns details about that bakery

➜ pangeo-forge bakery ls --bakery-id 'great_bakery'
{
    'targets': {
        'osn': {
            'anon': True,
            'client_kwargs': {'endpoint_url': 'https://ncsa.osn.xsede.org'},
            'root_path': 's3://Pangeo/pangeo-forge'
        }
    }
}
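The two behaviours of ls described so far (no argument lists the bakery names, a --bakery-id returns that bakery's details) can be sketched as a plain lookup. This is a hedged illustration only: BAKERIES is a hypothetical in-memory stand-in for the bakery database, bakery_ls is an assumed name, and the two devseed entries are left empty because their details are not shown here.

```python
from typing import Optional, Union

# Hypothetical in-memory stand-in for pangeo-forge/bakery-database.
# Only `great_bakery` carries details, copied from the output shown in this PR;
# the devseed entries are empty placeholders.
BAKERIES = {
    "devseed.bakery.development.aws.us-west-2": {},
    "devseed.bakery.development.azure.westeurope": {},
    "great_bakery": {
        "targets": {
            "osn": {
                "anon": True,
                "client_kwargs": {"endpoint_url": "https://ncsa.osn.xsede.org"},
                "root_path": "s3://Pangeo/pangeo-forge",
            }
        }
    },
}


def bakery_ls(bakery_id: Optional[str] = None) -> Union[list, dict]:
    """Return all bakery names, or the details of one bakery if an id is given."""
    if bakery_id is None:
        return list(BAKERIES)
    if bakery_id not in BAKERIES:
        raise KeyError(f"Unknown bakery id: {bakery_id!r}")
    return BAKERIES[bakery_id]


print(bakery_ls())
print(bakery_ls("great_bakery")["targets"]["osn"]["root_path"])
```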

Further appending --view build-logs to this command fetches the build-logs.json file from the bakery's storage target and renders it as a table

As conceived here, build-logs.json is a JSON object which lives at the root path of a bakery's storage bucket and records metadata about each dataset built to that bucket. A further ADR and community discussion will be needed to decide whether to include such an object in Pangeo Forge and, if so, to settle its specification and implementation details. Briefly, the purpose of the object would be to tie the provenance of datasets in a given storage location back to their corresponding feedstocks and recipes. Among other things, this aids cataloging, insofar as the cataloging tooling (which ideally can generate catalog items by crawling storage buckets) will need to be aware of information in the feedstocks (i.e., their meta.yaml files).
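To make the provenance idea concrete, here is a minimal sketch of what such an object might contain and how tooling could read it. The schema is an assumption: the PR leaves the specification open, so the keys (run ID mapped to timestamp, feedstock, recipe) are only inferred from the table columns rendered by the CLI, and the names BUILD_LOGS_JSON and provenance are hypothetical.

```python
import json

# Hypothetical build-logs.json content. The layout is an assumption inferred
# from the table columns (Run ID, Timestamp, Feedstock, Recipe); the feedstock
# and recipe names are the imaginary ones used in this PR.
BUILD_LOGS_JSON = """
{
  "00006": {
    "timestamp": "2021-10-01 00:00:00",
    "feedstock": "soda342-feedstock@0.1",
    "recipe": "recipe"
  },
  "00003": {
    "timestamp": "2021-09-28 00:00:00",
    "feedstock": "gigatl-feedstock@0.1",
    "recipe": "recipes:surf-aso"
  }
}
"""

build_logs = json.loads(BUILD_LOGS_JSON)


def provenance(run_id: str) -> tuple:
    """Map a run ID back to the (feedstock, recipe) pair that produced it."""
    entry = build_logs[run_id]
    return entry["feedstock"], entry["recipe"]


print(provenance("00003"))
```

Keying the object by run ID keeps each build's record independent, so cataloging tooling could look up a dataset's feedstock without scanning the rest of the bucket.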

pangeo-forge bakery ls --bakery-id 'great_bakery' --view build-logs
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Run ID ┃ Timestamp           ┃ Feedstock                ┃ Recipe           ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ 00006  │ 2021-10-01 00:00:00 │ soda342-feedstock@0.1    │ recipe           │
│ 00005  │ 2021-09-30 00:00:00 │ orca36-feedstock@0.1     │ recipes:surf-fma │
│ 00004  │ 2021-09-29 00:00:00 │ hycom50-feedstock@0.1    │ recipes:int-aso  │
│ 00003  │ 2021-09-28 00:00:00 │ gigatl-feedstock@0.1     │ recipes:surf-aso │
│ 00002  │ 2021-09-27 00:00:00 │ fesom-feedstock@0.1      │ recipes:surf-fma │
│ 00001  │ 2021-09-26 00:00:00 │ enatl60-feedstock@0.1    │ recipes:int-aso  │
│ 00000  │ 2021-09-25 00:00:00 │ noaa-oisst-feedstock@0.1 │ recipe           │
└────────┴─────────────────────┴──────────────────────────┴──────────────────┘

s3://Pangeo/pangeo-forge/build-logs.json is a real object I put on OSN for illustration/testing purposes, which is actually loaded at runtime, but feedstock and recipe names here are imaginary.

Finally, the logs can be filtered based on a specific feedstock name (or substring thereof)

pangeo-forge bakery ls --bakery-id 'great_bakery' --view build-logs --feedstock-name gigatl
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃ Run ID ┃ Timestamp           ┃ Feedstock                ┃ Recipe           ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ 00003  │ 2021-09-28 00:00:00 │ gigatl-feedstock@0.1     │ recipes:surf-aso │
└────────┴─────────────────────┴──────────────────────────┴──────────────────┘
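The substring filter behind --feedstock-name can be sketched as a dictionary comprehension. This is an illustration under assumptions, not the PR's code: the record layout (run ID mapped to a dict with a "feedstock" key) and the function name filter_by_feedstock are hypothetical, and the sample rows reuse the imaginary names from the table.

```python
# Hypothetical in-memory build logs; the layout is an assumption and the
# feedstock and recipe names are the imaginary ones from the table.
LOGS = {
    "00004": {"feedstock": "hycom50-feedstock@0.1", "recipe": "recipes:int-aso"},
    "00003": {"feedstock": "gigatl-feedstock@0.1", "recipe": "recipes:surf-aso"},
    "00002": {"feedstock": "fesom-feedstock@0.1", "recipe": "recipes:surf-fma"},
}


def filter_by_feedstock(logs: dict, feedstock_name: str) -> dict:
    """Keep only runs whose feedstock contains `feedstock_name` as a substring."""
    return {
        run_id: entry
        for run_id, entry in logs.items()
        if feedstock_name in entry["feedstock"]
    }


print(filter_by_feedstock(LOGS, "gigatl"))
```

Because the match is a plain substring test, a short fragment such as gigatl selects gigatl-feedstock@0.1 without requiring the version suffix.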

Update dependencies for pangeo_forge refactor.
Implement adaptive scaling for managing Dask cluster workers.
Remove stray worker count setting.
Update dependencies to use release of pangeo-forge-recipes.
@cisaacstern cisaacstern marked this pull request as draft November 9, 2021 23:56
@cisaacstern (Member, Author) commented:

(Converted this to draft; will re-mark as Ready for review once I've incorporated all comments from first review cycle.)

@cisaacstern (Member, Author) commented:

Thanks to all for your feedback on this. Closing because the design direction has been superseded by #6. The effort here (both in code and reviews) is not lost: I'll certainly be referring back to it extensively in the coming weeks.

@cisaacstern cisaacstern closed this Dec 6, 2021
@cisaacstern cisaacstern deleted the minimal-cli branch July 25, 2022 00:37
5 participants