CLI for preparing / validating recipes #13

Open
rabernat opened this issue Jul 13, 2022 · 3 comments

@rabernat

User Profile

As a recipe contributor

User Action

I want to test my recipe on the command line before submitting a pull request

User Goal

so that I can avoid a slow debugging cycle with the Pangeo Forge bot on GitHub. (Example: pangeo-forge/staged-recipes#150)

Acceptance Criteria

I run a command like

$ pangeo-forge recipe validate recipe_folder/

and see output like

It looks like your meta.yaml does not conform to the specification.

    1 validation error for MetaYaml
    pangeo_notebook_version
      field required (type=value_error.missing)

or

When I tried to import your recipe module, I encountered this error

    line 17, in <module>
        fs.ls(url_base + str(year), detail=False)
    NameError: name 'fs' is not defined

Please correct your recipe module so that it's importable.
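
The two checks described above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the pangeo-forge implementation: the function names `check_meta` and `check_importable` are hypothetical, `REQUIRED_META_FIELDS` is an assumed subset of the keys the meta.yaml specification requires, and the meta.yaml content is assumed to have already been parsed into a dict.

```python
import importlib.util
import traceback
from pathlib import Path
from typing import List, Optional

# Assumption: an illustrative subset of required meta.yaml keys.
# The real specification defines the authoritative list.
REQUIRED_META_FIELDS = ("title", "pangeo_notebook_version", "recipes")


def check_meta(meta: dict) -> List[str]:
    """Return one message per required key missing from the parsed meta.yaml."""
    return [
        f"{name}: field required"
        for name in REQUIRED_META_FIELDS
        if name not in meta
    ]


def check_importable(module_path: str) -> Optional[str]:
    """Try to import a recipe module from a file path.

    Returns None on success, or the formatted traceback if the import fails.
    """
    path = Path(module_path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        return f"Cannot load {path} as a Python module"
    module = importlib.util.module_from_spec(spec)
    try:
        # Executing the module body is what surfaces errors such as NameError.
        spec.loader.exec_module(module)
    except Exception:
        return traceback.format_exc()
    return None
```

A `validate` command could call both helpers on the recipe folder and print any returned messages, exiting with a non-zero status if either check fails.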

Linked Issues

No response

@derekocallaghan

Hi @rabernat, as discussed in the Oct 24 Pangeo Forge meeting, here are a few thoughts about what might be required for implementation. These are based on recent local debugging of staged recipes. Although the Pangeo Forge documentation recommends locally running a pruned version of a recipe, this won't always detect common staged recipe issues e.g.

  • As mentioned above:
    • meta.yaml issues
    • Recipe import issues
  • Issues encountered when running a recipe with Beam:
    • Hash serialization
    • Pickler serialization
    • etc

As pangeo-forge-runner provides a CLI for feedstock loading and recipe baking, involving recipe loading, running with Beam etc., it might be useful to extend it to add a validate command such as the one you mentioned above. From looking through the code, the following changes/steps might be a starting point:

{'Bake': {'bakery_class': 'pangeo_forge_runner.bakery.local.LocalDirectBakery',
  'recipe_id': 'eooffshore_ics_ccmp_v02_1_nrt_wind',
  'feedstock_subdir': 'recipes/eooffshore_ics_ccmp_v02_1_nrt_wind',
  'repo': 'https://github.com/eooffshore/staged-recipes',
  'ref': '663f30c95c406b9efe012b9bae66fa1f386b539b',
  'job_name': 'CCMP',
  'prune': True},
 'LocalDirectBakery': {'num_workers': 1},
 'TargetStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
  'root_path': './ccmp.zarr'},
 'InputCacheStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
  'root_path': './input-cache/'},
 'MetadataCacheStorage': {'fsspec_class': 'fsspec.implementations.local.LocalFileSystem',
  'root_path': './metadata-cache/'}}
  • A config similar to the above would then be used to run a pruned Bake, e.g. Bake(config=Config(bconfig)).start(). If this successfully ran to completion, it would ensure that the meta.yaml has been successfully loaded, the recipe(s) imported by Feedstock, executed with local Beam ("multi_processing" mode) etc.
  • A step would be added to load the resulting output e.g. Zarr store (I was doing this manually and performing some simple checks e.g. printing the time dimension)
  • In the example config above, a Git repo has been specified. The Validate command would instead use the specified local recipe path (containing feedstock_subdir). However, it looks like local repos/recipes aren't currently supported by Bake, so this would need to be implemented. I assume it's related to the use of contentproviders.Local in pangeo_forge_runner.commands.base.BaseCommand. I'd originally tried this locally, but it failed with an error I can't recall. Supporting local recipe paths seems useful, as it would enable validation prior to repo commits.
  • Test cases would be created for the Validate implementation.
  • The Pangeo Forge documentation would be updated to document use of the pangeo_forge_runner validate... command.
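
The steps above could be sketched roughly as follows, assuming the config shape shown earlier. The helper names `build_validate_config`, `run_pruned_bake` and `check_output` are hypothetical, as is the directory layout under `work_dir`. The `Bake(config=Config(bconfig)).start()` call is the one quoted above; the import path `pangeo_forge_runner.commands.bake` and the use of `xarray.open_zarr` for the output check are assumptions, and the actual location of the output store depends on the recipe and runner version.

```python
def build_validate_config(feedstock_subdir, repo, ref, job_name, work_dir="."):
    """Build a config dict for a pruned, single-worker local bake."""
    local_fs = "fsspec.implementations.local.LocalFileSystem"
    return {
        "Bake": {
            "bakery_class": "pangeo_forge_runner.bakery.local.LocalDirectBakery",
            "feedstock_subdir": feedstock_subdir,
            "repo": repo,
            "ref": ref,
            "job_name": job_name,
            "prune": True,  # only process a small subset of the inputs
        },
        "LocalDirectBakery": {"num_workers": 1},
        # Hypothetical directory names under work_dir.
        "TargetStorage": {"fsspec_class": local_fs, "root_path": f"{work_dir}/target"},
        "InputCacheStorage": {"fsspec_class": local_fs, "root_path": f"{work_dir}/input-cache"},
        "MetadataCacheStorage": {"fsspec_class": local_fs, "root_path": f"{work_dir}/metadata-cache"},
    }


def run_pruned_bake(bconfig):
    """Run the pruned bake; any meta.yaml, import or Beam error propagates."""
    # Imported lazily so the config helper works without these packages.
    from traitlets.config import Config
    from pangeo_forge_runner.commands.bake import Bake  # assumed import path

    Bake(config=Config(bconfig)).start()


def check_output(zarr_path, dim="time"):
    """Open the resulting Zarr store and return the length of one dimension."""
    import xarray as xr

    ds = xr.open_zarr(zarr_path)
    if dim not in ds.sizes:
        raise ValueError(f"Dimension {dim!r} not found in {zarr_path}")
    return ds.sizes[dim]
```

A Validate command might call `run_pruned_bake(build_validate_config(...))` and then `check_output(...)` on the store it produced, reporting the first failure to the user.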

I should be free for next week's Pangeo Forge meeting call if you want to discuss the above suggestions.

@yuvipanda

+1 for moving the tutorial to use pangeo-forge-runner to test things out vs what is there now. The runner project is much newer than recipes, so a lot of the split here is just 'where things are' rather than 'how it ought to be'

@derekocallaghan

> +1 for moving the tutorial to use pangeo-forge-runner to test things out vs what is there now. The runner project is much newer than recipes, so a lot of the split here is just 'where things are' rather than 'how it ought to be'

If you like, I can take a look at adding the Validate command I suggested above to pangeo-forge-runner. I've largely been doing this locally already with calls to Bake()...start().
