Refactor catalog write #587

Merged

Conversation

wrongkindofdoctor
Collaborator

@wrongkindofdoctor commented Jun 7, 2024

Description

  • Refactor write_pp_catalog to populate the output catalog columns with information from the input catalog and the case varlist attributes
  • Remove deprecated classes and methods from the data_sources and catalog modules
  • Add realm back to the catalog query
  • Fix the temp_dir_root definition in the filesystem module
    Associated issue # (replace this phrase and parentheses with the issue number)
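
For readers unfamiliar with the change, here is a minimal sketch of the idea behind the write_pp_catalog refactor: each output catalog row is filled from the ingested catalog entry and the variable's attributes, rather than being parsed out of the output file name. It is not the framework's implementation. The column list, the `build_catalog_row` helper, the example paths, and the rule that varlist attributes take precedence over the input catalog are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical column subset; the real output catalog schema may differ.
CATALOG_COLUMNS = ["variable_id", "realm", "frequency", "chunk_freq", "path"]


def build_catalog_row(
    input_entry: dict[str, Any],
    var_attrs: dict[str, Any],
    out_path: str,
) -> dict[str, Any]:
    """Build one output catalog row from known metadata.

    Assumption: a value set on the variable (varlist attribute) wins;
    otherwise the value from the input catalog entry is carried over.
    Nothing is inferred from the output file name.
    """
    row: dict[str, Any] = {}
    for col in CATALOG_COLUMNS:
        value = var_attrs.get(col)
        if value is None:
            # Fall back to the ingested catalog; empty string if absent.
            value = input_entry.get(col, "")
        row[col] = value
    # The path always points at the newly written preprocessed file.
    row["path"] = out_path
    return row


# Example usage with made-up metadata.
input_entry = {
    "variable_id": "tas",
    "realm": "atmos",
    "frequency": "mon",
    "chunk_freq": "5yr",
    "path": "/in/tas.nc",
}
var_attrs = {"variable_id": "tas", "frequency": "mon"}
row = build_catalog_row(input_entry, var_attrs, "/out/tas.nc")
```

In this sketch, realm and chunk_freq are carried over from the input catalog entry because the variable attributes do not define them, which is the kind of information a file-name parser could easily get wrong.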

How Has This Been Tested?
Please describe the tests that you ran to verify your changes in enough detail that
someone can reproduce them. Include any relevant details for your test configuration
such as the Python version, package versions, expected POD wallclock time, and the
operating system(s) you ran your tests on.

Checklist:

  • My branch is up-to-date with the NOAA-GFDL main branch, and all merge conflicts are resolved
  • The scripts are written in Python 3.11 or above (preferred; required if funded by a CPO grant), NCL, or R
  • All of my scripts are in the diagnostics/[POD short name] subdirectory, and include a main_driver script, template html, and settings.jsonc file
  • I have made corresponding changes to the documentation in the POD's doc/ subdirectory
  • I have requested that the framework developers add packages required by my POD to the python3, NCL, or R environment yaml file if necessary, and my environment builds with conda_env_setup.sh
  • I have added any necessary data to input_data/obs_data/[pod short name] and/or input_data/model/[pod short name]
  • My code is portable; it uses MDTF environment variables, and does not contain hard-coded file or directory paths
  • I have provided the code to generate digested data files from raw data files
  • Each digested data file generated by the script contains numerical data (no figures), and is 3 GB or less in size
  • I have included copies of the figures generated by the POD in the pull request
  • The repository contains no extra test scripts or data files

remove deprecated data source classes and functions from data_sources module
add chunk_freq and path to catalog_define_pp_catalog_assets columns
…is used instead of realm

refactor the write_pp_catalog method to populate output catalog information from ingested catalog and varlist attributes instead of inferring information from output file name
@wrongkindofdoctor added the framework (Issue pertains to the framework code) label Jun 7, 2024
@wrongkindofdoctor self-assigned this Jun 7, 2024
…rmation from prior

variable query in preprocessor
@wrongkindofdoctor merged commit 1f53248 into NOAA-GFDL:main Jun 7, 2024
2 of 4 checks passed
@wrongkindofdoctor deleted the refactor_catalog_write branch June 7, 2024 19:32
@wrongkindofdoctor linked an issue Jun 10, 2024 that may be closed by this pull request
Labels
framework Issue pertains to the framework code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Catalog consistency MDTF and user data catalog
1 participant