What input data should be used for "Bare Minimum" tests? #2701

Open
Daisy0419 opened this issue Jan 24, 2025 · 6 comments
Assignees
Labels
category: Question Further information is requested topic: GCHP Related to GCHP only topic: Input Data Related to input data

Comments

@Daisy0419

Your name

Daisy Wang

Your affiliation

Washington University in St. Louis

Please provide a clear and concise description of your question or discussion topic.

Hi, your user guide (https://gchp.readthedocs.io/en/stable/getting-started/requirements.html) mentions that GCHP can be run with the following Bare Minimum Requirements:

  • 6 cores
  • 32 GB of memory
  • 100 GB of storage for input and output data

Running GCHP on one node with as few as six cores is possible but we recommend this only for testing short low-resolution runs such as running GCHP for the first time and for debugging. These bare minimum requirements are sufficient for running GCHP at C24. Please note that we recommend running at C90 or greater for scientific applications.

Could you please indicate which input data (including MET, CHEM, and HCO) should be used for testing in this scenario? I have not been able to figure it out.

@Daisy0419 Daisy0419 added the category: Question Further information is requested label Jan 24, 2025
@yantosca yantosca added topic: GCHP Related to GCHP only topic: Input Data Related to input data labels Jan 27, 2025
@yantosca
Contributor

Tagging @lizziel

@lizziel
Contributor

lizziel commented Jan 27, 2025

Hi @Daisy0419, the data inputs are the same whether you do a bare minimum run on 6 cores versus a multi-node run with hundreds of cores. If you want to make your run as lightweight as possible I recommend running at low resolution, e.g. c24.

Are you part of Randall Martin's group at WashU?

@lizziel lizziel self-assigned this Jan 27, 2025
@Daisy0419
Author

Hi @lizziel, thank you so much for your prompt response.

Yes, I am currently working on a project with Randall Martin's group. I am looking to run GCHP on a local machine (on which I have root access) so that I can do more detailed profiling. However, I am not sure what minimum set of data should be downloaded. When I use bashdatacatalog to download data, even when I choose the lowest resolution and only one day, it still downloads hundreds of gigabytes of data. In particular, HCO data seems to be downloaded for a wide time range regardless of the requested time range. Moreover, the data downloaded through bashdatacatalog seems to be incomplete and did not work.

Could you please provide guidance on which data I should download if I want to run on a local machine?

Thank you very much.

@msulprizio
Contributor

Hi @Daisy0419. Are you aware of the ACAG documentation? That will point you to the data already accessible by Randall's group on Compute1.

@Daisy0419
Author

Daisy0419 commented Jan 28, 2025

Hi @msulprizio, Compute1 is our main computing resource. However, since it is a cluster, we don't have root access and can't profile the code at the hardware level. This is why I want to run on a local machine. Thanks :)

@lizziel
Contributor

lizziel commented Jan 28, 2025

Hi @Daisy0419, it is possible there is an issue with bashdatacatalog. Could you give more information, including what date range you are downloading for, what data is downloaded, and what data is missing?
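
One way to gather the requested details about what was downloaded is to summarize the local data directory by top-level folder. The following is a minimal sketch, not part of bashdatacatalog or GCHP. The function names are hypothetical, and the folder names used in the usage note are only illustrative assumptions about how a local data directory might be laid out.

```python
import os
import sys
from collections import defaultdict


def summarize_data_dir(root):
    """Return {top-level subdirectory: (file_count, total_bytes)} under root.

    Files directly inside root are reported under the key ".".
    """
    summary = defaultdict(lambda: [0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip broken symlinks or files that vanished mid-walk.
                continue
            summary[top][0] += 1
            summary[top][1] += size
    return {key: (count, size) for key, (count, size) in summary.items()}


def format_summary(summary):
    """Format the summary as text lines, largest folder first."""
    lines = []
    ordered = sorted(summary.items(), key=lambda item: item[1][1], reverse=True)
    for top, (count, size) in ordered:
        lines.append(f"{top}: {count} files, {size / 1e9:.3f} GB")
    return lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_data.py /path/to/local/data/root
    for line in format_summary(summarize_data_dir(sys.argv[1])):
        print(line)
```

Running this against the local data root (for example, a directory containing folders such as `HEMCO` or `CHEM_INPUTS`, if that is how the download is organized) gives a per-folder file count and size that can be pasted into the issue alongside the requested date range.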

4 participants