Missing MERRA2 data #2700

Open
nicholasbalasus opened this issue Jan 24, 2025 · 5 comments
Labels
category: Bug (Something isn't working), topic: Input Data (Related to input data)

Comments

@nicholasbalasus
Contributor

Your name

Nicholas Balasus

Your affiliation

Harvard University

What happened? What did you expect to happen?

The 0.5° x 0.625° MERRA-2 data is missing for 2018 and 2019 on AWS/WashU.

https://geos-chem.s3.amazonaws.com/index.html#GEOS_0.5x0.625/MERRA2/
http://geoschemdata.wustl.edu/ExtData/GEOS_0.5x0.625/MERRA2/
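For anyone with a local mirror of the data, a gap like this can be confirmed with a small check. This is only a sketch: it assumes one subdirectory per year under the MERRA2 folder, and the function name `missing_years` and the example path are hypothetical.

```shell
#!/bin/bash
# Print every year in [first, last] that has no directory under root.
# Assumes a layout of <root>/<YYYY>/...; adjust if your mirror differs.
missing_years() {
  local root=$1 first=$2 last=$3 year
  for year in $(seq "$first" "$last"); do
    [ -d "$root/$year" ] || echo "$year"
  done
}

# Example call (hypothetical path):
# missing_years /path/to/ExtData/GEOS_0.5x0.625/MERRA2 2015 2020
```

The function only tests that each year directory exists; it does not check that the directory is complete.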


What are the steps to reproduce the bug?

n/a

Please attach any relevant configuration and log files.

No response

What GEOS-Chem version were you using?

n/a

What environment were you running GEOS-Chem on?

Other (please explain below)

What compiler and version were you using?

n/a

Will you be addressing this bug yourself?

No

In what configuration were you running GEOS-Chem?

Other (please explain in additional information section below)

What simulation were you running?

CH4

At what resolution were you running GEOS-Chem?

0.5 x 0.625

What meteorology fields did you use?

MERRA-2

Additional information

No response

@nicholasbalasus nicholasbalasus added the category: Bug Something isn't working label Jan 24, 2025
@msulprizio
Contributor

Thanks for reporting, Nick. I wonder if this data was accidentally deleted at WashU. I'm tagging @yuanjianz, @YanshunLi-washu, and @yuyao-cyber at WashU, who may be able to check.

In the meantime, we do have those files at Harvard and on the Harvard-maintained gcgrid bucket on AWS. See https://s3.amazonaws.com/gcgrid/index.html#GEOS_0.5x0.625/MERRA2/.

I may also be able to copy those files back to WashU via Globus, but there are some barriers with the new scratch system at Harvard, so it may take a few hours.

@msulprizio msulprizio added the topic: Input Data Related to input data label Jan 24, 2025
@msulprizio
Contributor

@yuanjianz @YanshunLi-washu @yuyao-cyber The files have now been transferred to WashU, but the permissions will need to be fixed. Because the Globus endpoint is owned by Randall, the files will be owned by him and read-only by default. Yidan had a process in place to fix the permissions and hopefully shared it with you all. Once the permissions are fixed, the files should automatically sync to https://geos-chem.s3.amazonaws.com/index.html#GEOS_0.5x0.625/MERRA2/ within a day.

@yuanjianz
Contributor

Thanks @msulprizio. @yidant, can you share with us the approach for fixing the permissions of the files transferred from the Globus endpoint?

@msulprizio
Contributor

@yuanjianz Yidan provided me with the instructions below a while back for changing permissions, but I was unable to follow them because I didn't have the correct group permissions on Compute1 (I just have a guest account).

My steps are:

  1. First, change the owner of the files from Randall's account to my own. Randall is the default owner, and I can't change the permissions while he owns the files. To do this, I just cp the directories, rm the originals, and rename the copies.
  2. Set file_dir to the target directory and run the script below. With change_permission set to 1, the script fixes the permissions. It also generates the checksum file for bashdatacatalog. replace is set to 0, so an existing checksum file is not overwritten.
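Step 1 above can be sketched as a small helper. This is an illustration, not Yidan's actual commands: the function name `take_ownership` and the temporary-copy naming are hypothetical, and it assumes you have write permission on the parent directory and enough free space for a second copy.

```shell
#!/bin/bash
# Replace a directory with a copy owned by the current user.
# cp creates new files owned by whoever runs it; the original
# (owned by someone else) is then removed and the copy renamed.
take_ownership() {
  local dir=${1%/}               # strip a trailing slash, if any
  local tmp="${dir}.copy.$$"     # hypothetical temporary name
  cp -r "$dir" "$tmp" || return 1
  rm -rf "$dir" || return 1
  mv "$tmp" "$dir"
}

# Example call (hypothetical path):
# take_ownership /path/to/ExtData/GEOS_0.5x0.625/MERRA2/2018
```

The original is removed only after the copy succeeds, so a failed copy leaves the source data untouched.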

bashdatacatalog.sh:

#!/bin/bash
# Fix permissions on a transferred data directory and create the
# .assets.md5 checksum file used by bashdatacatalog.

file_dir="/storage1/fs1/rvmartin/Active/GEOS-Chem-shared/ExtData/HEMCO/MASKS/v2024-08"
# file_dir="/storage1/fs1/rvmartin/Active/t.yidan/testdir"
replace=0    # 1 to regenerate the checksum file even if it exists, 0 otherwise
change_permission=1    # 1 to change file permissions, 0 otherwise

# Make the checksum file world-readable and, if requested, set
# directories to 755 and files to 644 under the current directory.
function permission() {
  local change_permission=$1

  chmod 644 .assets.md5
  if [ "$change_permission" = 1 ]; then
    chmod 755 "$file_dir"
    find . -type d -exec chmod 755 {} \;
    find . -type f -exec chmod 644 {} \;
    echo "Permission changed!"
  fi
}

# Stop if the target directory is not accessible.
cd "$file_dir" || exit 1
echo "Open $file_dir"

if [ ! -f .assets.md5 ] || [ "$replace" = 1 ]; then
  echo "Generating..."
  # One "<md5> *<path>" line per file, sorted by path.
  # -print0 / -0 keep file names with spaces intact.
  find . -type f -print0 | xargs -0 md5sum -b | sed 's#\([^ ][^ ]*\)  \(.*\)#\1 *\2#g' | sort --unique --key=2 --field-separator='*' > .assets.md5
  cat .assets.md5
  permission "$change_permission"
  echo ".assets.md5 created!"
else
  cat .assets.md5
  permission "$change_permission"
  echo ".assets.md5 already exists!"
fi
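
After the script has run, the checksum file can double as a check that the transferred files are intact. The sketch below is an addition, not part of Yidan's process, and the function name `verify_assets` is hypothetical. One assumption to note: the output redirect in the script can create .assets.md5 before find lists the directory, so the file may contain an entry for itself; that entry is skipped here.

```shell
#!/bin/bash
# Verify the files listed in <dir>/.assets.md5 with GNU md5sum.
# Returns non-zero if any listed file is missing or has changed.
verify_assets() {
  local dir=$1
  (
    cd "$dir" || exit 1
    # Drop the checksum file's own entry, if present, then check the rest.
    grep -v '\*\./\.assets\.md5$' .assets.md5 | md5sum --check --quiet -
  )
}

# Example call (hypothetical path):
# verify_assets /path/to/ExtData/GEOS_0.5x0.625/MERRA2/2018
```

With --quiet, md5sum prints only the files that fail, so a clean run produces no output.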

@yantosca
Contributor

> Thanks @msulprizio, @yidant can you share with us the approach to fix the permission for those files transferred from Globus endpoint?

Could this be set up as a cron job?
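
One possible shape for such a job, as a sketch only: a scheduled script could first list entries that are not world-readable and run the permission fix only when that list is non-empty. The function name `unreadable_entries`, the crontab schedule, and all paths below are hypothetical. The job would also need to run under an account that already owns the files, since chmod fails on files owned by someone else.

```shell
#!/bin/bash
# List files that are not world-readable and directories that are not
# world-readable and world-searchable under the given root.
unreadable_entries() {
  local root=$1
  find "$root" \( \( -type f ! -perm -o=r \) -o \( -type d ! -perm -o=rx \) \) -print
}

# Hypothetical crontab entry running a wrapper script nightly at 03:00:
# 0 3 * * * /path/to/fix_permissions.sh >> /path/to/fix_permissions.log 2>&1
```

Because file_dir is hard-coded in bashdatacatalog.sh, a scheduled version would probably need to take the target directory as an argument instead.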

4 participants