QA script for irradiance data for TA2/3 #756

Open
wants to merge 14 commits into
base: master

Conversation

wholmgren
Member

This script applies extra quality control to the data used in the TA2/3 evaluations. It downloads the data, processes it, and posts it back to the Arbiter. Posting to the "official" observations requires the reference account; when run as a regular user, the script creates the observations in that user's organization instead.

@cwhanse wrote the original data processing scripts for each site. I just refactored them into this script.

Not sure where to put this, but I want to finally make the script visible in some way or another.

python irradiance_qa.py --help

  • Closes #xxxx .
  • I am familiar with the contributing guidelines.
  • Tests added.
  • Updates entries to docs/source/api.rst for API changes.
  • Adds descriptions to appropriate "what's new" file in docs/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

Comment on lines +590 to +597
if not validated:
    logger.info('validation appears hung. reposting data')
    session.post_observation_values(o.observation_id, values)
    validated = wait_for_validation(session, o, values)
if not validated:
    logger.info('validation appears hung. reposting data')
    session.post_observation_values(o.observation_id, values)
    validated = wait_for_validation(session, o, values)
Member

Do you want to do this twice? Seems like wait_for_validation does some amount of retrying?

Member Author

wait_for_validation retries fetching the data from the API, which helps with a temporary hang or a long queue. But that doesn't help if the validation job hangs indefinitely, so this re-posts the data. Not sure how many times I should try it.
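
One way to avoid copying the block once per attempt is a small bounded loop. The following is only a sketch, not the code in this PR: post_until_validated and max_reposts are hypothetical names, and post and wait are placeholder callables standing in for session.post_observation_values(...) and wait_for_validation(...).

```python
import logging

logger = logging.getLogger(__name__)


def post_until_validated(post, wait, max_reposts=2):
    """Post once, then repost up to ``max_reposts`` times if validation stalls.

    ``post`` and ``wait`` are hypothetical zero-argument callables that wrap
    the real posting and validation-polling calls. Returns True as soon as
    ``wait()`` reports success, and False if every attempt fails.
    """
    post()
    for attempt in range(max_reposts + 1):
        if wait():
            return True
        # Only repost if there is another wait() coming to check the result.
        if attempt < max_reposts:
            logger.info('validation appears hung. reposting data (%d of %d)',
                        attempt + 1, max_reposts)
            post()
    return False
```

With the default max_reposts=2 this makes the same sequence of calls as the duplicated block (one initial post, then at most two reposts), assuming the initial post is followed by one wait_for_validation call. The callables could be lambdas closing over session, o, and values, which keeps the retry count a single tunable number.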

variable = o.variable.lower()
_data_to_post = data_to_post[site_name][variable]
# Split into chunks to stay under API upload limit.
grouped_data = _data_to_post.groupby(lambda x: (x.year, x.month))
Member

You might even split this into weeks, just so that the validation jobs are quick?
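
A rough sketch of what a weekly grouping key could look like, using only the standard library. iso_week_key and chunk_by_week are hypothetical helpers, not part of the PR, and the sample dates are made up for illustration. The assumption is that the index values are datetime-like objects with an isocalendar() method, in which case iso_week_key could presumably be passed to groupby in place of the (x.year, x.month) lambda.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def iso_week_key(timestamp):
    """Return (ISO year, ISO week number) for a datetime-like object."""
    iso = timestamp.isocalendar()
    return (iso[0], iso[1])


def chunk_by_week(timestamps):
    """Group timestamps into lists keyed by ISO week, preserving input order."""
    chunks = defaultdict(list)
    for ts in timestamps:
        chunks[iso_week_key(ts)].append(ts)
    return dict(chunks)


# Hypothetical example data: one point per day across a month boundary.
start = datetime(2021, 1, 28)
days = [start + timedelta(days=i) for i in range(10)]
weekly = chunk_by_week(days)

for key, members in weekly.items():
    print(key, len(members))
```

The key uses the ISO year rather than the calendar year because ISO weeks can straddle New Year: 1 January 2021, for example, belongs to week 53 of ISO year 2020, so pairing x.year with the ISO week number would mislabel those days.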

2 participants