Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Allow custom CF url (-d) #1066

Closed
wants to merge 8 commits into from

Conversation

jcermauwedu
Copy link

Allow for a custom CF url for checking standards. Allows for checking files with a custom CF xml file with proposed changes to ensure operability prior to adoption.

Example of use:

$ compliance-checker -t gliderdac:3.0 -O gliderdac:no_ancillary_variables -d https://acoustics.fish.washington.edu/research/cf/cf-standard-name-table.xml gutils/rt/netcdf/unit_507_1712092237_20240402T211037Z_rt.nc
Downloading cf-standard-names table version "url specified" from: https://acoustics.fish.washington.edu/research/cf/cf-standard-name-table.xml
Running Compliance Checker on the datasets from: ['gutils/rt/netcdf/unit_507_1712092237_20240402T211037Z_rt.nc']


--------------------------------------------------------------------------------
                         IOOS Compliance Checker Report                         
                     Version 5.1.2.dev3+gc17f5f6.d20240501                      
                     Report generated 2024-05-01T06:03:43Z                      
                                 gliderdac:3.0                                  
      https://ioos.github.io/glider-dac/ngdac-netcdf-file-format-version-2      
--------------------------------------------------------------------------------
All tests passed!

This comes with a bug to track down: the program needs to be run twice. Once to download the custom XML and a second time to actually use it. So, there is actually a bug when requesting a different version of the CF standard names as well.

@jcermauwedu
Copy link
Author

Will go back and tackle the lint errors.

Copy link

codecov bot commented May 1, 2024

Codecov Report

Attention: Patch coverage is 30.00000% with 7 lines in your changes are missing coverage. Please review.

Project coverage is 81.99%. Comparing base (064aac9) to head (9069b06).
Report is 20 commits behind head on develop.

Files Patch % Lines
compliance_checker/cf/util.py 25.00% 5 Missing and 1 partial ⚠️
compliance_checker/runner.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1066      +/-   ##
===========================================
- Coverage    82.18%   81.99%   -0.20%     
===========================================
  Files           24       24              
  Lines         5171     5181      +10     
  Branches      1242     1248       +6     
===========================================
- Hits          4250     4248       -2     
- Misses         622      633      +11     
- Partials       299      300       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

jr3cermak added 6 commits May 12, 2024 18:04
 * Convert print statements to f-strings
 * add-trailing-commas fix for cf/util.py
 * adjust requirement.txt files to conform to plugin format
   * remove pinning; works with python 3.11
 * Update pre-commit-config with requirements files
 * Fix sorting on requirements.txt
 * Add pre-commit for local testing first; add to requirements-dev.txt
@jcermauwedu
Copy link
Author

We will look into adding some more tests to satisfy codecov. In the meantime, the pre-commit.ci failure is in conflict with a local run of pre-commit.

Result from pre-commit.ci:
PreCommit

Local run:
localPreCommit

Will try: pre-commit.ci autofix to see what that does.

@jcermauwedu
Copy link
Author

pre-commit.ci autofix

@ocefpaf
Copy link
Member

ocefpaf commented May 14, 2024

@jcermauwedu I would ignore pre-commit-ci for now. It seems to be a bit wacky. I'll try to update/config it better in the near future. Let's focus on your change to the code base here so we can get this reviewed and merged ASAP.

@@ -19,7 +19,7 @@ jobs:
create-args: >-
python=3 pip
--file requirements.txt
--file test_requirements.txt
--file requirements-test.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes like this make it hard to follow the actual changes in the PR. While I personally prefer the requirements-dev.txt we are stuck with the original file name here and no need to change it now.

PS: I'd welcome a change like this in its own PR, away from important code changes that require careful review.

regex
requests
shapely
validators
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to remove the lower pinned version? Most, if not all of those, if they make into an environment, will likely break compliance-checker. Upper pins are bad IMO but lower are fine, they not only ensure a stable dev env but also increase solver speed for most package managers.

url = f"http://cfconventions.org/Data/cf-standard-names/{version}/src/cf-standard-name-table.xml"
if version.startswith('http'):
url = version
version = '"url specified"'
Copy link
Member

@ocefpaf ocefpaf May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could say "custom URL"? Although, if this variable is not used anywhere we can probably remove it here.

@@ -289,7 +289,13 @@ def download_cf_standard_name_table(version, location=None):
if version == "latest":
url = "http://cfconventions.org/Data/cf-standard-names/current/src/cf-standard-name-table.xml"
else:
url = f"http://cfconventions.org/Data/cf-standard-names/{version}/src/cf-standard-name-table.xml"
if version.startswith('http'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcermauwedu it seems that your core changes are in this commit, right? This is an easy merge IMO if you don't mind submitting just that in this PR and the rest in a linting one.

@jcermauwedu
Copy link
Author

Thank you for the feedback.

@ocefpaf
Copy link
Member

ocefpaf commented May 14, 2024

@jcermauwedu would it make sense to also allow for local xml files? I don't have a use case for that but maybe we should not restrict it here to files on the web.

@jcermauwedu
Copy link
Author

@jcermauwedu would it make sense to also allow for local xml files? I don't have a use case for that but maybe we should not restrict it here to files on the web.

This makes sense too. There is a lot of chicken and egg things going on here. The checker is initialized before the arguments are parsed. When the checker is initialized it has already read the standard names file. Then the arguments are processed and updates the file on the disk, but the test will use what has already been read in memory. IE: We have to find a way send a signal to the tests that the user desired a different standard name table. Or defer loading of the standard table in the class right up until the first test is run. Somewhere...

I was starting to implement a --standard-names-url option. I can add a --standard-names-file option for a local filesystem file. I can create a use case for both and tests.

I have reverted the other items and will put in a separate PR if needed. Pinning packages on the minimum side makes sense. I am also learning a bit about python coverage. This PR might extend into the sprint.

@ocefpaf
Copy link
Member

ocefpaf commented May 14, 2024

There is a lot of chicken and egg things going on here. The checker is initialized before the arguments are parsed.

That is a bug IMO. If we can tackle that during the sprint, it is a success already. (Although we have a lot in our plates for this sprint already ;-p)

@jcermauwedu
Copy link
Author

Turns out this feature already exists via an environment variable. We will revise the PR to provide additional documentation to highlight the feature. The code could be improved to see if there is an http or https prefix to attempt a remote url fetch.

$ export CF_STANDARD_NAME_TABLE=cf-standard-name-table.xml
$ compliance-checker -t gliderdac:3.0 -O gliderdac:no_ancillary_variables unit_507_1715633718_20240513T205518Z_rt.nc
Running Compliance Checker on the datasets from: ['unit_507_1715633718_20240513T205518Z_rt.nc']


--------------------------------------------------------------------------------
                         IOOS Compliance Checker Report                         
                     Version 5.1.2.dev9+g9069b06.d20240516                      
                     Report generated 2024-05-16T01:31:13Z                      
                                 gliderdac:3.0                                  
      https://ioos.github.io/glider-dac/ngdac-netcdf-file-format-version-2      
--------------------------------------------------------------------------------
All tests passed!

@benjwadams
Copy link
Contributor

Do we want to close this as not planned in favor of documentation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants