-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement URL Validation for Improved Data Access #47
base: main
Are you sure you want to change the base?
Conversation
Please post a few results from tests demonstrating the new function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR!
Please
- correct the accidental deletions.
- add the string to the test readme to obtain the test output indicated (also, please put this same information in the PR thread to facilitate review.)
- consider including a standard unit test for the validation script.
Thanks!
@karnesh Please watch for this updated PR.
nwmurl/urlgennwm.py
Outdated
@@ -150,75 +189,15 @@ def select_lead_time(lead_time=None, default=None): | |||
9: "https://ciroh-nwm-zarr-copy.s3.amazonaws.com/national-water-model/", | |||
} | |||
|
|||
retrospective_var_types = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These deletions look accidental.
@@ -0,0 +1,25 @@ | |||
Here's an example of the test case output: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the command line string to obtain this test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating the PR! @manjirigunaji
- Please clean up the code and remove same functions in urlgennwm.py and validation_util.py.
- Please post few results demonstrating the new function.
- It would be better to have a unit test instead of hard coded test - this can be either done now or in an another PR.
Thanks!
nwmurl/urlgennwm.py
Outdated
valid_file_list = [gevent.spawn(check_url_part, file_name) for file_name in file_list] | ||
gevent.joinall(valid_file_list) | ||
return [file.get() for file in valid_file_list if file.get() is not None] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is also in validation_util.py and imported here as well. Can you please clean up the code and have the function at one place?
nwmurl/urlgennwm.py
Outdated
t.set_description(f"Not Found: {filename}") | ||
t.update(1) | ||
t.refresh() | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to check_valid_urls function, can you please clean up the code and have the function at one place?
generate_urls(start_date, end_date, fcst_cycle, lead_time, varinput, geoinput, runinput, urlbaseinput, meminput) | ||
# Example usage | ||
file_list = create_file_list(runinput, varinput, geoinput, meminput, start_date, end_date, fcst_cycle, urlbaseinput, lead_time) | ||
valid_files = check_valid_urls(file_list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please post about the hard-coded test in the main PR message with some kind of ### Test heading?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating the PR @manjirigunaji. Please change the hard coded test to a unit test in another PR. @sepehrkrz Please feel free to merge the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good and this PR can be merged.
This pull request introduces a new module, validation_util.py, to handle URL validation within the National Water Model (NWM) utility scripts. The validation_util module includes a check_valid_urls function that validates URLs and filters out any invalid ones, reducing dependencies on tqdm. Additionally, URL validation functionality is to enhance the reliability of accessing data files in the project.
Changes Made:
Tested various scenarios including valid URLs, invalid URLs, and edge cases to verify the robustness of the validation.