-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleaning discrete gw sw meas data #58
Cleaning discrete gw sw meas data #58
Conversation
currently have a merge conflict due to deletion of 4_export.R . This file no longer exists in this branch but exists in upstream main.
|
Managed via a strange work around in this PR #59. Pipeline runs. 4_export.R no longer in repo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think rather than creating long form targets, we need site-level targets that gather site info like location, streamorder, data type, lake, etc. I suggest eitther (1) creating separate targets for each data type at the site-level, or (2) creating a master site-level target that provides this information for all sites, including what type of data are available there. This will be very useful for developing summaries for each site, lake, and making visuals.
2_process_sw_gw_site_data.R
Outdated
join_site_col = 'site_no') %>% | ||
## both dfs have a site_tp_cd col so when joining, two versions are created. Resetti | ||
mutate(site_tp_cd = site_tp_cd.y) %>% | ||
select(!contains(c('.x','.y'))) %>% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're getting these because you're defining the join to one column. It would be better to allow all matching columns be used in the join (e.g. not providing a specific join col), unless there is reason to expect that same-named cols are different between the joined objects and both should exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion. I'll change such that site_no
& site_tp_cd
are part of the join.
Also do not commit the exported data files to the repo |
Do you have a example for this e.g. example of what these site-level targets would look like? Can this tidying task fall into a new PR and can I merge this? I can merge after addressing ⬇️ and the new merge conflict that has come up since the rename commits were pushed to main in #56)
|
Sure go ahead and merge once the conflicts are addressed. I will work on adding the site-level target. |
By tidy site-level target I mean a dataframe where there is a single row for each site, and related info in the columns. It would be useful to store important site info, like streamorder |
Following a lengthy process of attempting to resolve merge conflicts that appeared on three files This PR still did not make the conflict in Will add via a new PR #67
|
This PR adds several adjustments to capture all discrete nwis sw and gw data for visualization purposes.
Associated with issue #57 and must be merged after #56
Changes:
renamed
sites_along_waterbody.R
-->process_nwis_data.R
renamed
4_exports
folder -->4_outputs
new targets:
p2_nwis_meas_gw_data
new functions:
standardized
add_stream_order()
for adding stream order category to discrete and continuous (dv) sw sitesNew more standard
join_site_spatial_info()
that performs spatial join with targetp2_nwis_sites_in_watersheds_sf
(i.e. all sites in watersheds) and give info the data records dataset such aslat
lon
andsite_type
discrete data outputs
4_outputs/out/p4_nwis_meas_sw_data_rds
4_outputs/out/p4_nwis_meas_gw_data_rds
numbers: (ref gist L109)
Over 3,000 discrete sites for gw collected a total of 64435 data records, while only 635 discrete sites for sw collecting a total of 48083.