Cleaning discrete gw sw meas data #58

msleckman · 2022-12-22T19:52:30Z

This PR adds several adjustments to capture all discrete NWIS sw and gw data for visualization purposes.

Associated with issue #57; must be merged after #56.

Changes:

* renamed sites_along_waterbody.R --> process_nwis_data.R
* renamed 4_exports folder --> 4_outputs
* new targets:
  * p2_nwis_meas_gw_data
* new functions:
  * standardized add_stream_order(), which adds a stream order category to discrete and continuous (dv) sw sites
  * new, more standardized join_site_spatial_info(), which performs a spatial join with target p2_nwis_sites_in_watersheds_sf (i.e. all sites in watersheds) and adds site info such as lat, lon, and site_type to the data records dataset
* discrete data outputs:
  * 4_outputs/out/p4_nwis_meas_sw_data_rds
  * 4_outputs/out/p4_nwis_meas_gw_data_rds

numbers: (ref gist L109)

> ## count of individual measurements
> meas_sw_clean %>% count() %>% pull(n)
[1] 48083
> meas_gw_clean %>% count() %>% pull(n)
[1] 64435
> ## count of unique sites
> summarized_meas_gw %>% ungroup() %>% count() %>% pull(n)
[1] 3322
> summarized_meas_sw %>% ungroup() %>% count() %>% pull(n)
[1] 635

Over 3,000 discrete gw sites (3,322) collected a total of 64,435 data records, while only 635 discrete sw sites collected a total of 48,083.
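For reference, the counting pattern used in the console output above can be sketched on a small made-up table. The data frame `meas_toy` and its columns are hypothetical stand-ins, not the real `meas_gw_clean` or `summarized_meas_gw` targets, and the per-site summary is assumed to be a simple group-and-count.

```r
library(dplyr)

# Hypothetical stand-in for a cleaned measurement table (one row per measurement)
meas_toy <- tibble(
  site_no = c("A", "A", "B", "C", "C", "C"),
  value   = c(1.1, 1.3, 2.0, 0.4, 0.5, 0.7)
)

# Count of individual measurements (one row per record)
n_meas <- meas_toy %>% count() %>% pull(n)

# Assumed per-site summary: one row per site with its measurement count
summarized_toy <- meas_toy %>%
  group_by(site_no) %>%
  summarise(n_meas = n())

# Count of unique sites
n_sites <- summarized_toy %>% ungroup() %>% count() %>% pull(n)

# Average number of records per site
records_per_site <- n_meas / n_sites
```

Applying the same ratio to the reported totals would show how much denser the sw record is per site than the gw record.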

msleckman · 2022-12-22T22:14:45Z

Currently have a merge conflict due to the deletion of 4_export.R. This file no longer exists in this branch but still exists in upstream main.

To resolve

msleckman · 2022-12-23T22:43:46Z

Currently have a merge conflict due to the deletion of 4_export.R. This file no longer exists in this branch but still exists in upstream main.
* [x]  To resolve

Managed via a strange workaround in PR #59. The pipeline runs, and 4_export.R is no longer in the repo.

cnell-usgs

Again, I think that rather than creating long-form targets, we need site-level targets that gather site info like location, streamorder, data type, lake, etc. I suggest either (1) creating separate targets for each data type at the site level, or (2) creating a master site-level target that provides this information for all sites, including what type of data are available there. This will be very useful for developing summaries for each site and lake, and for making visuals.

cnell-usgs · 2022-12-30T01:59:01Z

2_process_sw_gw_site_data.R

+                           join_site_col = 'site_no') %>% 
+      ## both dfs have a site_tp_cd col so when joining, two versions are created. Resetti
+      mutate(site_tp_cd = site_tp_cd.y) %>% 
+      select(!contains(c('.x','.y'))) %>% 


You're getting these because you're restricting the join to one column. It would be better to let all matching columns be used in the join (e.g. by not providing a specific join col), unless there is reason to expect that same-named cols differ between the joined objects and both should be kept.

Good suggestion. I'll change it so that site_no and site_tp_cd are both part of the join.
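A minimal sketch of the behavior discussed here, assuming the join is a dplyr-style `left_join()`. The tibbles `meas` and `sites` are made-up stand-ins for the data records and the site table, not the actual pipeline targets, and `join_site_spatial_info()` itself is not reproduced.

```r
library(dplyr)

# Hypothetical stand-ins: both tables carry site_no and site_tp_cd
meas <- tibble(
  site_no    = c("01", "01", "02"),
  site_tp_cd = c("GW", "GW", "ST"),
  value      = c(1.2, 1.4, 3.1)
)
sites <- tibble(
  site_no    = c("01", "02"),
  site_tp_cd = c("GW", "ST"),
  lat        = c(40.1, 41.3),
  lon        = c(-112.0, -111.5)
)

# Joining on site_no only: the shared site_tp_cd column is duplicated
# and suffixed as site_tp_cd.x / site_tp_cd.y
joined_one_key <- left_join(meas, sites, by = "site_no")

# Joining on both shared columns: a single site_tp_cd column remains,
# so no mutate()/select() cleanup is needed afterwards
joined_two_keys <- left_join(meas, sites, by = c("site_no", "site_tp_cd"))

names(joined_one_key)
names(joined_two_keys)
```

Naming both keys explicitly keeps the join predictable if either table later gains another same-named column, whereas omitting `by` would silently pick up any new match.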

cnell-usgs · 2022-12-30T02:17:04Z

Also, do not commit the exported data files to the repo.

msleckman · 2023-01-03T16:27:35Z

> Again, I think that rather than creating long-form targets, we need site-level targets that gather site info like location, streamorder, data type, lake, etc. I suggest either (1) creating separate targets for each data type at the site level, or (2) creating a master site-level target that provides this information for all sites, including what type of data are available there. This will be very useful for developing summaries for each site and lake, and for making visuals.

Do you have an example of what these site-level targets would look like? Can this tidying task go into a new PR so that I can merge this one? (I can merge after addressing ⬇️ and the new merge conflict that has come up since the rename commits were pushed to main in #56.)
I agree that these more modular summarizing targets would be really helpful downstream, but feel it should be a separate PR.

> Also do not commit the exported data files to the repo

Ah, this was an accidental push. My error. I changed items in the .gitignore and this ended up getting pushed.

cnell-usgs · 2023-01-03T16:30:05Z

Sure go ahead and merge once the conflicts are addressed. I will work on adding the site-level target.

cnell-usgs · 2023-01-03T16:38:50Z

By tidy site-level target I mean a dataframe with a single row for each site and the related info in the columns. It would be useful for storing important site info, like streamorder.
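As a rough illustration of that shape, here is a sketch that collapses a long-form measurement table to one row per site. The table `meas_long` and its columns (`data_type`, `lat`, `lon`) are hypothetical; the real target may carry different fields such as stream order or lake.

```r
library(dplyr)

# Hypothetical long-form table: one row per measurement
meas_long <- tibble(
  site_no   = c("01", "01", "02", "03", "03", "03"),
  data_type = c("gw", "gw", "sw", "sw", "sw", "sw"),
  lat       = c(40.1, 40.1, 41.3, 39.8, 39.8, 39.8),
  lon       = c(-112.0, -112.0, -111.5, -113.2, -113.2, -113.2)
)

# Site-level table: a single row per site, with related info in the columns
site_summary <- meas_long %>%
  group_by(site_no) %>%
  summarise(
    data_type = first(data_type),  # assumes one data type per site
    lat       = first(lat),
    lon       = first(lon),
    n_meas    = n(),               # number of measurements at the site
    .groups   = "drop"
  )

site_summary
```

A table like this can be joined back to any long-form target by `site_no` when site attributes are needed for summaries or visuals.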

msleckman · 2023-01-03T22:16:29Z

After a lengthy attempt to resolve merge conflicts that appeared in three files (2_process_site_data.R, 2_process_lake_tribs.R, and 12_process/src/saline_lakes_waterbody.R), the files were ultimately removed in PR #66.

That PR still did not clear the conflict in 2_process_lake_tribs.R. As a result, I have deleted the file in this branch (renamed it to a file that is not being tracked by this PR) in commit 0203c13.

I will re-add it via a new PR, #67.

Verify that USGS-R/main runs after merging this PR (Cleaning discrete gw sw meas data #58) and Add 2 process lakes tribs agian #67.

msleckman added 10 commits December 20, 2022 10:54

adding meas sites to streamorder 3 categorization process

c291fef

adding commenting to p2_nwis_dv_sw_data

3153055

adding p2_nwis_meas_sw_data

7117ee4

transformed process of adding stream order cols to sw data into a new function

934c675

rename of add_stream_order param

673495b

updated 4_outputs folder name and updated gitignore

0e224e1

editing/splittingcleaning of sw and gw functions

8df426d

streamlining changes with cleaning of gw meas data

c31f13d

rnamed sites_in_waterbody script to more general process_nwis_data.R script

6322758

rname 4_export.R to 4_outputs + rm 4_reports and updated commenting

aef42ff

msleckman marked this pull request as ready for review December 22, 2022 19:55

msleckman requested a review from cnell-usgs December 22, 2022 20:09

adding correct sourcing following changes in filenames;

1de4825

msleckman added 3 commits December 23, 2022 10:16

commenting

30bcfde

alignment

8b56d2f

merging

541d70a

msleckman mentioned this pull request Dec 29, 2022

Filter sites streamorder meas #56

Merged

cnell-usgs requested changes Dec 30, 2022

msleckman mentioned this pull request Dec 30, 2022

GW should include any well with two or more discrete measurements in a given year. #61

Open

cnell-usgs approved these changes Jan 3, 2023

msleckman added 4 commits January 3, 2023 09:01

renaming 2_process_data.R

aa78552

spacing in proces_nwis_data.R

6f0a3a0

adding 2_process_site_data.R because was deleted when renaming in aa78552

f7511a4

so3 naming additions

9744076

msleckman added 3 commits January 3, 2023 10:40

rs formatting causing mg conflict

367f159

.

e42aa22

update targets list name without sw or gw

90c82bb

padilla410 mentioned this pull request Jan 3, 2023

Remove files that are causing conflicts #66

Merged

msleckman added 4 commits January 3, 2023 13:56

take off space

8dc976c

deleted lake tribs R file to enable merge

4d9df23

re-adding 2_process_lakes_tribs to see if cnflict re-appears

60f2e80

nvmd, merge conflicts it removing again

0203c13

msleckman merged commit abd9e5f into USGS-VIZLAB:main Jan 3, 2023

msleckman mentioned this pull request Jan 3, 2023

Add 2 process lakes tribs agian #67

Merged

msleckman added a commit to msleckman/saline-lakes that referenced this pull request Jan 4, 2023

resolving duplicate site_tp col from joining. Address final comment of cee review in USGS-VIZLAB#58

f125ce1

This was referenced Jan 7, 2023

Msleckman cleaning discrete gw sw meas data #65

Closed

Filling data gap with discrete gw measurement #57

Closed
