Filter sites streamorder meas #56

Merged
merged 16 commits into from
Dec 31, 2022

Conversation

Collaborator

@msleckman msleckman commented Dec 20, 2022

This is a small PR addressing issue #51

Note: the title and issue describe this as filtering, but this PR only creates the stream order column for the NWIS discrete records dataset. No filtering happens until we visualize this data on maps.

Previously, the Stream Order 3 categorization was only applied to the dv sw target, p1_nwis_dv_sw_data.
This PR repeats that process for the field measurements (meas) sw dataset, p1_nwis_meas_sw_data.
I therefore created add_stream_order(), which wraps the dplyr code chunk that previously sat directly in the target definition for p2_nwis_dv_sw_data. The function lives in sites_along_waterbody.R.

In sum:

functions added: add_stream_order()
targets added: p2_nwis_meas_sw_data
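
For readers without the diff open, here is a minimal sketch of what a helper like add_stream_order() could look like. The argument names (`sites_along_so3`, `sites_along_lake`) and the lookup-by-site-number logic are assumptions for illustration, not the code in sites_along_waterbody.R. Only the three category labels and the `site_no` and `stream_order_category` column names come from this PR.

```r
library(dplyr)

# Hypothetical sketch, not the function merged in this PR.
# `sites_along_so3` and `sites_along_lake` are assumed character vectors
# of site numbers; the real function may take different inputs.
add_stream_order_sketch <- function(nwis_data, sites_along_so3, sites_along_lake) {
  nwis_data %>%
    mutate(
      stream_order_category = case_when(
        site_no %in% sites_along_lake ~ "along lake",
        site_no %in% sites_along_so3 ~ "along SO 3+",
        TRUE ~ "not along SO 3+"
      )
    )
}

# Toy input standing in for a measurements target (invented values)
toy_meas <- tibble(
  site_no = c("001", "002", "003"),
  discharge_va = c(12.5, 3.1, 40.2)
)

toy_result <- add_stream_order_sketch(
  toy_meas,
  sites_along_so3 = c("001"),
  sites_along_lake = c("003")
)
```

Keeping the categorization in one function means the dv and meas targets can share the same logic instead of each carrying its own copy of the dplyr chunk.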

Findings:

Number of unique field measurement sites per lake, split by stream order:

p2_nwis_meas_sw_data %>%
  select(lake_w_state, site_no, stream_order_category) %>%
  distinct() %>%
  group_by(lake_w_state, stream_order_category) %>%
  summarize(count = n()) %>%
  ggplot(aes(lake_w_state, count)) +
  geom_bar(aes(fill = stream_order_category), position = 'dodge', stat = 'identity')

image

Summary table of sites:

summarize_tbl <- p2_nwis_meas_sw_data %>%
  count(lake_w_state, stream_order_category, site_no) %>%
  group_by(lake_w_state, stream_order_category) %>%
  summarize(site_count = n(),
            total_meas_across_sites = sum(n),
            avg_meas_per_site = round(total_meas_across_sites / site_count, 2))

summarize_tbl %>% arrange(desc(avg_meas_per_site)) %>% print(n = 27)
# A tibble: 27 × 5
# Groups:   lake_w_state [13]
   lake_w_state       stream_order_category site_count total_meas_across_sites avg_meas_per_site
   <chr>              <chr>                      <int>                   <int>             <dbl>
 1 Walker Lake,NV     along SO 3+                   15                    3449            230.  
 2 Carson Lake,NV     along SO 3+                   42                    6098            145.  
 3 Sevier Lake,UT     along SO 3+                   17                    2125            125   
 4 Pyramid Lake,NV    along SO 3+                   68                    8476            125.  
 5 Great Salt Lake,UT along SO 3+                  102                    9459             92.7 
 6 Carson Lake,NV     not along SO 3+               35                    3169             90.5 
 7 Walker Lake,NV     along lake                     1                      65             65   
 8 Great Salt Lake,UT along lake                    12                     752             62.7 
 9 Pyramid Lake,NV    not along SO 3+               90                    5541             61.6 
10 Walker Lake,NV     not along SO 3+               76                    4543             59.8 
11 Great Salt Lake,UT not along SO 3+               58                    3344             57.7 
12 Pyramid Lake,NV    along lake                     2                     114             57   
13 Carson Lake,NV     along lake                     2                      83             41.5 
14 Sevier Lake,UT     not along SO 3+                7                     253             36.1 
15 Owens Lake,CA      along SO 3+                   20                     542             27.1 
16 Mono Lake,CA       not along SO 3+                4                      83             20.8 
17 Mono Lake,CA       along SO 3+                    7                      85             12.1 
18 Malheur Lake,OR    along SO 3+                   38                     276              7.26
19 Owens Lake,CA      not along SO 3+               12                      76              6.33
20 Franklin Lake,NV   not along SO 3+                1                       6              6   
21 Malheur Lake,OR    not along SO 3+                8                      37              4.62
22 Honey Lake,CA      along SO 3+                    5                       8              1.6 
23 Lake Abert,OR      along SO 3+                    7                      11              1.57
24 Goose Lake,CA      along SO 3+                    1                       1              1   
25 Honey Lake,CA      not along SO 3+                3                       3              1   
26 Lake Abert,OR      not along SO 3+                1                       1              1   
27 Summer Lake,OR     not along SO 3+                1                       1              1   
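
To make the two-step aggregation above concrete, the sketch below runs the same count-then-summarize pattern on a toy table. The rows are invented for illustration and are not NWIS data. `count()` first collapses the data to one row per site, so `n()` inside `summarize()` counts sites, while `sum(n)` recovers the number of measurements.

```r
library(dplyr)

# Toy long-form data: one row per measurement (invented values)
toy_meas <- tibble(
  lake_w_state = c("Lake A", "Lake A", "Lake A", "Lake A", "Lake B"),
  stream_order_category = c("along SO 3+", "along SO 3+", "along SO 3+",
                            "along SO 3+", "along lake"),
  site_no = c("001", "001", "001", "002", "003")
)

toy_summary <- toy_meas %>%
  # one row per site, with n = number of measurements at that site
  count(lake_w_state, stream_order_category, site_no) %>%
  group_by(lake_w_state, stream_order_category) %>%
  summarize(
    site_count = n(),                   # sites in the group
    total_meas_across_sites = sum(n),   # measurements in the group
    avg_meas_per_site = round(total_meas_across_sites / site_count, 2),
    .groups = "drop"
  )
```

The same arithmetic checks out against the table: for Walker Lake,NV along SO 3+, 3449 / 15 ≈ 229.93, which the tibble prints to three significant figures as `230.`.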

Discrete sw measurements coverage over time (ignoring SO category):

  • streamflow (cf/s)
    image

  • gage height (ft) - note gaps
    image

code for scatterplots

library(ggplot2)
library(tidyverse)

targets::tar_load(p2_nwis_meas_sw_data)

# Lakes with the most complete measurement records; these get their own panel
top_5_lakes_meas_coverage <- c('Great Salt Lake,UT', 'Sevier Lake,UT', 'Walker Lake,NV', 'Pyramid Lake,NV', 'Carson Lake,NV')

# Scatterplot of one measurement column (`col`, given as a string) over time,
# colored by lake
plot_fun <- function(data, col){
  data %>%
    select(lake_w_state, stream_order_category, measurement_dt, discharge_va, gage_height_va) %>%
    mutate(measurement_dt = as.Date(measurement_dt)) %>%
    ggplot(aes(y = .data[[col]], x = measurement_dt)) +
    geom_point(aes(color = lake_w_state), size = 0.7) +
    theme_bw()
}

# Stack two panels: lakes in `split_vec` on top, all remaining lakes below
sw_meas_time_plot_fun <- function(meas_data, measurement_col, split_vec){
  plot_top <- plot_fun(data = meas_data %>%
                         filter(lake_w_state %in% split_vec),
                       col = measurement_col)
  plot_rest <- plot_fun(data = meas_data %>%
                          filter(!lake_w_state %in% split_vec),
                        col = measurement_col)
  cowplot::plot_grid(plot_top, plot_rest, nrow = 2)
}

# streamflow
sw_meas_time_plot_fun(meas_data = p2_nwis_meas_sw_data, measurement_col = 'discharge_va', split_vec = top_5_lakes_meas_coverage)
# gage_height
sw_meas_time_plot_fun(meas_data = p2_nwis_meas_sw_data, measurement_col = 'gage_height_va', split_vec = top_5_lakes_meas_coverage)

@msleckman
Collaborator Author

msleckman commented Dec 20, 2022

SW sites with stream order breakdown:
image

Note: Most sites have fewer than 100 records on the sw side.
image

@msleckman msleckman requested a review from cnell-usgs December 21, 2022 19:47
@msleckman msleckman marked this pull request as ready for review December 21, 2022 19:48
msleckman added a commit that referenced this pull request Dec 21, 2022
Missed changing the `p4_nwis_meas_sw_data_rds` target in #56. This target is now derived from `p2_nwis_meas_sw_data` rather than `p1_nwis_meas_sw_data`, since we process the latter target and add the stream_order_category column.
@msleckman
Collaborator Author

msleckman commented Dec 29, 2022

Discovered a merge conflict: main no longer has 4_export.R, mapping/saline_lakes_mapping_script.R and saline_lakes_mapping_script.R. These files were removed in order to merge #58, but their removal now conflicts with #56.

Resolved by merging #60 into main first (commit msleckman@9b78a4b). The conflict is no longer present in this PR, nor in #58, since the commit histories are now in sync.

msleckman added a commit to msleckman/saline-lakes that referenced this pull request Dec 29, 2022
msleckman added a commit that referenced this pull request Dec 29, 2022
…der_meas

adding old files to enable merge of #56
@cnell-usgs
Member

Is there meaning in the split in lakes in this plot?
image

Member

@cnell-usgs cnell-usgs left a comment

This ran for me, but I'd like you to address the comments below before merging. The most important point is to have tidy, well-defined targets. It would be better to create a site-level target that includes the streamorder variable and other site-level info, rather than adding it directly to the long-form data. This will make the pipeline more modular, and the site-level target will be useful in subsequent targets.
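
One way to read the site-level suggestion, sketched with toy data: keep the stream order category in a table with one row per site, and join it onto the long-form measurements only where it is needed. The names `toy_site_info`, `toy_meas`, and `toy_meas_with_so` are hypothetical, and the values are invented. The pipeline's actual targets and columns may differ.

```r
library(dplyr)

# Hypothetical site-level table: one row per site, holding site attributes
toy_site_info <- tibble(
  site_no = c("001", "002"),
  lake_w_state = c("Lake A", "Lake A"),
  stream_order_category = c("along SO 3+", "not along SO 3+")
)

# Long-form measurements: many rows per site, no site attributes repeated
toy_meas <- tibble(
  site_no = c("001", "001", "002"),
  measurement_dt = as.Date(c("2020-01-01", "2020-02-01", "2020-01-15")),
  discharge_va = c(10.2, 11.8, 0.9)
)

# Attach site attributes only at the point of use
toy_meas_with_so <- toy_meas %>%
  left_join(toy_site_info, by = "site_no")
```

With this layout the categorization is computed once per site, and any downstream step (maps, summaries, exports) can join against the same site table.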

The comments about the buffer accuracy are important, but can be ignored for now.

2_process/src/sites_along_waterbody.R (2 resolved review threads)
Member

@cnell-usgs cnell-usgs left a comment

Ah, OK. I see my suggestion of creating a site-level target is partially addressed in #58.

Address the minor comments and go ahead and merge; the rest can be addressed in the next PR.

@msleckman
Collaborator Author

> Is there meaning in the split in lakes in this plot? image

The split is only there to help visualize the data. The top plot shows the lakes with the most complete data; the bottom plot shows the lakes with less complete data. See the code dropdown at the bottom of the PR description.

@msleckman
Collaborator Author

The pipeline runs with the latest changes following review. Merging.

@msleckman msleckman merged commit dea3704 into USGS-VIZLAB:main Dec 31, 2022