Update time to 2019 mega-issue #108

Open · jordansread opened this issue Mar 17, 2020 · 0 comments

cache/allSitesYears.csv

This file contains the year-specific information about various sites. It was built externally (I think by Lindsay or David) and hosted on ScienceBase because building it took a very long time (it has to pull a ton of NWIS data from the services). I think we need to rebuild this file completely, or at least replace 2016-2019. Likely the whole thing, since earlier data in NWIS may have changed as well. The file looks like this:

library(dplyr)
read.csv('cache/allSitesYears.csv') %>% head
  site_no year agency_cd site_tp_cd                            station_nm dec_lat_va dec_long_va  huc_cd data_type_cd parm_cd begin_date   end_date count_nu dayRange
1 1010070 1984      USGS         ST Big Black River near Depot Mtn, Maine   46.89389   -69.75167 1010001           dv      60 1983-10-01 2017-03-07    12208    12211
2 1010070 1985      USGS         ST Big Black River near Depot Mtn, Maine   46.89389   -69.75167 1010001           dv      60 1983-10-01 2017-03-07    12208    12211
3 1010070 1986      USGS         ST Big Black River near Depot Mtn, Maine   46.89389   -69.75167 1010001           dv      60 1983-10-01 2017-03-07    12208    12211
4 1010070 1987      USGS         ST Big Black River near Depot Mtn, Maine   46.89389   -69.75167 1010001           dv      60 1983-10-01 2017-03-07    12208    12211
5 1010070 1988      USGS         ST Big Black River near Depot Mtn, Maine   46.89389   -69.75167 1010001           dv      60 1983-10-01 2017-03-07    12208    12211
6 1010070 1989      USGS         ST Big Black River near Depot Mtn, Maine   46.89389   -69.75167 1010001           dv      60 1983-10-01 2017-03-07    12208    12211
  intDaysRecord intDaysAll diff nDays
1         12206      12210    4    NA
2         12206      12210    4    NA
3         12206      12210    4    NA
4         12206      12210    4    NA
5         12206      12210    4    NA
6         12206      12210    4    NA
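
For orientation, here is a rough, hypothetical sketch of the kind of NWIS pull behind this file, using the dataRetrieval package. The state-by-state loop, the parameter code (00060, daily discharge), and the column selection are my assumptions, not the original build script:

library(dataRetrieval)
library(dplyr)

# Assumed approach: inventory daily-value discharge (00060) records state by
# state, since one nationwide query to the NWIS site service is impractical.
inventory <- lapply(state.abb, function(st) {
  whatNWISdata(stateCd = st, parameterCd = "00060", service = "dv")
}) %>%
  bind_rows() %>%
  select(site_no, agency_cd, site_tp_cd, station_nm, dec_lat_va, dec_long_va,
         huc_cd, data_type_cd, parm_cd, begin_date, end_date, count_nu)

# The expansion to one row per site-year (plus dayRange, diff, nDays, etc.)
# came from the original build code and is not reconstructed here.
write.csv(inventory, 'cache/allSitesYears.csv', row.names = FALSE)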

cache/disch-sites.rds

This file contains only flattened (no time dimension) information about each site used in the map.

readRDS('cache/disch-sites.rds')
# A tibble: 22,274 x 4
   site_no  huc   dec_lat_va dec_long_va
   <chr>    <chr>      <dbl>       <dbl>
 1 01010000 01          46.7       -69.7
 2 01010070 01          46.9       -69.8
 3 01010500 01          47.1       -69.1
 4 01011000 01          47.1       -69.1
 5 01012500 01          47.2       -68.6
 6 01012515 01          46.7       -68.8
 7 01012520 01          46.7       -68.8
 8 01012525 01          46.7       -68.8

It depends on cache/allSitesYears.csv, and is a filtered and summarized version of that file.
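
Roughly, the derivation looks something like the sketch below; the exact filter rules live in the repo's build function, and the zero-padding and distinct() step here are my guesses:

library(dplyr)

# Assumed derivation: collapse site-year rows to one row per site, restoring
# the leading zeros that read.csv drops (padding widths are illustrative).
disch_sites <- read.csv('cache/allSitesYears.csv') %>%
  mutate(site_no = sprintf("%08d", as.numeric(site_no)),
         huc = substr(sprintf("%08d", as.numeric(huc_cd)), 1, 2)) %>%
  distinct(site_no, huc, dec_lat_va, dec_long_va) %>%
  as_tibble()

saveRDS(disch_sites, 'cache/disch-sites.rds')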

cache/site-map.rds

This file is a spatial object in R (class SpatialPointsDataFrame from the sp package) that contains the dot/site locations in "map space". The object has the spatial information plus a single attribute, the site_no. If we add new sites with the data update, which I expect we will, this object will need to be rebuilt. It relies on cache/disch-sites.rds and can be built with this function.
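
For reference, a minimal sketch of building such an object, assuming WGS84 input coordinates and using a stand-in projection for "map space" (the repo's actual CRS may differ):

library(sp)

disch_sites <- as.data.frame(readRDS('cache/disch-sites.rds'))

# Points with site_no as the lone attribute; both CRS strings are assumptions.
site_map <- SpatialPointsDataFrame(
  coords = disch_sites[, c('dec_long_va', 'dec_lat_va')],
  data = disch_sites['site_no'],
  proj4string = CRS("+proj=longlat +datum=WGS84")
)
# spTransform needs rgdal (older sp) or a recent sp that delegates to sf.
site_map <- spTransform(site_map, CRS("+init=epsg:2163"))  # example US projection

saveRDS(site_map, 'cache/site-map.rds')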

target/data/year-data.json

This is the main time-information dataset, but it is built in a compressed and almost unreadable form in order to keep it small. This JSON has the job of telling the map which dots to draw for a given year, but doing that directly would make the JSON file huge (we have over one hundred years of site locations). Instead, it is a "diff" of losses and gains of sites, specified by gn for gain and ls for loss:

"1890":{"gn":[774,775],"ls":[]},"1891":{"gn":[100,516],"ls":[]}

Sites are also grouped into a number of groups (currently 23), each of which has 1000 or fewer sites. The file is built at cache/year-data.json and then moved to target/data/year-data.json in the publish phase. It depends on cache/site-map.rds and cache/allSitesYears.csv. You can build this file using this function.
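
The gain/loss encoding itself is easy to sketch. Assuming a lookup of active site indices per year (and ignoring the 23-group split), something like:

library(jsonlite)

# Assumed input: a named list mapping each year to the integer indices of the
# sites active that year; real values would come from allSitesYears/site-map.
sites_by_year <- list(`1890` = c(100, 516, 774, 775),
                      `1891` = c(100, 516, 774))

years <- sort(as.integer(names(sites_by_year)))
diffs <- list()
prev <- integer(0)
for (yr in years) {
  cur <- sites_by_year[[as.character(yr)]]
  # Record only what changed: new indices as gn, dropped indices as ls.
  diffs[[as.character(yr)]] <- list(gn = setdiff(cur, prev),
                                    ls = setdiff(prev, cur))
  prev <- cur
}

write(toJSON(diffs), 'cache/year-data.json')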

If successful, the build will add entries to the JSON for more recent years, e.g., "2018":{"gn":[],"ls":[454]},"2019":{"gn":[],"ls":[]}

cache/bar-data.xml

This XML file is injected into the DOM (as far as I know...) to be the "bars" at the bottom of the map. It relies on cache/state-map.rds (just to get the size of the SVG, it seems) and cache/allSitesYears.csv, which supplies the year totals. Note to team: I noticed that different site filtering is done here vs. what is done in site-map and disch-sites to remove sites... this one simply sums them, so I think the totals might include a few sites that aren't shown in the map. Flagging this for later.
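
The year totals would come from a simple per-year count along these lines (a sketch only; the real function also emits the SVG/XML bar markup, which I haven't reproduced):

library(dplyr)

# Assumed totals step: count distinct sites per year straight from the csv,
# with none of the filtering used for disch-sites (hence the flag above).
year_totals <- read.csv('cache/allSitesYears.csv') %>%
  group_by(year) %>%
  summarize(n_sites = n_distinct(site_no))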

This seemed to build fine with the updated time range using this function.

figures/states_map.svg

This is the big goofy/funky one, and the largest departure from how we currently build maps. It is a large SVG that includes all of the point locations, the state polygons, the bar chart, and the USGS watermark. This work is pre-D3 and pre-Mapbox, so it builds an SVG in R using the mapping plot functions with an SVG export, then "cleans up" the SVG by moving things around, adding attributes, and removing some elements. I don't think it is worth understanding what is going on in this function, as it is a dead-end path we were on years ago before we changed how we do things. Because this file combines all of the other elements, it relies on cache/state-map.rds, cache/site-map.rds, cache/watermark.rds, and cache/bar-data.xml. Of those, we haven't covered the state map or the watermark, but both should remain unchanged from previous builds, and I figured you could use the old files without worry.
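
At its core, that old approach is just plotting sp objects to an SVG graphics device, roughly like the following (layer order and styling are guesses; the post-hoc SVG "cleaning" is omitted):

library(sp)

state_map <- readRDS('cache/state-map.rds')  # state polygons
site_map  <- readRDS('cache/site-map.rds')   # projected site points

# Draw everything to an SVG device, which the build then rewrites by hand.
svg('figures/states_map.svg', width = 8, height = 5)
plot(state_map)
plot(site_map, add = TRUE, pch = 16, cex = 0.2)
dev.off()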

This did manage to build for me with the updated time (I didn't update the actual data, but the original csv already has some 2016, 2017, and 2018 sites in it). I made some changes to this function to make that happen.

If successful, that function will create an unstylized SVG that has extra bars for the additional years added:

[image: unstylized states_map.svg showing extra bars for the added years]
