Episode 9 updated by replacing calls to raster and rgdal packages #394

Merged
merged 2 commits into from
Mar 25, 2023
160 changes: 81 additions & 79 deletions episodes/09-vector-when-data-dont-line-up-crs.Rmd
@@ -22,8 +22,7 @@ source("setup.R")
::::::::::::::::::::::::::::::::::::::::::::::::::

```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
library(raster)
library(rgdal)
library(terra)
library(sf)
library(ggplot2)
library(dplyr)
@@ -33,21 +32,22 @@ library(dplyr)

## Things You'll Need To Complete This Episode

See the [lesson homepage](.) for detailed information about the software,
data, and other prerequisites you will need to work through the examples in this episode.
See the [lesson homepage](.) for detailed information about the software, data,
and other prerequisites you will need to work through the examples in this
episode.


::::::::::::::::::::::::::::::::::::::::::::::::::

In [an earlier episode](03-raster-reproject-in-r/)
we learned how to handle a situation where you have two different
files with raster data in different projections. Now we will apply
those same principles to working with vector data.
We will create a base map of our study site using United States
state and country boundary information accessed from the
we learned how to handle a situation where you have two different files with
raster data in different projections. Now we will apply those same principles
to working with vector data.
We will create a base map of our study site using United States state and
country boundary information accessed from the
[United States Census Bureau](https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html).
We will learn how to map vector data that are in different CRSs and thus
don't line up on a map.
We will learn how to map vector data that are in different CRSs and thus don't
line up on a map.

We will continue to work with the three shapefiles that we loaded in the
[Open and Plot Shapefiles in R](06-vector-open-shapefile-in-r/) episode.
@@ -57,55 +57,50 @@ We will continue to work with the three shapefiles that we loaded in the
aoi_boundary_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HarClip_UTMZ18.shp")
lines_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARV_roads.shp")
point_HARV <- st_read("data/NEON-DS-Site-Layout-Files/HARV/HARVtower_UTM18N.shp")
CHM_HARV <- raster("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif")
CHM_HARV <- rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/CHM/HARV_chmCrop.tif")
CHM_HARV_df <- as.data.frame(CHM_HARV, xy = TRUE)
roadColors <- c("blue", "green", "grey", "purple")[lines_HARV$TYPE]
```

## Working With Spatial Data From Different Sources

We often need to gather spatial datasets from
different sources and/or data that cover different spatial extents.
These data are often in
different Coordinate Reference Systems (CRSs).
We often need to gather spatial datasets from different sources and/or data
that cover different spatial extents.
These data are often in different Coordinate Reference Systems (CRSs).

Some reasons for data being in different CRSs include:

1. The data are stored in a particular CRS convention used by the data
provider (for example, a government agency).
2. The data are stored in a particular CRS that is customized to a region.
For instance, many states in the US prefer to use a State Plane projection customized
for that state.

![Maps of the United States using data in different projections. Source: opennews.org, from: https://media.opennews.org/cache/06/37/0637aa2541b31f526ad44f7cb2db7b6c.jpg](fig/map_usa_different_projections.jpg)

Notice the differences in shape associated with each different
projection. These differences are a direct result of the calculations
used to "flatten" the data onto a 2-dimensional map. Often data are
stored purposefully in a particular projection that optimizes the
relative shape and size of surrounding geographic boundaries (states,
counties, countries, etc).

In this episode we will learn how to identify and manage spatial data
in different projections. We will learn how to reproject the data so
that they
are in the same projection to support plotting / mapping. Note that
these skills
are also required for any geoprocessing / spatial analysis. Data need
to be in
1. The data are stored in a particular CRS convention used by the data provider
(for example, a government agency).
2. The data are stored in a particular CRS that is customized to a region. For
instance, many states in the US prefer to use a State Plane projection
customized for that state.

![Maps of the United States using data in different projections. Source: opennews.org, from: https://media.opennews.org/cache/06/37/0637aa2541b31f526ad44f7cb2db7b6c.jpg](fig/map_usa_different_projections.jpg){alt='Maps of the United States using data in different projections.'}

Notice the differences in shape associated with each different projection.
These differences are a direct result of the calculations used to "flatten" the
data onto a 2-dimensional map. Often data are stored purposefully in a
particular projection that optimizes the relative shape and size of surrounding
geographic boundaries (states, counties, countries, etc).

In this episode we will learn how to identify and manage spatial data in
different projections. We will learn how to reproject the data so that they are
in the same projection to support plotting / mapping. Note that these skills
are also required for any geoprocessing / spatial analysis. Data need to be in
the same CRS to ensure accurate results.

We will continue to use the `sf` and `raster` packages in this episode.
We will continue to use the `sf` and `terra` packages in this episode.

## Import US Boundaries - Census Data

There are many good sources of boundary base layers that we can use to create a
basemap. Some R packages even have these base layers built in to support quick
and efficient mapping. In this episode, we will use boundary layers for the contiguous
United States, provided by the [United States Census Bureau](https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html).
It is useful to have shapefiles to work with because we can add
additional attributes to them if need be - for project specific
mapping.
and efficient mapping. In this episode, we will use boundary layers for the
contiguous United States, provided by the
[United States Census Bureau](https://www.census.gov/geo/maps-data/data/cbf/cbf_state.html).
It is useful to have shapefiles to work with because we can add additional
attributes to them if need be - for project specific mapping.

## Read US Boundary File

@@ -116,7 +111,8 @@ these data have been modified and reprojected from the original data downloaded
from the Census website to support the learning goals of this episode.

```{r read-shp}
state_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-State-Boundaries-Census-2014.shp")
state_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-State-Boundaries-Census-2014.shp") %>%
st_zm()
```
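
The `st_zm()` call added above drops any Z (elevation) and M (measure)
dimensions from the geometries, leaving plain XY coordinates. The sketch below
shows the effect on a single made-up point; the coordinates and the object
names `pt_xyz` and `pt_xy` are illustrative assumptions and do not come from
the lesson data.

```r
library(sf)

# Hypothetical 3D point (longitude, latitude, elevation); values are made up
pt_xyz <- st_sfc(st_point(c(-72.17, 42.54, 350)), crs = 4326)

# Drop the Z (and M, if present) dimension
pt_xy <- st_zm(pt_xyz)

# Compare the number of coordinate columns before and after
ncol(st_coordinates(pt_xyz))
ncol(st_coordinates(pt_xy))
```

If a layer has no Z or M values, `st_zm()` should leave it as it is, so
applying it right after `st_read()` is a low-risk cleanup step.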

Next, let's plot the U.S. states data:
@@ -135,12 +131,13 @@ nicer. We will import
`NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States`.

```{r}
country_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States.shp")
country_boundary_US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/US-Boundary-Dissolved-States.shp") %>%
st_zm()
```

If we specify a thicker line width using `size = 2` for the border layer, it will
make our map pop! We will also manually set the colors of the state boundaries
and country boundaries.
If we specify a thicker line width using `size = 2` for the border layer, it
will make our map pop! We will also manually set the colors of the state
boundaries and country boundaries.

```{r us-boundaries-thickness}
ggplot() +
@@ -155,36 +152,34 @@ As we are adding these layers, take note of the CRS of each object.
First let's look at the CRS of our tower location object:

```{r crs-sleuthing-1}
st_crs(point_HARV)
st_crs(point_HARV)$proj4string
```

Our projection string for `point_HARV` specifies the UTM projection as follows:

`+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0`
`+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs`

- **proj=utm:** the projection is UTM, which is divided into several zones.
- **zone=18:** the zone is 18.
- **datum=WGS84:** the datum is WGS84 (the datum refers to the 0,0 reference
  for the coordinate system used in the projection).
- **units=m:** the units for the coordinates are meters.
- **ellps=WGS84:** the ellipsoid (how the earth's roundness is calculated) for
the data is WGS84

Note that the `zone` is unique to the UTM projection. Not all CRSs
will have a
Note that the `zone` is unique to the UTM projection. Not all CRSs will have a
zone.
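
As a minimal sketch of how these components can be inspected without loading
any files, we can build a single point in UTM zone 18N and query its CRS. The
coordinates and the object name `tower` are illustrative assumptions; the exact
text of the PROJ string may vary with the installed PROJ version.

```r
library(sf)

# Hypothetical point in UTM zone 18N (EPSG:32618); coordinates are illustrative
tower <- st_sfc(st_point(c(732183, 4713265)), crs = 32618)

crs_info <- st_crs(tower)

crs_info$epsg         # the EPSG code of the CRS
crs_info$proj4string  # the PROJ string, including +proj, +zone and +units
st_is_longlat(tower)  # FALSE for a projected CRS such as UTM
```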

Let's check the CRS of our state and country boundary objects:

```{r crs-sleuthing-2}
st_crs(state_boundary_US)
st_crs(country_boundary_US)
st_crs(state_boundary_US)$proj4string
st_crs(country_boundary_US)$proj4string
```

Our projection string for `state_boundary_US` and `country_boundary_US`
specifies a geographic (lat/long) coordinate system as follows:

`+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0`
`+proj=longlat +datum=WGS84 +no_defs`


- **proj=longlat:** the data are in a geographic (latitude and longitude)
coordinate system
@@ -194,15 +189,15 @@ the lat/long projection as follows:
is WGS84

Note that there are no specified units above. This is because this geographic
coordinate reference system is in latitude and longitude which is most
often recorded in decimal degrees.
coordinate reference system is in latitude and longitude which is most often
recorded in decimal degrees.
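
To see the difference in units, the following sketch transforms one point from
UTM zone 18N (meters) to WGS84 longitude/latitude (decimal degrees) with
`st_transform()`. The coordinates and the object names `tower_utm` and
`tower_ll` are assumptions made for illustration.

```r
library(sf)

# Hypothetical point in UTM zone 18N (EPSG:32618), coordinates in meters
tower_utm <- st_sfc(st_point(c(732183, 4713265)), crs = 32618)

# Reproject to geographic WGS84 (EPSG:4326), coordinates in decimal degrees
tower_ll <- st_transform(tower_utm, crs = 4326)

st_is_longlat(tower_utm)  # FALSE: projected CRS
st_is_longlat(tower_ll)   # TRUE: geographic CRS

# Same location, very different numbers
st_coordinates(tower_utm)
st_coordinates(tower_ll)
```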

::::::::::::::::::::::::::::::::::::::::: callout

## Data Tip

the last portion of each `proj4` string
is `+towgs84=0,0,0 `. This is a conversion factor that is used if a datum
The last portion of each `proj4` string could potentially be something like
`+towgs84=0,0,0`. This is a conversion factor that is used if a datum
conversion is required. We will not deal with datums in this episode series.


@@ -237,21 +232,22 @@ represented in meters.
- [Official PROJ library documentation](https://proj4.org/)
- [More information on the proj4 format.](https://proj.maptools.org/faq.html)
- [A fairly comprehensive list of CRSs by format.](https://spatialreference.org)
- To view a list of datum conversion factors type: `projInfo(type = "datum")`
into the R console.
- To view a list of datum conversion factors type:
`sf_proj_info(type = "datum")` into the R console. However, the results would
depend on the underlying version of the PROJ library.


::::::::::::::::::::::::::::::::::::::::::::::::::

## Reproject Vector Data or No?

We saw in [an earlier episode](03-raster-reproject-in-r/) that when working with raster
data in different CRSs, we needed to convert all objects to the same
CRS. We can do the same thing with our vector data - however, we
don't need to! When using the `ggplot2` package, `ggplot`
automatically converts all objects to the same CRS before plotting.
This means we can plot our three data sets together
without doing any conversion:
We saw in [an earlier episode](03-raster-reproject-in-r/) that when working
with raster data in different CRSs, we needed to convert all objects to the
same CRS. We can do the same thing with our vector data - however, we don't
need to! When using the `ggplot2` package, `ggplot` automatically converts all
objects to the same CRS before plotting.
This means we can plot our three data sets together without doing any
conversion:

```{r layer-point-on-states}
ggplot() +
@@ -268,24 +264,30 @@ ggplot() +

Create a map of the North Eastern United States as follows:

1. Import and plot `Boundary-US-State-NEast.shp`. Adjust line width as necessary.
2. Layer the Fisher Tower (in the NEON Harvard Forest site) point location `point_HARV` onto the plot.
1. Import and plot `Boundary-US-State-NEast.shp`. Adjust line width as
necessary.
2. Layer the Fisher Tower (in the NEON Harvard Forest site) point location
`point_HARV` onto the plot.
3. Add a title.
4. Add a legend that shows both the state boundary (as a line) and
the Tower location point.
4. Add a legend that shows both the state boundary (as a line) and the Tower
location point.

::::::::::::::: solution

## Answers

```{r ne-states-harv}
NE.States.Boundary.US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/Boundary-US-State-NEast.shp")
NE.States.Boundary.US <- st_read("data/NEON-DS-Site-Layout-Files/US-Boundary-Layers/Boundary-US-State-NEast.shp") %>%
st_zm()

ggplot() +
geom_sf(data = NE.States.Boundary.US, aes(color ="color"), show.legend = "line") +
scale_color_manual(name = "", labels = "State Boundary", values = c("color" = "gray18")) +
geom_sf(data = NE.States.Boundary.US, aes(color ="color"),
show.legend = "line") +
scale_color_manual(name = "", labels = "State Boundary",
values = c("color" = "gray18")) +
geom_sf(data = point_HARV, aes(shape = "shape"), color = "purple") +
scale_shape_manual(name = "", labels = "Fisher Tower", values = c("shape" = 19)) +
scale_shape_manual(name = "", labels = "Fisher Tower",
values = c("shape" = 19)) +
ggtitle("Fisher Tower location") +
theme(legend.background = element_rect(color = NA)) +
coord_sf()