Lesson 1 updated by replacing calls to raster and rgdal packages #384

Merged 5 commits on Mar 10, 2023
Changes from 3 commits
99 changes: 57 additions & 42 deletions episodes/01-raster-structure.Rmd
@@ -14,7 +14,7 @@ knitr::opts_chunk$set(fig.height = 6)

- Describe the fundamental attributes of a raster dataset.
- Explore raster attributes and metadata using R.
- Import rasters into R using the `raster` package.
- Import rasters into R using the `terra` package.
- Plot a raster file in R using the `ggplot2` package.
- Describe the difference between single- and multi-band rasters.

@@ -29,8 +29,7 @@ knitr::opts_chunk$set(fig.height = 6)
::::::::::::::::::::::::::::::::::::::::::::::::::

```{r load-libraries, echo=FALSE, results="hide", message=FALSE, warning=FALSE}
library(raster)
library(rgdal)
library(terra)
library(ggplot2)
library(dplyr)
```
@@ -53,11 +52,10 @@ data values as stored in a raster and how R handles these elements.

We will continue to work with the `dplyr` and `ggplot2` packages that were introduced
in the [Introduction to R for Geospatial Data](https://datacarpentry.org/r-intro-geospatial/) lesson. We will use two additional packages in this episode to work with raster data - the
`raster` and `rgdal` packages. Make sure that you have these packages loaded.
`raster` and `sf` packages. Make sure that you have these packages loaded.
Member

Should this text specifying the `raster` package be replaced with `terra`?

Contributor Author

You're right. I changed the text from raster to terra. Thanks!

Contributor

Thank you for your contribution on the transition to `terra`. Below are a few notes on text that should be updated so that learners can follow along with the new code:

- Line 84 references the `GDALinfo()` function, which has now been removed; the text should refer to `describe()` instead.
- Line 151 references `raster()` and should be updated to reflect the forcing of dataframe creation.
- Lines 319-332 reference `raster()` and `nlayers()`.

Contributor Author

Thank you for reporting the typos. I fixed them.
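
The replacements discussed in this thread can be sketched roughly as follows. This is a minimal illustration rather than code from the PR: the in-memory raster, the layer name `elev`, and the temporary file `sketch.tif` are assumptions standing in for the HARV GeoTIFF used in the lesson.

```r
library(terra)

# Build a small in-memory raster so the sketch does not depend on the lesson data
# (hypothetical stand-in for HARV_dsmCrop.tif).
r <- rast(nrows = 10, ncols = 10, xmin = 0, xmax = 10, ymin = 0, ymax = 10)
values(r) <- 1:ncell(r)
names(r) <- "elev"

# Write it to a temporary GeoTIFF so there is a file to inspect.
tif <- file.path(tempdir(), "sketch.tif")
writeRaster(r, tif, overwrite = TRUE)

# rgdal::GDALinfo(tif)  ->  terra::describe(tif)
info <- describe(tif)

# raster::raster(tif)   ->  terra::rast(tif)
r2 <- rast(tif)

# raster::nlayers(r2)   ->  terra::nlyr(r2)
n_bands <- nlyr(r2)

# ggplot2 needs a data frame; xy = TRUE keeps the cell coordinates as columns.
r2_df <- as.data.frame(r2, xy = TRUE)
```

Here `info` is a character vector of GDAL metadata lines, which is why the lesson can wrap `describe()` in `capture.output()` or simply assign its result.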


```{r load-libraries-2, eval=FALSE}
library(raster)
library(rgdal)
library(terra)
library(ggplot2)
library(dplyr)
```
@@ -86,14 +84,14 @@ data before we read that data into R. It is ideal to do this before importing
your data.

```{r view-attributes-gdal}
GDALinfo("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
```

If you wish to store this information in R, you can do the following:

```{r}
HARV_dsmCrop_info <- capture.output(
GDALinfo("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
)
```

@@ -104,7 +102,7 @@ episode. By the end of this episode, you will be able to explain and understand
## Open a Raster in R

Now that we've previewed the metadata for our GeoTIFF, let's import this
raster dataset into R and explore its metadata more closely. We can use the `raster()`
raster dataset into R and explore its metadata more closely. We can use the `rast()`
function to open a raster in R.

::::::::::::::::::::::::::::::::::::::::: callout
@@ -123,7 +121,7 @@ First we will load our raster file into R and view the data structure.

```{r}
DSM_HARV <-
raster("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")

DSM_HARV
```
@@ -138,16 +136,13 @@ summary(DSM_HARV)

but note the warning - unless you force R to calculate these statistics using
every cell in the raster, it will take a random sample of 100,000 cells and
calculate from that instead. To force calculation on more, or even all values,
you can use the parameter `maxsamp`:
calculate from that instead. To force calculation all the values, you can use
the function `values`:

```{r}
summary(DSM_HARV, maxsamp = ncell(DSM_HARV))
summary(values(DSM_HARV))
```
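
As a rough illustration of the change in this hunk (not part of the PR), the sketch below compares `summary()` on a raster with `summary()` on its full set of cell values. The random in-memory raster is an assumption made so the example runs without the lesson data; `global()` is shown as one other way to compute a statistic over all cells.

```r
library(terra)

# Hypothetical stand-in raster with random "elevation" values.
set.seed(1)
r <- rast(nrows = 50, ncols = 50)
values(r) <- runif(ncell(r), min = 300, max = 420)

# summary() on a SpatRaster may work from a sample of cells when the raster is large.
summary(r)

# values() returns every cell as a matrix (one column per layer),
# so this summary is computed from all cells.
all_vals <- values(r)
summary(all_vals)

# global() computes a single statistic over all cells of each layer.
mean_all <- global(r, "mean", na.rm = TRUE)
```

For a raster this small the two summaries should agree; the distinction matters only once the raster has more cells than the sample size.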

You may not see major differences in summary stats as `maxsamp` increases,
except with very large rasters.

To visualise this data in R using `ggplot2`, we need to convert it to a
dataframe. We learned about dataframes in [an earlier
lesson](https://datacarpentry.org/r-intro-geospatial/04-data-structures-part2/index.html).
@@ -164,8 +159,12 @@ dataframe format.
str(DSM_HARV_df)
```

We can use `ggplot()` to plot this data. We will set the color scale to `scale_fill_viridis_c`
which is a color-blindness friendly color scale. We will also use the `coord_quickmap()` function to use an approximate Mercator projection for our plots. This approximation is suitable for small areas that are not too close to the poles. Other coordinate systems are available in ggplot2 if needed, you can learn about them at their help page `?coord_map`.
We can use `ggplot()` to plot this data. We will set the color scale to
`scale_fill_viridis_c` which is a color-blindness friendly color scale. We will
also use the `coord_quickmap()` function to use an approximate Mercator
projection for our plots. This approximation is suitable for small areas that
are not too close to the poles. Other coordinate systems are available in
ggplot2 if needed, you can learn about them at their help page `?coord_map`.

```{r ggplot-raster, fig.cap="Raster plot with ggplot2 using the viridis color scale"}

@@ -189,7 +188,7 @@ More information about the Viridis palette used above at

## Plotting Tip

For faster, simpler plots, you can use the `plot` function from the `raster` package.
For faster, simpler plots, you can use the `plot` function from the `terra` package.

::::::::::::::: solution

@@ -222,7 +221,7 @@ We can view the CRS string associated with our R object using the`crs()`
function.

```{r view-resolution-units}
crs(DSM_HARV)
crs(DSM_HARV, proj = TRUE)
```
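
The `proj = TRUE` argument added here asks `crs()` for the shorter PROJ-string form instead of the default WKT. A minimal sketch, assuming a hypothetical raster in UTM zone 18N (EPSG:32618) with made-up extent values rather than the actual `DSM_HARV` object:

```r
library(terra)

# Hypothetical raster in UTM zone 18N; the extent values are illustrative only.
r <- rast(nrows = 10, ncols = 10,
          xmin = 731000, xmax = 732000,
          ymin = 4712000, ymax = 4713000,
          crs = "EPSG:32618")

# Default: the CRS as a (long) WKT string.
wkt <- crs(r)

# proj = TRUE: the PROJ-string form that the lesson text goes on to explain.
proj_string <- crs(r, proj = TRUE)
proj_string
```

The PROJ string is what the "UTM Proj4 String" section of the lesson breaks down element by element, which is presumably why the PR switches to this form.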

::::::::::::::::::::::::::::::::::::::: challenge
@@ -254,7 +253,8 @@ and datum (`datum=`).

### UTM Proj4 String

Our projection string for `DSM_HARV` specifies the UTM projection as follows:
A projection string (like the one of `DSM_HARV`) specifies the UTM projection
as follows:

`+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0`

@@ -269,7 +269,8 @@ Our projection string for `DSM_HARV` specifies the UTM projection as follows:
Note that the zone is unique to the UTM projection. Not all CRSs will have a
zone. Image source: Chrismurf at English Wikipedia, via [Wikimedia Commons](https://en.wikipedia.org/wiki/Universal_Transverse_Mercator_coordinate_system#/media/File:Utm-zones-USA.svg) (CC-BY).

![The UTM zones across the continental United States. From: https://upload.wikimedia.org/wikipedia/commons/8/8d/Utm-zones-USA.svg](fig/Utm-zones-USA.svg)

![The UTM zones across the continental United States. From: https://upload.wikimedia.org/wikipedia/commons/8/8d/Utm-zones-USA.svg](fig/Utm-zones-USA.svg){alt='UTM zones in the USA.'}

## Calculate Raster Min and Max Values

@@ -281,9 +282,11 @@ Raster statistics are often calculated and embedded in a GeoTIFF for us. We
can view these values:

```{r view-min-max}
minValue(DSM_HARV)
minmax(DSM_HARV)

maxValue(DSM_HARV)
min(values(DSM_HARV))

max(values(DSM_HARV))
```
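
To show how the replacements in this chunk relate to each other, here is a small sketch (not from the PR) on a hypothetical 4 x 4 in-memory raster with hand-picked values. It assumes the raster holds its values in memory, so that `minmax()` has a range available without further setup.

```r
library(terra)

# Hypothetical raster with known values, so the results are easy to check by eye.
r <- rast(nrows = 4, ncols = 4)
values(r) <- c(305, 310, 320, 330, 340, 350, 360, 370,
               380, 390, 400, 405, 410, 412, 415, 416)

# minmax() returns a matrix holding the minimum and maximum of each layer.
mm <- minmax(r)

# Computing from the cell values gives the same numbers for this raster.
lo <- min(values(r))
hi <- max(values(r))
```

If a raster contains `NA` cells, `min(values(r))` and `max(values(r))` return `NA` unless `na.rm = TRUE` is passed, which is worth keeping in mind when adapting this to real data.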

::::::::::::::::::::::::::::::::::::::::: callout
@@ -300,8 +303,8 @@ DSM_HARV <- setMinMax(DSM_HARV)

::::::::::::::::::::::::::::::::::::::::::::::::::

We can see that the elevation at our site ranges from `r raster::minValue(DSM_HARV)`m to
`r raster::maxValue(DSM_HARV)`m.
We can see that the elevation at our site ranges from `r min(terra::values(DSM_HARV))`m to
`r max(terra::values(DSM_HARV))`m.

## Raster Bands

@@ -316,7 +319,7 @@ function to import one single band from a single or multi-band raster. We can
view the number of bands in a raster using the `nlayers()` function.

```{r view-raster-bands}
nlayers(DSM_HARV)
nlyr(DSM_HARV)
```

However, raster data can also be multi-band, meaning that one raster file
@@ -343,7 +346,7 @@ did not collect data in these areas.
# no data demonstration code - not being taught
# Use stack function to read in all bands
RGB_stack <-
stack("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif")
rast("data/NEON-DS-Airborne-Remote-Sensing/HARV/RGB_Imagery/HARV_RGB_Ortho.tif")

# aggregate cells from 0.25m to 2m for plotting to speed up the lesson and
# save memory
@@ -389,19 +392,25 @@ ggplot() +

```

If your raster already has `NA` values set correctly but you aren't sure where they are, you can deliberately plot them in a particular colour. This can be useful when checking a dataset's coverage. For instance, sometimes data can be missing where a sensor could not 'see' its target data, and you may wish to locate that missing data and fill it in.
If your raster already has `NA` values set correctly but you aren't sure where
they are, you can deliberately plot them in a particular colour. This can be
useful when checking a dataset's coverage. For instance, sometimes data can be
missing where a sensor could not 'see' its target data, and you may wish to
locate that missing data and fill it in.

To highlight `NA` values in ggplot, alter the `scale_fill_*()` layer to contain a colour instruction for `NA` values, like `scale_fill_viridis_c(na.value = 'deeppink')`
To highlight `NA` values in ggplot, alter the `scale_fill_*()` layer to contain
a colour instruction for `NA` values, like `scale_fill_viridis_c(na.value = 'deeppink')`

```{r, echo=FALSE}
# demonstration code
# function to replace 0 with NA where all three values are 0 only
RGB_2m_nas <- calc(RGB_2m,
fun = function(x) {
x[rowSums(x == 0) == 3, ] <- rep(NA, nlayers(RGB_2m))
x
})
RGB_2m_nas <- as.data.frame(RGB_2m_nas, xy = TRUE)
RGB_2m_nas <- app(RGB_2m,
fun = function(x) {
if (sum(x == 0, na.rm = TRUE) == length(x))
return(rep(NA, times = length(x)))
x
})
RGB_2m_nas <- as.data.frame(RGB_2m_nas, xy = TRUE, na.rm = FALSE)

ggplot() +
geom_raster(data = RGB_2m_nas, aes(x = x, y = y, fill = HARV_RGB_Ortho_3)) +
@@ -436,14 +445,15 @@ of `NA` will be ignored by R as demonstrated above.

## Challenge

Use the output from the `GDALinfo()` function to find out what `NoDataValue` is used for our `DSM_HARV` dataset.
Use the output from the `describe()` and `sources()` functions to find out what
`NoDataValue` is used for our `DSM_HARV` dataset.

::::::::::::::: solution

## Answers

```{r}
GDALinfo("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_dsmCrop.tif")
describe(sources(DSM_HARV))
```

`NoDataValue` are encoded as -9999.
@@ -482,15 +492,20 @@ elevation values over 400m with a contrasting colour.

```{r demo-bad-data-highlighting, echo=FALSE, message=FALSE, warning=FALSE}
# reclassify raster to ok/not ok
DSM_highvals <- reclassify(DSM_HARV, rcl = c(0, 400, NA_integer_, 400, 420, 1L), include.lowest = TRUE)
DSM_highvals <- classify(DSM_HARV,
rcl = matrix(c(0, 400, NA_integer_, 400, 420, 1L),
ncol = 3, byrow = TRUE),
include.lowest = TRUE)
DSM_highvals <- as.data.frame(DSM_highvals, xy = TRUE)

DSM_highvals <- DSM_highvals[!is.na(DSM_highvals$HARV_dsmCrop), ]

ggplot() +
geom_raster(data = DSM_HARV_df, aes(x = x, y = y, fill = HARV_dsmCrop)) +
scale_fill_viridis_c() +
# use reclassified raster data as an annotation
annotate(geom = 'raster', x = DSM_highvals$x, y = DSM_highvals$y, fill = scales::colour_ramp('deeppink')(DSM_highvals$HARV_dsmCrop)) +
annotate(geom = 'raster', x = DSM_highvals$x, y = DSM_highvals$y,
fill = scales::colour_ramp('deeppink')(DSM_highvals$HARV_dsmCrop)) +
ggtitle("Elevation Data", subtitle = "Highlighting values > 400m") +
coord_quickmap()

@@ -532,7 +547,7 @@ no bad data values in this particular raster.

> ## Challenge: Explore Raster Metadata
>
> Use `GDALinfo()` to determine the following about the `NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_DSMhill.tif` file:
> Use `describe()` to determine the following about the `NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV_DSMhill.tif` file:
>
> 1. Does this file have the same CRS as `DSM_HARV`?
> 2. What is the `NoDataValue`?
@@ -547,7 +562,7 @@ no bad data values in this particular raster.
> > ```{r challenge-code-attributes}
> > ```

GDALinfo("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV\_DSMhill.tif")
describe("data/NEON-DS-Airborne-Remote-Sensing/HARV/DSM/HARV\_DSMhill.tif")

::::::::::::::::::::::::::::::::::::::: challenge
