---
title: "Activity 4: Point Pattern Analysis I"
output: html_notebook
---
# Activity 4: Point Pattern Analysis I
Remember, you can download the source file for this activity from [here](https://github.com/paezha/Spatial-Statistics-Course).
## Practice questions
Answer the following questions:
1. What is a random process?
2. What is a deterministic process?
3. What is a stochastic process?
4. What is a pattern?
5. What is the usefulness of a null landscape?
## Learning objectives
In this activity, you will:
1. Use the concept of quadrats to analyze a real dataset.
2. Learn about a quadrat-based test for randomness in point patterns.
3. Learn how to use the p-value of a statistical test to make a decision.
4. Think about the distribution of events in a null landscape.
5. Think about ways to decide whether a landscape is random.
## Suggested reading
O'Sullivan D and Unwin D (2010) Geographic Information Analysis, 2nd Edition, Chapter 5. John Wiley & Sons: New Jersey.
## Preliminaries
For this activity you will need the following:
* An R markdown notebook version of this document (the source file).
* A package called `geog4ga3`.
It is good practice to clear the workspace to make sure that you do not have extraneous items there when you begin your work. The command in R to clear the workspace is `rm` (for "remove"), followed by a list of items to be removed. To clear the workspace of _all_ objects, do the following:
```{r}
rm(list = ls())
```
Note that `ls()` lists all objects currently in the workspace.
Load the libraries you will use in this activity. In addition to `tidyverse`, you will need `spatstat`, a package designed for the analysis of point patterns (you can learn about `spatstat` [here](https://cran.r-project.org/web/packages/spatstat/vignettes/getstart.pdf) and [here](http://spatstat.org/resources/spatstatJSSpaper.pdf)):
```{r message=FALSE}
library(tidyverse)
library(spatstat)
library(maptools) # Needed to convert a `SpatialPolygons` object into an `owin` object
library(sf)
library(geog4ga3)
```
In the practice that preceded this activity, you learned about the concepts of intensity and density, about quadrats, and also how to create density maps.
Begin by loading the data that you will use in this activity:
```{r}
data("Fast_Food")
data("Gas_Stands")
data("Paez_Mart")
```
Next, the geospatial files need to be read. For this example, the city boundary of Toronto is provided as an `sf` (simple features) object, a format widely used in R for spatial analysis:
```{r}
data("Toronto")
```
If you inspect your workspace, you will see that the following dataframes are there:
* `Fast_Food`
* `Gas_Stands`
* `Paez_Mart`
These are locations of a selection of fast food restaurants, and also of gas stands in Toronto (data are from 2008). Paez Mart on the other hand is a project to cover Toronto with convenience stores. The points are the planned locations of the stores.
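If you want a quick look at the contents of these objects, you can check their class and preview the first few rows (a minimal sketch; the exact columns displayed will depend on how the datasets are packaged):
```{r}
# Check the class of one of the event datasets and preview its first few rows
class(Fast_Food)
head(Fast_Food)
```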
Also, there should be an object of class `sf`. This dataframe contains the city boundary of Toronto:
```{r}
class(Toronto)
```
Try plotting the following:
```{r}
ggplot() +
  geom_sf(data = Toronto, color = "black", fill = NA, alpha = 1, size = .3) +
  geom_sf(data = Paez_Mart) +
  coord_sf()
```
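You can plot the other two sets of events in the same way. For example, the following sketch plots the fast food locations over the boundary of Toronto:
```{r}
# Plot the fast food locations over the Toronto boundary
ggplot() +
  geom_sf(data = Toronto, color = "black", fill = NA, alpha = 1, size = .3) +
  geom_sf(data = Fast_Food) +
  coord_sf()
```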
As discussed in the preceding chapter, the package `spatstat` offers a very rich collection of tools for point pattern analysis. To convert the three sets of events (i.e., the fast food establishments, gas stands, and Paez Mart) into `ppp` objects, we first must define a region or _window_. To do this, we take the `sf` object and convert it into an `owin` (a window object) for use with the package `spatstat` (this is done via `SpatialPolygons`, hence `as(x, "Spatial")`):
```{r}
# `as.owin()` will take a "foreign" object (foreign to `spatstat`) and convert
# it into an `owin` object. Here, there are two steps involved: first, we take
# the `sf` object with the boundaries of Toronto and convert it into a
# "Spatial" object, and then the "Spatial" object is passed on to `as.owin()`
Toronto.owin <- as(Toronto, "Spatial") %>% as.owin() # Requires `maptools` package
```
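As an aside, depending on the versions of `sf` and `spatstat` installed on your system, it may be possible to convert the boundary to an `owin` directly, without going through a `Spatial` object. This route is version-dependent, so the chunk below is only a sketch and is not evaluated:
```{r eval=FALSE}
# Version-dependent alternative: some versions of `sf` register methods that
# allow converting the geometry of an `sf` object directly into an `owin`
Toronto.owin <- as.owin(st_geometry(Toronto))
```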
And, then convert the dataframes to `ppp` objects (this necessitates that we extract the coordinates of the events by means of `st_coordinates`):
```{r}
Fast_Food.ppp <- as.ppp(st_coordinates(Fast_Food), W = Toronto.owin)
Gas_Stands.ppp <- as.ppp(st_coordinates(Gas_Stands), W = Toronto.owin)
Paez_Mart.ppp <- as.ppp(st_coordinates(Paez_Mart), W = Toronto.owin)
```
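Before moving on, it is a good idea to verify that the conversion worked as expected. For instance, a summary of one of the point patterns reports the number of points, the window, and the average intensity:
```{r}
# Summarize the fast food point pattern: number of points, window, intensity
summary(Fast_Food.ppp)
```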
These objects can now be used with the functions of the `spatstat` package. For instance, you can calculate the counts of events by quadrat by means of `quadratcount()`. The inputs are a `ppp` object and the number of quadrats in the horizontal (`nx`) and vertical (`ny`) directions (notice how I use the function `table()` to present the frequency of quadrats by number of events):
```{r}
q_count <- quadratcount(Fast_Food.ppp, nx = 3, ny = 3)
table(q_count)
```
As you see from the table, there is one quadrat with zero events, one quadrat with six events, one quadrat with forty-four events, and so on.
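Keep in mind that the counts depend on your choice of `nx` and `ny`, so it is worth experimenting with different numbers of quadrats. For example (an arbitrary, finer grid chosen purely for illustration):
```{r}
# Recount the events using a finer grid of quadrats
q_count_fine <- quadratcount(Fast_Food.ppp, nx = 6, ny = 6)
table(q_count_fine)
```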
You can also plot the results of the `quadratcount()` function!
```{r}
plot(q_count)
```
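It is often informative to see the events together with the quadrat counts. One way to do this (a sketch using the base `plot` methods provided by `spatstat`) is to plot the point pattern first and then add the counts on top:
```{r}
# Plot the point pattern and overlay the quadrat counts
plot(Fast_Food.ppp, main = "Fast food locations and quadrat counts")
plot(q_count, add = TRUE)
```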
A useful function in the `spatstat` package is `quadrat.test`. This function implements a statistical test that compares the empirical distribution of events by quadrats to the distribution of events as expected under the hypothesis that the underlying process is _random_.
This is implemented as follows:
```{r}
q_test <- quadrat.test(Fast_Food.ppp, nx = 3, ny = 3)
q_test
```
The quadrat test reports a $p$-value which can be used to make a decision. The $p$-value is the probability of observing a distribution of events at least as extreme as the one in the sample if the null hypothesis were true. To make a decision, you need to know what the null hypothesis is, and your own tolerance for mistakenly rejecting it. In the case above, the $p$-value is very, very small (2.2e-16 = 0.00000000000000022). Since the null hypothesis is spatial randomness, you can reject this hypothesis, and the risk of this decision being mistaken is vanishingly small.
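If you need the $p$-value for further use, it can be extracted from the test object, which behaves like a standard `htest` object (a small sketch comparing it to a conventional 5% significance level):
```{r}
# Extract the p-value and compare it to a 5% significance level
q_test$p.value
q_test$p.value < 0.05
```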
Try plotting the results of `quadrat.test`:
```{r}
plot(q_test)
```
Now that you have seen how to do some analysis using quadrats, you are ready for the next activity.
## Activity
**NOTE**: Activities include technical "how to" tasks/questions. Usually, these ask you to organize data, create a plot, and so on in support of analysis and interpretation. These tasks are indicated by a star (*).
1. (*) Use `Fast_Food`, `Gas_Stands`, `Paez_Mart`, and `Toronto` to create density maps for the three point patterns. Select a quadrat size that you think is appropriate.
2. (*) Use `Fast_Food.ppp`, `Gas_Stands.ppp`, and `Paez_Mart.ppp`, and the function `quadratcount()` to calculate the number of events per quadrat. Remember that you need to select the number of quadrats in the horizontal and vertical directions!
3. (*) Use the function `table()` to examine the frequency of events per quadrat for each of the point patterns.
4. Show your density maps to a fellow student. Did they select the same quadrat size? If not, what was the rationale for their choice?
5. Again, use the function `table()` to examine the frequency of events per quadrat for each of the point patterns. What are the differences among these point patterns? What would you expect the frequency of events per quadrat to be in a null landscape?
6. Use `Fast_Food.ppp`, `Gas_Stands.ppp`, and `Paez_Mart.ppp`, and the function `quadrat.test()` to calculate the test of spatial randomness for these point patterns. What is your decision in each case? Explain.