-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathMW_Predicted-Floweriness.Rmd
273 lines (212 loc) · 11.7 KB
/
MW_Predicted-Floweriness.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
---
title: "Calculate Flowering Indices from MeadoWatch Model Fits"
author: "Janneke Hille Ris Lambers, Aji John, Meera Sethi, Elli Theobald"
date: "13/01/2021"
output: html_document
---
#Goal / Objective
The goal of this R markdown is to estimate, per DOY, various metrics correlated to the flowering season. This file takes as input parameter fits from phenological curves fit to MeadoWatch data (see *Peak-Wildflower-Season.Rmd*)
#Setup for R script and R markdown
1. Load all libraries
2. Specify string behavior
3. Load packages
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
options(stringsAsFactors = FALSE)
library(boot)
```
#Read in necessary data: parameter fits and trail specific SDD data
Parameter fits are from maximum likelihood models fit to a subset of MeadoWatch data (see *MW_Modelfitting(byspecies).Rmd*). The snow disappearance data were measured using hobos, modeled or observed on the MeadoWatch trails (these observations are used to describe the snow dynamics of each year / trail). The following needs to be read in:
1. *output/Parameters_plotmodel.csv*: includes peak, range and max parameters (for a unimodal curve describing the relationship between DOY and flowering probability), in each year, trail, plot in which species were observed. In other words, this model was separately fit to year-trail-plot-species specific data.
2. A second set of models were fit to each species data. This model assumed the same functional form - with the relationship between DOY and flowering probability in each plot determined by 4 parameters (in *output/Parameters_speciesmodel.csv*. In this model, however, the following was assumed:
+ A. The peak parameter varied with plot specific snow duration. The slope and intercept parameters describing this relationship are the first two parameters.
+ B. The range / duration parameter describing phenological curves are assumed to be constant across species (fit across all years / plots)
3. *cleandata/MW_SDDall.csv* includes observed and predicted snow disappearance data collected from each trail / plot.
4. Also read in the necessary phenological functions (*HRL_phenology_functions.R*)
```{r}
# read in functions to calculate trail-wide flowering
source("./HRL_phenology_functions.R")
# Read in phenology data, station data
PlotPheno_pars <- read.csv("output/Parameters_plotmodel.csv", header=TRUE)
Species_pars <- read.csv("output/Parameters_speciesmodel.csv", header=TRUE)
SDDdat_0 <- read.csv("cleandata/MW_SDDall.csv", header=TRUE)
SiteInfo <- read.csv("data/MW_SiteInfo_2013_2020.csv", header=TRUE)
# Merge SDDdat_0 with SiteInfo
SDDdat_1 <- merge(SDDdat_0, SiteInfo, by = c("Site_Loc"))
# Create new SDD - observed when present, predicted when not
SDDfin <- SDDdat_1$SDD
# This substitutes in predicted SDD where sensors failed = NA
SDDfin[is.na(SDDdat_1$SDD)==TRUE] <-
round(SDDdat_0$predSDD[is.na(SDDdat_1$SDD)==TRUE],0)
# this removes unwanted columns, including SDD and pred SDD
SDDdat_2 <- subset(SDDdat_1, select=-c(Site_Num, Site_Loc, calibration,
snow_appearance_date,
snow_disappearance_date,
snow_cover_duration, minimum_soil_temp,
SDD, predSDD, Latitude,
Longitude, Elevation))
# This adds SDDfin back to data frame as SDD
SDDdat_2$SDD <- SDDfin
# Subset so Forest Types aren't included
SDDdat <- SDDdat_2[SDDdat_2$Type!="Forest",]
# Determine unique species, years included in parameters
species <- unique(Species_pars$species)
nspp <- length(species)
yrs <- unique(PlotPheno_pars$year); yrs <- yrs[order(yrs)]
nyrs <- length(yrs)
trails <- c("Glacier Basin", "Reflection Lakes")
# Reminder of species, # years included
print(paste("Data examines phenology of", nspp, "species", sep=" "))
print("Species include:"); print(species)
print(paste("Data includes observations from", nyrs, "years", sep=" "))
print("Years of data included:"); print(yrs)
#Examine data frames
head(PlotPheno_pars)
head(Species_pars)
head(SDDdat)
#Write
write.csv(SDDdat, "output/MeadowSDD.csv", quote=FALSE,
row.names=FALSE)
```
#Calculate trail wide probability of observing flowers and flowering richness
This chunk of code estimates the probability of observing flowering of focal species along each MeadoWatch trail (RL and GB) for each year, between March 15 and October 15 (7 months). It does so using fitted models relating peak flowering to snow disappearance per species. We then make some assumptions to calculate the probability of observing flowering and flowering richness in the following way:
1. For each trail and year, we calculate the earliest and latest snowmelt observed that year (assumptions, this a reasonable proxy for range of snowmelt observed along the entire trail).
2. For each trail, year and focal species on the trail, we calculate the following parameters:
+ A. Earliest peak flowering and latest peakflowering (*peakearly* and *peaklate*) on the trail, from the relationship between SDD and peak flowering (in *Species_pars*).
+ B. Also extract year-specific range and max parameters from the same file.
3. Year, species and trail wide probability of flowering by DOY is calculated as follows:
+ DOY < *peakearly*; predict using *peakearly* (on the trail in question), *range* and *max*
+ DOY >= *peakearly* and <= *peaklate*, probability of flowering is inv.logit(*max*). This assumes that somewhere on the trail the visitor is observing flowering at the maximum probability.
+ DOY > *peaklate*, predict using *peaklate* (on the trail in question), *range* and *max*
4. Species-specific flowering probabilites can be used to calculate the probability of observing all focal species flowering, and the probability of observing at least one focal species flowering.
5. Flowering richness is also calculated, as follows:
+ Per species, when flowering probability is > 0.5*max, flowering = yes; < = no
+ Sum all the yes's per DOY per trail to calculate richness
*Note that an issue with this is that we are assuming that yes / no observations of a specific species in a 2 x 1 meter plot correlate with abundance / number of flowers in meadows along the trail.*
```{r}
DOY_pred <- seq(105,285) #approximate April 15 - October 15
output_p <- c()
output_r <- c()
f_thresh <- 0.5 #between 0 and 1, modifies at what proportion of max flowering,
# for example, at 0.5, implies than when 50% max close to peak
for(i in 1:nyrs){ #each year
outputGB <- cbind(rep(yrs[i], times=length(DOY_pred)),
rep("GB", times=length(DOY_pred)), DOY_pred)
outputRL <- cbind(rep(yrs[i], times=length(DOY_pred)),
rep("RL", times=length(DOY_pred)), DOY_pred)
dimnames(outputGB) <- list(c(), c("year","trail","DOY"))
dimnames(outputRL) <- list(c(), c("year","trail","DOY"))
outputRL_yr_p <- outputRL; outputGB_yr_p <- outputGB
outputRL_yr_r <- outputRL; outputGB_yr_r <- outputGB
for(j in 1:2){ #each trail
#Extract max and min SDD
SDD_obs <- SDDdat$SDD[SDDdat$Year==yrs[i] &
SDDdat$Transect==trails[j]]
if(length(SDD_obs)==0){next} #break if no obs - GB 2013, 2014
SDD_max <- max(SDD_obs); SDD_min <- min(SDD_obs)
for(k in 1:length(species)){
#determine early, late peak
int_spp <- Species_pars$peakint[Species_pars$species==species[k]]
slp_spp <- Species_pars$peakSDDslope[Species_pars$species
==species[k]]
peakearly <- int_spp + slp_spp*SDD_min
peaklate <- int_spp + slp_spp*SDD_max
#determine max, range
max <- Species_pars$max[Species_pars$species==species[k]]
rng <- Species_pars$range[Species_pars$species==species[k]]
#calculate predflower
param <- c(peakearly, peaklate, rng, max)
ptrailf <- predflowertrail(DOY_pred,param)
ptrailr <- ptrailf
ptrailr[ptrailf>=f_thresh*max(ptrailf)] <- 1
ptrailr[ptrailf<f_thresh*max(ptrailf)] <- 0
#Now add tooutput files
if(j==1){
outputGB_yr_p <- cbind(outputGB_yr_p, ptrailf)
dimnames(outputGB_yr_p)[[2]][k+3] <- species[k]
outputGB_yr_r <- cbind(outputGB_yr_r, ptrailr)
dimnames(outputGB_yr_r)[[2]][k+3] <- species[k]
}
if(j==2){
outputRL_yr_p <- cbind(outputRL_yr_p, ptrailf)
dimnames(outputRL_yr_p)[[2]][k+3] <- species[k]
outputRL_yr_r <- cbind(outputRL_yr_r, ptrailr)
dimnames(outputRL_yr_r)[[2]][k+3] <- species[k]
}
}
if(j==1 & yrs[i]>2014){
output_p <- rbind(output_p,outputGB_yr_p)
output_r <- rbind(output_r,outputGB_yr_r)}
if(j==2){output_p <- rbind(output_p,outputRL_yr_p)
output_r <- rbind(output_r,outputRL_yr_r)}
}
}
#change output_p and output_r to data frames
output_p <- data.frame(output_p)
output_r <- data.frame(output_r)
for(i in 3:dim(output_p)[2]){
output_p[,i] <- as.numeric(output_p[,i])
output_r[,i] <- as.numeric(output_r[,i])
}
#Calculate probability flowering indices
#probability all focal species flower
allfl <- apply(output_p[,4:dim(output_p)[2]], 1, prod)
#probability any focal species flowers
pnofl <- 1-output_p[,4:dim(output_p)[2]]
anyfl <- 1-apply(pnofl, 1, prod)
#append
output_p_fin <- data.frame(cbind(output_p, anyfl,allfl))
#Calculate richness
rich <- apply(output_r[,4:dim(output_r)[2]], 1, sum)
flseason <- rep(0, times=length(rich))
flseason[rich[]>=0.5*length(species)] <- 1
output_r_fin <- data.frame(cbind(output_r, rich, flseason))
#Examine output
head(output_p_fin)
head(output_r_fin)
#write output
write.csv(output_p_fin, "output/FloweringProbabilities.csv", quote=FALSE,
row.names=FALSE)
write.csv(output_r_fin, "output/FloweringRichness.csv", quote=FALSE,
row.names=FALSE)
```
#Visualize flowering richness for all years, on both transects
```{r}
#Set up plot
par(mfrow=c(1,2), omi=c(0,0,0,0), mai=c(0.5,0.4,0.4,0.2),
tck=-0.01, mgp=c(1.25,0.25,0), xpd=TRUE)
yrcols <- c("chartreuse","cyan","darkorange","darkorchid1","darkgreen",
"deeppink3","cornflowerblue")
#first plot Reflection lakes
trldat_r <- output_r_fin[output_r_fin$trail=="RL",]
maxspp <- max(output_r_fin$rich)
for(i in 2013:max(yrs)){
yrdat_r <- trldat_r[trldat_r$year==i,]
#graph richness: create if 2013, add lines if later years
if(i==2013){
plot(yrdat_r$DOY,yrdat_r$rich, type="l", col=yrcols[i-2012],
lwd=2, xaxt="n",xlab="DOY", ylab="flowering richness",
main=paste("RL - Richness", sep=" "), ylim=c(0,maxspp));
text(c(105,135,165,195,225,255,285),-1*maxspp/12,
labels=c("Apr","May","Jun","Jul","Aug","Sep","Oct"))}
if(i>2013){
lines(yrdat_r$DOY,yrdat_r$rich, type="l", col=yrcols[i-2012],lwd=2)}
}
#Now Glacier Basin
trldat_r <- output_r_fin[output_r_fin$trail=="GB",]
maxspp <- max(trldat_r$rich)
for(i in 2015:max(yrs)){
yrdat_r <- trldat_r[trldat_r$year==i,]
#graph richness: create if 2013, add lines if later years
if(i==2015){
plot(yrdat_r$DOY,yrdat_r$rich, type="l", col=yrcols[i-2012],
lwd=2, xaxt="n",xlab="DOY", ylab="Flowering richness",
main=paste("GB - Richness", sep=" "), ylim=c(0,maxspp));
text(c(105,135,165,195,225,255,285),-1,
labels=c("Apr","May","Jun","Jul","Aug","Sep","Oct"))}
if(i>2015){
lines(yrdat_r$DOY,yrdat_r$rich, type="l", col=yrcols[i-2012],lwd=2)}
}
legend(x="topleft",legend=c(2013:2019), cex=0.75,
col=yrcols, lwd=2)
```