minor updates
Sidhuharp97 committed Aug 9, 2024
1 parent bf3fdaa commit ca830d3
Showing 3 changed files with 28 additions and 27 deletions.
1 change: 1 addition & 0 deletions chapters/factorial-design.qmd
@@ -109,6 +109,7 @@ Instead of `summary()` function, we used `tidy()` function from 'broom.mixed' pa
check_model(model1)
```
The linearity and homogeneity of variance plots show no trend. The normal Q-Q plots for the overall residuals and for the random effects all fall nearly on a straight line, so the normality assumptions appear to be reasonably met.

### Inference
We can get an ANOVA table for the linear mixed model using the function `anova()`, which works for both `lmer()` and `lme()` models.
```{r}
41 changes: 24 additions & 17 deletions chapters/lattice-design.qmd
@@ -1,9 +1,32 @@
---
title: "lattice_design"
---
## Background
Yates (1936) proposed this method of arranging agricultural variety trials involving a large number of crop varieties. These types of arrangements were named quasi-factorial or lattice designs. His paper contained numerical examples based on the results of a uniformity trial on orange trees. A special feature of lattice designs is that the number of treatments, t, is related to the block size, k, in one of three forms: t = k^2, t = k^3, or t = k(k + 1).

Even though the number of possible treatments is limited, a lattice design may be an ideal design for field experiments with a large number of treatments.
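The restriction on treatment numbers can be checked with a few lines of base R. This is a minimal sketch, not part of the original analysis; the helper name `lattice_forms()` is hypothetical and simply tests a candidate number of treatments against the three forms listed above.

```r
# Hypothetical helper: which lattice forms (if any) does a treatment count t fit?
lattice_forms <- function(t) {
  k_sq   <- round(sqrt(t))     # candidate block size for t = k^2
  k_cu   <- round(t^(1 / 3))   # candidate block size for t = k^3
  k_rect <- floor(sqrt(t))     # candidate block size for t = k(k + 1)
  c(square      = k_sq^2 == t,
    cubic       = k_cu^3 == t,
    rectangular = k_rect * (k_rect + 1) == t)
}

lattice_forms(64)  # fits k^2 (k = 8) and k^3 (k = 4)
lattice_forms(12)  # fits k(k + 1) with k = 3
lattice_forms(10)  # fits none of the three forms
```

A treatment count that fits none of the forms would need to be padded with extra entries or handled with a different incomplete block design.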


https://kwstat.github.io/agridat/reference/cochran.lattice.html


```{r}
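# Two contiguous reps in 8 rows, 16 columns
library(agridat)        # provides the burgueno.rowcol data
library(desplot)
dat <- burgueno.rowcol  # defined here because this chunk precedes the import chunk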
desplot(dat, yield ~ col*row,
out1=rep, # aspect unknown
text=gen, shorten="none", cex=.75,
main="burgueno.rowcol")
```

Statistical model for lattice design:

$Y_{ijk} = \mu + \alpha_i + \gamma_j + \tau_t + \beta_k + \epsilon_{ijk}$

where $\mu$ is the experiment mean, $\beta_k$ is the row effect, $\gamma_j$ is the column effect, and $\tau_t$ is the treatment effect.


```{r}
library(dplyr)
library(nlme)
@@ -13,6 +36,7 @@ library(performance)
library(lme4); library(lmerTest); library(emmeans)
```


Import data from the agridat package and create columns for row and column as factor variables. This is a balanced experimental design.

```{r}
@@ -22,23 +46,6 @@ dat <- burgueno.rowcol
```

```{r}
# Two contiguous reps in 8 rows, 16 columns
libs(desplot)
desplot(dat, yield ~ col*row,
out1=rep, # aspect unknown
text=gen, shorten="none", cex=.75,
main="burgueno.rowcol")
```

Statistical model for lattice design:

$Y_{ijk} = \mu + \alpha_i + \gamma_j + \tau_t + \beta_k + \epsilon_{ijk}$

where $\mu$ is the experiment mean, $\beta_k$ is the row effect, $\gamma_j$ is the column effect, and $\tau_t$ is the treatment effect.

```{r}
#Random rep, row and col within rep
m1 <- lmer(yield ~ gen + (1|rep) + (1|rep:row) + (1|rep:col), data=dat)
13 changes: 3 additions & 10 deletions chapters/rcbd.qmd
@@ -1,11 +1,9 @@

# Randomized Complete Block Design

```{r, echo=FALSE}
par(mar=c(5.1, 6, 4.1, 2.1))
```


This is a simple model that can serve as a good entrance point to mixed models.

It is a very common design in which experimental treatments are applied at random to experimental units within each block. The blocks are intended to control for a nuisance source of variation, such as variation over time, spatial variation, changes in equipment or operators, or myriad other causes.
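To make the randomization concrete, here is a minimal base R sketch of assigning treatments at random within each block. The treatment labels, the number of blocks, and the seed are all hypothetical and are not taken from the variety trial analyzed in this chapter.

```r
set.seed(1)                             # arbitrary seed, only for reproducibility
treatments <- c("A", "B", "C", "D")     # hypothetical treatment labels
n_blocks   <- 3                         # hypothetical number of blocks

# Shuffle the full treatment list independently within every block
layout <- do.call(rbind, lapply(seq_len(n_blocks), function(b) {
  data.frame(block     = b,
             plot      = seq_along(treatments),
             treatment = sample(treatments))
}))

# Each treatment should appear exactly once in each block
table(layout$treatment, layout$block)
```

Because every block contains the complete set of treatments, block-to-block differences can be separated from treatment differences.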
@@ -61,7 +59,7 @@ var_trial <- read.csv(here::here("data", "aberdeen2015.csv"))
```

| | |
|-----------------|-------------------------------------------------------|
|------------------------|----------------------------------------------------------------------------------|
| block | blocking unit |
| range | column position for each plot |
| row | row position for each plot |
@@ -99,6 +97,7 @@ Next, check the independent variables. Running a cross tabulations is often suff
```{r}
table(var_trial$variety, var_trial$block)
```

There are 42 varieties and there appear to be no misspellings among them that might confuse R into thinking varieties are different when they are actually the same. R is sensitive to case and white space, which can make it easy to create near-duplicate treatments, such as "eltan", "Eltan", and "Eltan " (with a trailing space). There is no evidence of that in this data set. Additionally, it is perfectly balanced, with exactly one observation per treatment per rep. Please note that this does not tell us anything about the extent of missing data.
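If near-duplicate labels did turn up, one possible remedy is to standardize case and trim white space before modeling. The sketch below uses a made-up vector of labels (not the trial data) to show how base R's `trimws()` and `tolower()` collapse such variants.

```r
# Hypothetical labels containing case and white-space variants of one variety
variety_raw <- c("eltan", "Eltan", "Eltan ", "Stephens")
length(unique(variety_raw))    # R treats every variant as a distinct level

# Trim surrounding white space and standardize case
variety_clean <- tolower(trimws(variety_raw))
length(unique(variety_clean))  # the three 'eltan' variants collapse into one
```

Whether to lower-case or to map to a preferred spelling is a judgment call; the point is only that the cleaning happens before the variable is used as a factor.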

Here is a quick check I run to count the number of missing values in each column.
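As an illustration of the idea on a toy data frame (hypothetical values, not the trial data), one common base R idiom is `colSums(is.na(...))`. This is a sketch of one way to do it and is not necessarily the code used in this chapter.

```r
# Toy data frame with a few deliberately missing values
toy <- data.frame(
  block   = c(1, 1, 2, 2),
  variety = c("A", "B", "A", NA),
  yield   = c(50.2, NA, 47.9, NA)
)

# is.na() gives a logical matrix; colSums() counts the TRUEs per column
missing_per_column <- colSums(is.na(toy))
missing_per_column
```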
@@ -113,7 +112,6 @@ If there were independent variables with a continuous distribution (a covariate)

Last, check the dependent variable. A histogram is often quite sufficient to accomplish this. This is designed to be a quick check, so no need to spend time making the plot look good.


```{r, eval=FALSE}
hist(var_trial$yield_bu_a, main = "", xlab = "yield")
```
@@ -178,7 +176,6 @@ plot(model_rcbd, resid(., scaled=TRUE) ~ fitted(.),
xlab = "fitted values", ylab = "studentized residuals")
```


```{r, echo=FALSE}
#| label: fig-rcbd_error
#| fig-cap: "Plot of residuals versus fitted values"
@@ -193,7 +190,7 @@ plot(model_rcbd, resid(., scaled=TRUE) ~ fitted(.),
We are looking for a random and uniform distribution of points. This looks good!

::: column-margin
The same code works for nlme and lme4-generated models.
The same code works for nlme and lme4-generated models.
:::

Checking normality requires first extracting the model residuals with `resid()` and then generating a qq-plot and line.
@@ -217,14 +214,12 @@ This is reasonably good. Things do tend to fall apart at the tails.

Estimates for each treatment level can be obtained with the 'emmeans' package.


```{r}
# same syntax for lme4 & nlme
rcbd_emm <- emmeans(model_rcbd, ~ variety)
as.data.frame(rcbd_emm) %>% arrange(desc(emmean))
```


This table indicates the estimated marginal means ("emmean", sometimes called "least squares means"), the standard error ("SE") of those means, the degrees of freedom and the upper and lower bounds of the 95% confidence interval. As an additional step, the emmeans were sorted from largest to smallest.
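To show how the columns of that table relate to one another, the sketch below rebuilds a 95% confidence interval from an estimated mean, its standard error, and its degrees of freedom. The three input numbers are hypothetical, and the calculation assumes the default unadjusted t-based interval.

```r
# Hypothetical values standing in for one row of the emmeans table
emmean <- 100   # estimated marginal mean
SE     <- 5     # standard error of that mean
df     <- 30    # degrees of freedom

# Two-sided 95% interval: mean +/- t critical value * standard error
t_crit   <- qt(0.975, df)
lower.CL <- emmean - t_crit * SE
upper.CL <- emmean + t_crit * SE
c(lower.CL = lower.CL, upper.CL = upper.CL)
```

Fewer degrees of freedom give a larger critical value, so two means with the same standard error can still have intervals of different width.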

At this point, the analysis goals have been met: we know the estimated means for each treatment and their rankings.
@@ -236,7 +231,6 @@ If you want to run ANOVA, it can be done quite easily:
anova(model_rcbd)
```


::: callout-note
## `na.action = na.exclude`

@@ -251,5 +245,4 @@ model_rcbd <- lmer(yield_bu_a ~ variety + (1|block),
I use the argument `na.action = na.exclude` as an instruction for how to handle missing data: conduct the analysis, adjusting as needed for the missing data, and when predictions or residuals are output, pad them in the appropriate places for missing data so they can be easily merged into the main data set if need be.

Since there are no missing data, this step was not strictly necessary, but it's a good habit to be in.

:::
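The padding behavior described in the note above can be seen in a small example. This sketch swaps in base R's `lm()` on made-up data rather than the `lmer()` model from this chapter, purely so that it runs without extra packages; the `na.action` argument is used in the same way.

```r
# Toy data with one missing response (hypothetical values)
toy <- data.frame(x = 1:6,
                  y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1))

fit_omit    <- lm(y ~ x, data = toy, na.action = na.omit)
fit_exclude <- lm(y ~ x, data = toy, na.action = na.exclude)

length(resid(fit_omit))     # one residual per complete row only
length(resid(fit_exclude))  # one residual per original row, NA where y is missing

# Because the lengths match, the padded residuals merge straight back in
toy$resid <- resid(fit_exclude)
toy
```

With `na.omit`, the last assignment would fail because the residual vector is shorter than the data frame.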
