From 9d744756dd7ab97bf2b31d5c47bca1430a2154df Mon Sep 17 00:00:00 2001
From: Harpreet Kaur <hkaur@uidaho.edu>
Date: Wed, 8 Jan 2025 14:16:07 -0800
Subject: [PATCH] reviewed ch IBD and latin sq design

---
 chapters/incomplete-block-design.qmd | 30 ++++-----
 chapters/latin-design.qmd            | 95 ++++++++++++++++++----------
 chapters/repeated-measures.qmd       |  8 +--
 docs/search.json                     | 43 ++++++++-----
 4 files changed, 106 insertions(+), 70 deletions(-)
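
Note on the `(1|rep:block)` to `(1|rep/block)` change in `mod_alpha`: the following is only a sketch of what the new formula fits, written in the notation of the model added to the alpha-lattice section. The variance symbols $\sigma^2_r$, $\sigma^2_b$ and $\sigma^2$ are labels introduced here for illustration and do not appear in the chapter. lme4 expands `rep/block` to `rep + rep:block`, so the fitted model now carries a random replicate effect in addition to the incomplete block effect:

$$y_{ij(l)} = \mu + \alpha_i + \beta_{i(l)} + \tau_j + \epsilon_{ij(l)}, \quad \alpha_i \sim N(0, \sigma^2_r), \quad \beta_{i(l)} \sim N(0, \sigma^2_b), \quad \epsilon_{ij(l)} \sim N(0, \sigma^2)$$

With the previous `(1|rep:block)` term alone, the $\alpha_i$ term was missing from the fitted model, so the code did not match the stated model.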

diff --git a/chapters/incomplete-block-design.qmd b/chapters/incomplete-block-design.qmd
index b61f37e..f2771eb 100644
--- a/chapters/incomplete-block-design.qmd
+++ b/chapters/incomplete-block-design.qmd
@@ -18,7 +18,6 @@ Incomplete block designs are grouped into two groups: (1) balanced lattice desig
 
 In alpha-lattice design, the blocks are grouped into complete replicates. These designs are also termed as "resolvable incomplete block designs" or "partially balanced incomplete block designs" [@paterson]. This design has been more commonly used instead of balanced IBD because of it's practicability, flexibility, and versatility. 
 
-To avoid having a disconnected design, a balanced incomplete block design can be used.
 
 ### Statistical Model
 
@@ -93,7 +92,6 @@ desplot::desplot(dat,
          text=gen, cex=1, out1=block,
         out2=gen, out2.gpar=list(col = "black", lwd = 1, lty = 1),
          main="Incomplete block design")
-
 # desplot::desplot(dat, yield~col*row,
 #           text=gen, shorten='none', cex=.6, out1=block,
 #           aspect=252/96, # true aspect
@@ -232,7 +230,20 @@ emmeans(model_icbd1, ~ gen)
 
 ### Partially Balanced IBD (Alpha Lattice Design)
 
-The data used in this example is published in *Cyclic and Computer Generated Designs* [@john_cyclic]. The data in this trial was laid out in an alpha lattice design. This trial data had 24 genotypes ("gen"), 6 incomplete blocks, each replicated 3 times. 
+The statistical model for a partially balanced incomplete block design is:
+
+$$y_{ij(l)} = \mu + \alpha_i + \beta_{i(l)} + \tau_j + \epsilon_{ij(l)}$$ 
+
+Where:
+
+$\mu$ = overall experimental mean   
+$\alpha$ = replicate effect (random)  
+$\beta$ = incomplete block effect (random)  
+$\tau$ = treatment effect (fixed)  
+$\epsilon_{ij(l)}$ = intra-block residual  
+
+
+The data used in this example is published in *Cyclic and Computer Generated Designs* [@john_cyclic]. The trial was laid out in an alpha lattice design with 24 genotypes ("gen") in 3 replicates, each containing 6 incomplete blocks.
 
 Let's start analyzing this example first by loading the required libraries for linear mixed models:
 
@@ -324,7 +335,7 @@ The response variables seems to follow a normal distribution curve, with fewer v
 ### lme4
 
 ```{r}
-mod_alpha <- lmer(yield ~ gen + (1|rep:block),
+mod_alpha <- lmer(yield ~ gen + (1|rep/block),
                    data = data1, 
                    na.action = na.exclude)
 tidy(mod_alpha)
@@ -338,15 +349,6 @@ mod_alpha1 <- lme(yield ~ gen,
                   data = data1, 
                   na.action = na.exclude)
 tidy(mod_alpha1)
-
-## need to try pdIdent here
-# model_lme <-lme(yield ~  gen,
-#               random = list(one = pdBlocked(list(
-#          pdIdent(~ 0 + rep:block)))),
-#         data = data1 %>% mutate(one = factor(1)))
-# 
-# summary(model_lme)
-
 ```
 :::
 
@@ -366,7 +368,6 @@ check_model(mod_alpha1, check = c('normality', 'linearity'))
 ```
 :::
 
-
 #### Inference
 
 Let's ANOVA table using `anova()` from lmer and lme models, respectively.
@@ -380,7 +381,6 @@ anova(mod_alpha, type = "1")
 #### nlme
 ```{r}
 anova(mod_alpha1, type = "sequential")
-#anova(model_lme, type = "sequential")
 ```
 :::
 
diff --git a/chapters/latin-design.qmd b/chapters/latin-design.qmd
index a02db45..a63f90f 100644
--- a/chapters/latin-design.qmd
+++ b/chapters/latin-design.qmd
@@ -8,24 +8,27 @@ par(mar=c(5.1, 6, 4.1, 2.1))
 
 ## Background
 
-Latin square design In the Latin Square design, two blocking factors are arranged across the row and the column of the square. This allows blocking of two nuisance factors across rows and columns to reduce even more experimental error. The requirement of Latin square design is that all t treatments appears only once in each row and column and number of replications is equal to number of treatments.
+In the Latin square design, two blocking factors are arranged across the rows and the columns of the square. This allows blocking of two nuisance factors, which reduces experimental error further than a single blocking factor would. The requirement of the Latin square design is that each of the t treatments appears exactly once in each row and each column, so the number of replications is equal to the number of treatments.
 
 Advantages of Latin square design are:
+
 1.  The design is particularly appropriate for comparing t treatment means in the presence of two sources of extraneous variation, each measured at t levels.
+
 2.  The analysis is quite simple.
 
-Disadvantage: 
-1. A Latin square can be constructed for any value of t, however, it is best suited for comparing t treatments when 5≤t≤10.
+Disadvantages:
+
+1.  A Latin square can be constructed for any value of t; however, it is best suited for comparing t treatments when 5 ≤ t ≤ 10.
 
 2.  Any additional extraneous sources of variability tend to inflate the error term, making it more difficult to detect differences among the treatment means.
 
-3.  The effect of each treatment on the response must be approximately the same across rows and columns.
+3.  The effect of each treatment on the response must be approximately the same across rows and columns.
 
 Statistical model for a response in Latin square design is:
 
 $Y_{ijk} = \mu + \alpha_i + \beta_j +  \gamma_k + \epsilon_{ijk}$
 
-where, $\mu$ is the experiment mean, $\alpha_i's$ are treatment effects, $\beta$ and $\gamma$ are the row- and column specific effects.
+where $\mu$ is the experiment mean, $\alpha_i$ represents the treatment effects, and $\beta_j$ and $\gamma_k$ are the row- and column-specific effects.
 
 Assumptions of this design includes normality and independent distribution of error ($\epsilon_{ijk}$) terms. And there is no interaction between two blocking (rows & columns) factors and treatments.
 
@@ -40,6 +43,7 @@ Let's start the analysis firstly by loading the required libraries:
 library(lme4); library(lmerTest); library(emmeans); library(performance)
 library(dplyr); library(broom.mixed); library(agridat); library(desplot)
 ```
+
 ### nlme
 
 ```{r, message=FALSE, warning=FALSE}
@@ -53,6 +57,7 @@ The data used in this example is extracted from the `agridat` package. In this e
 ```{r}
 dat <- agridat::goulden.latin
 ```
+
 |       |                               |
 |-------|-------------------------------|
 | trt   | treatment factor, 5 levels    |
@@ -63,10 +68,13 @@ dat <- agridat::goulden.latin
 : Table of variables in the data set {tbl-latin}
 
 ### Data integrity checks
+
 Firstly, let's verify the class of variables in the dataset using `str()` function in base R
+
 ```{r}
 str(dat)
 ```
+
 Here yield and trt are classified as numeric and factor variables, respectively, as needed. But we need to change 'row' and 'col' from integer t factor/character.
 
 ```{r}
@@ -75,28 +83,38 @@ dat1 <- dat |>
                col = as.factor(col))
 ```
 
-Next, to verify if the data meets the assumption of the Latin square design let's plot the field layout for this experiment. 
-```{r}
-desplot::desplot(data = dat, flip = TRUE,
-        form = yield ~ row + col, 
-        out1 = row, out1.gpar=list(col="black", lwd=3),
-        out2 = col, out2.gpar=list(col="black", lwd=3),
-        text = trt, cex = 1, shorten = "no",
-        main = "Field layout", 
-        show.key = FALSE)
+Next, to verify that the data meet the assumptions of the Latin square design, let's plot the field layout for this experiment.
 
-```
+```{r, echo=FALSE, warning=FALSE}
 
-This looks great! Here we can see that there are equal number of treatments, rows, and columns. Treatments were randomized in such a way that one treatment doesn't appear more than once in each row and column. 
+desplot::desplot(data = dat1, flip = TRUE,
+        form = trt ~ col + row,         
+        text = trt, cex = 0.7, shorten = "no", 
+        out1 = trt,                          
+       # out2 = block,  
+        main = "Latin Square Design", show.key = FALSE) 
+# desplot::desplot(data = dat, flip = TRUE,
+#         form = yield ~ row + col, 
+#         out1 = row, out1.gpar=list(col="black", lwd=3),
+#         out2 = col, out2.gpar=list(col="black", lwd=3),
+#         text = trt, cex = 1, shorten = "no",
+#         main = "Field layout", 
+#         show.key = FALSE)
 
+```
+
+This looks great! Here we can see that there is an equal number (5) of treatments, rows, and columns. Treatments were randomized in such a way that no treatment appears more than once in any row or column.
 
 Next step is to check if there are any missing values in response variable.
+
 ```{r}
 apply(dat, 2, function(x) sum(is.na(x)))
 ```
-And we do not have any missing values in the data.
+
+No missing values were detected in this data set.
 
 Before fitting the model, let's create a histogram of response variable to see if there are extreme values.
+
 ```{r, echo=FALSE}
 #| label: lattice_design
 #| fig-cap: "Histogram of the dependent variable."
@@ -110,8 +128,11 @@ hist(dat$yield, main = "", xlab = "yield")
 ```
 
 ### Model fitting
+
 Here we will fit a model to evaluate the impact of fungicide treatments on wheat yield with trt as a fixed effect and row & col as a random effect.
 
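+After fitting, the estimated variance components can be inspected with `VarCorr()`, for example `VarCorr(m1_b)` for the nlme model.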
+
 ::: panel-tabset
 ### lme4
 
@@ -119,81 +140,87 @@ Here we will fit a model to evaluate the impact of fungicide treatments on wheat
 m1_a <- lmer(yield ~ trt + (1|row) + (1|col),
            data = dat1,
            na.action = na.exclude)
-tidy(m1_a) 
+summary(m1_a) 
 ```
 
 ### nlme
+
 ```{r}
-dat$dummy <- factor(1)
 m1_b <- lme(yield ~ trt,
           random =list(~1|row, ~1|col),
-                  #list(dummy = pdBlocked(list(
-                   #               pdIdent(~row - 1),
-                    #              pdIdent(~col - 1)))),
           data = dat, 
           na.action = na.exclude)
 
 summary(m1_b)
-#VarCorr(m1_b)
 ```
 :::
 
 ### Check Model Assumptions
 
-::: panel-tabset
+This step involves inspecting the model residuals using the `check_model()` function from the "performance" package.
+
+:::: panel-tabset
 #### lme4
+
 ```{r, fig.height=3}
 check_model(m1_a, check = c("linearity", "normality"))
 ```
 
 #### nlme
 
-::: {layout-ncol=2 .column-body}
-
+::: {.column-body layout-ncol="2"}
 ```{r echo=FALSE, eval=FALSE}
 par(mar=c(5.1, 5, 2.1, 2.1))
 plot(residuals(m1_b), xlab = "fitted values", ylab = "residuals",
      cex.lab = 1.8, cex.axis = 1.5); abline(0,0)
 ```
 
-
 ```{r echo=FALSE, eval=FALSE}
 par(mar=c(5.1, 5, 2.1, 2.1))
 qqvals <- qqnorm(residuals(m1_b), plot.it=FALSE)
 qqplot(qqvals$x, qqvals$y, xlab = "Theoretical Quantiles", ylab = "Sample Quantiles", cex.lab = 1.7, cex.axis = 1.5); qqline(residuals(m1_b))
 ```
-::: 
+:::
 
 ```{r, fig.height=3}
 check_model(m1_b, check = c("linearity", "normality"))
 ```
-:::
+::::
+
+These plots suggest that the assumptions of the linear model have been met.
 
 ### Inference
-We can look look at the analysis of variance for treatment effect on yield using `anova()` function.
+
+We can now proceed to variance partitioning. In this case, we will use `anova()` with `type = "1"` or `type = "sequential"` for the `lmer()` and `lme()` models, respectively.
 
 ::: panel-tabset
 #### lme4
-```{r, fig.height=3}
+
+```{r}
 anova(m1_a, type = "1")
 ```
 
 #### nlme
-```{r, fig.height=3}
+
+```{r}
 anova(m1_b, type = "sequential")
 ```
 :::
 
-Here we observed a significant impact on fungicide treatment on crop yield. Let's have a look at the estimated marginal means of wheat yield with each treatment using `emmeans()` function.
+Both models detected a significant effect of fungicide treatment on wheat yield. Let's have a look at the estimated marginal means of wheat yield for each treatment using the `emmeans()` function.
 
 ::: panel-tabset
 #### lme4
+
 ```{r, fig.height=3}
 emmeans(m1_a, ~ trt)
 ```
 
 #### nlme
+
 ```{r, fig.height=3}
 emmeans(m1_b, ~ trt)
 ```
-:::
\ No newline at end of file
+:::
+
+We see that wheat yield was higher with the 'C' fungicide treatment than with the other fungicides applied in this study, which suggests that the 'C' fungicide was more effective in controlling stem rust in wheat.
diff --git a/chapters/repeated-measures.qmd b/chapters/repeated-measures.qmd
index 57136b5..30fa9f7 100644
--- a/chapters/repeated-measures.qmd
+++ b/chapters/repeated-measures.qmd
@@ -4,11 +4,11 @@
 source(here::here("settings.r"))
 ```
 
-In the previous chapters we covered how to run linear mixed models for different experiment designs. All of the examples in those chapters were independent measure designs, where each subject was assigned to a different treatment. Now we will move on to experiment with repeated measures random effects.
+In the previous chapters we covered how to run linear mixed models for different experiment designs. All of the examples in those chapters were independent measure designs, where each subject was assigned to a different treatment. Now we will move on to experiments with repeated measures.
 
-Studies that involve repeated observations of the exact same experimental units require a repeated measures component to properly model correlations across time with the experiment unit. This is common in any studies that are evaluated across different time periods. For example, if samples are collected over the different time periods from same subject, we have to repeated measures effect while analyzing the main effects.
+Studies that involve repeated observations of the exact same experimental units (or subjects) require a repeated measures component in the analysis to properly model correlations across time within each subject. This is common in studies that are evaluated across different time periods. For example, if samples are collected from the same subject over different time periods, we have to model the repeated measures effect while analyzing the main effects.
 
-In these models, the 'iid' assumption (idependently and identically distributed) is being violated, so we need to introduce specialized covariance structures that can account for these correlations between error terms.
+In these models, the 'iid' assumption (independently and identically distributed) is often violated, so we need to introduce specialized covariance structures that can account for these correlations between error terms.
 
 There are several types of covariance structures:
 
@@ -97,7 +97,6 @@ ggplot(data = dat, aes(y = y, x = factweek, fill = variety)) +
 ```
 
 Looks like variety '1' has the lowest yield and showed drastic reduction in yield over weeks compared to other varieties.
-
 One last step before we fit model is to look at the distribution of response variable.
 
 ```{r, eval=FALSE}
@@ -224,7 +223,6 @@ Firstly, we need to look at the class of variables in the data set.
 ```{r}
 str(Yield)
 ```
-
 We will now convert the fertilizer and Rep into factor. In addition, we need to create a new factor variable (sample_time1) to analyze the time effect.
 
 ::: column-margin
diff --git a/docs/search.json b/docs/search.json
index 339de94..91707a1 100644
--- a/docs/search.json
+++ b/docs/search.json
@@ -192,7 +192,7 @@
     "href": "chapters/incomplete-block-design.html",
     "title": "9  Incomplete Block Design",
     "section": "",
-    "text": "9.1 Background\nThe block design in Chapter 4 was complete, meaning that every block contained all the treatments. In practice, it may not be possible to have too many treatments in each block. Sometimes, there are also situations where it is advised to not have many treatments in each block.\nIn such cases, incomplete block designs are used where we have to decide what subset of treatments to be used in an individual block. This will work well if we enough blocks. However, if we only have small number of blocks, there would be the risk that certain quantities are not estimable anymore.\nIncomplete block designs are grouped into balanced lattice design and partially balanced (or alpha-lattice) designs.\nTo avoid having a disconnected design, a balanced incomplete block design can be used\nThe statistical model for balanced incomplete block design is:\n\\[y_{ij} = \\mu + \\alpha_i + \\beta_j + \\epsilon_{ij}\\] Where:\n\\(\\mu\\) = overall experimental mean \\(\\alpha\\) = treatment effects (fixed) \\(\\beta\\) = block effects (random) \\(\\epsilon\\) = error terms\n\\[ \\epsilon \\sim N(0, \\sigma)\\]\n\\[ \\beta \\sim N(0, \\sigma_b)\\] There are few key points that we need to keep in mind while designing incomplete block designs:\nAn excellent description of incomplete block design is provided in ANOVA and Mixed Models by Lukas Meier.\nThe balanced incomplete block designs are guided by strict principles and guidelines including: the number of treatments must be a perfect square (e.g. 25, 36, and so on); number of replicates must be equal to no. of blocks +1;",
+    "text": "9.1 Background\nThe block design described in Chapter 4 was complete, meaning that each block contained each treatment level at least once. In practice, it may not be possible or advisable to include all treatments in each block, either due to limitations in treatment availability (e.g. limited seed stocks) or the block size becomes too large to serve its original goals of controlling for spatial variation.\nIn such cases, incomplete block designs (IBD) can be used. Incomplete block designs break the experiment into many smaller incomplete blocks that are nested within standard RCBD-style blocks and assigns a subset of the treatment levels to each incomplete block. There are several different approaches Patterson and Williams (1976) for how to assign treatment levels to incomplete blocks and these designs impact the final statistical analysis (and if all treatments included in the experimental design are estimable). An excellent description of incomplete block design is provided in ANOVA and Mixed Models by Lukas Meier.\nIncomplete block designs are grouped into two groups: (1) balanced lattice designs; and (2) partially balanced (also commonly called alpha-lattice) designs. Balanced IBD designs have been previously called “lattice designs” [need refs], but we are not using that term to avoid confusion with alpha-lattice designs, a term that is commonly used.\nIn alpha-lattice design, the blocks are grouped into complete replicates. These designs are also termed as “resolvable incomplete block designs” or “partially balanced incomplete block designs” (paterson?). This design has been more commonly used instead of balanced IBD because of it’s practicability, flexibility, and versatility.",
     "crumbs": [
       "Experiment designs",
       "<span class='chapter-number'>9</span>  <span class='chapter-title'>Incomplete Block Design</span>"
@@ -203,7 +203,7 @@
     "href": "chapters/incomplete-block-design.html#background",
     "title": "9  Incomplete Block Design",
     "section": "",
-    "text": "A drawback of this design is that block effect and treatment effects are confounded.\nTo eliminate the block effects, better compare treatments within a block.\nNo treatment should appear twice in any block as they contributes nothing to within block comparisons.\n\n\n\n\n\n\n\n\n\nA note\n\n\n\nBecause the blocks are incomplete, the Type I and Type III sums of squares will be different. That is, the missing treatments in each block represent missing observations (but not missing ‘at random’).",
+    "text": "9.1.1 Statistical Model\nThe statistical model for a balanced incomplete block design is:\n\\[y_{ij} = \\mu + \\alpha_i + \\beta_j + \\epsilon_{ij}\\]\nWhere:\n\\(\\mu\\) = overall experimental mean\n\\(\\alpha\\) = treatment effects (fixed)\n\\(\\beta\\) = block effects (random)\n\\(\\epsilon\\) = error terms\n\\[ \\epsilon \\sim N(0, \\sigma)\\]\n\\[ \\beta \\sim N(0, \\sigma_b)\\]\nThere are few key points that we need to keep in mind while designing incomplete block experiments:\n\nA drawback of this design is that block effect and treatment effects are confounded.\nTo remove the block effects, it is better compare treatments within a block.\nNo treatment should appear twice in any block as it contributes nothing to within block comparisons.\n\nThe balanced incomplete block designs are guided by strict principles and guidelines including: the number of treatments must be a perfect square (e.g. 25, 36, and so on), and number of replicates must be equal to number of blocks +1.\n\n\n\n\n\n\nNote on Sums of Squares\n\n\n\nBecause the blocks are incomplete, the Type I and Type III sums of squares will be different even when there is no missing data from a trail. That is because the missing treatments in each block represent missing observations (even though they are not missing ‘at random’).",
     "crumbs": [
       "Experiment designs",
       "<span class='chapter-number'>9</span>  <span class='chapter-title'>Incomplete Block Design</span>"
@@ -501,7 +501,7 @@
     "href": "chapters/repeated-measures.html#rcbd-repeated-measures",
     "title": "11  Repeated measures mixed models",
     "section": "12.1 RCBD Repeated Measures",
-    "text": "12.1 RCBD Repeated Measures\nThe example shown below contains data from a sorghum trial laid out as a randomized complete block design (5 blocks) with variety (4 varieties) treatment effect. The response variable ‘y’ is the leaf area index assessed in five consecutive weeks on each plot.\nWe need to have time as numeric and factor variable. In the model, to assess the week effect, week was used as a factor (factweek). For the correlation matrix, week needs to be numeric (week).\n\ndat &lt;- agriTutorial::sorghum %&gt;%   \n  mutate(week = as.numeric(factweek),\n         block = as.character(varblock)) \n\n\nTable of variables in the data set\n\n\nblock\nblocking unit\n\n\nReplicate\nreplication unit\n\n\nWeek\nTime points when data was collected\n\n\nvariety\ntreatment factor, 4 levels\n\n\ny\nyield (lbs)\n\n\n\n\n12.1.1 Data Integrity Checks\nLet’s do preliminary data check including evaluating data structure, distribution of treatments, number of missing values, and distribution of response variable.\n\nstr(dat)\n\n'data.frame':   100 obs. 
of  9 variables:\n $ y        : num  5 4.84 4.02 3.75 3.13 4.42 4.3 3.67 3.23 2.83 ...\n $ variety  : Factor w/ 4 levels \"1\",\"2\",\"3\",\"4\": 1 1 1 1 1 1 1 1 1 1 ...\n $ Replicate: Factor w/ 5 levels \"1\",\"2\",\"3\",\"4\",..: 1 1 1 1 1 2 2 2 2 2 ...\n $ factweek : Factor w/ 5 levels \"1\",\"2\",\"3\",\"4\",..: 1 2 3 4 5 1 2 3 4 5 ...\n $ factplot : Factor w/ 20 levels \"1\",\"2\",\"3\",\"4\",..: 1 1 1 1 1 2 2 2 2 2 ...\n $ varweek  : int  1 2 3 4 5 1 2 3 4 5 ...\n $ varblock : int  1 1 1 1 1 2 2 2 2 2 ...\n $ week     : num  1 2 3 4 5 1 2 3 4 5 ...\n $ block    : chr  \"1\" \"1\" \"1\" \"1\" ...\n\n\nIn this data, we have block, factplot, factweek as factor variables and y & week as numeric.\n\ntable(dat$variety, dat$block)\n\n   \n    1 2 3 4 5\n  1 5 5 5 5 5\n  2 5 5 5 5 5\n  3 5 5 5 5 5\n  4 5 5 5 5 5\n\n\nThe cross tabulation shows a equal number of varieties in each block.\n\nggplot(data = dat, aes(y = y, x = factweek, fill = variety)) +\n  geom_boxplot() +  \n  #scale_fill_brewer(palette=\"Dark2\") +\n  scale_fill_viridis_d(option = \"F\") +\n    theme_bw()\n\n\n\n\n\n\n\n\nLooks like variety ‘1’ has the lowest yield and showed drastic reduction in yield over weeks compared to other varieties.\nOne last step before we fit model is to look at the distribution of response variable.\n\nhist(dat$y, main = \"\", xlab = \"yield\")\n\n\n\n\n\n\n\n\n\n\nFigure 12.1: Histogram of the dependent variable.\n\n\n\n\n\n\n12.1.2 Model Building\nLet’s fit the basic model first using lme() from the nlme package.\n\nlm1 &lt;- lme(y ~ variety + factweek + variety:factweek, random = ~1|block/factplot,\n              data = dat,\n              na.action = na.exclude)\n\nThe model fitted above doesn’t account for the repeated measures effect. 
To account for the variation caused by repeated measurements, we can model the correlation among responses for a given subject which is plot (factor variable) in this case.\nBy adding this correlation structure, what we are implying is to keep each plot independent, but to allowing AR1 or compound symmetry correlations between responses for a given subject, here time variable is week and it must be numeric.\n\ncs1 &lt;- corAR1(form = ~ week|block/factplot,  value = 0.2, fixed = FALSE)\ncs2 &lt;- corCompSymm(form = ~ week|block/factplot,  value = 0.2, fixed = FALSE)\n\nIn the code chunk above, we fitted two correlation structures including AR1 and compound symmetry matrices. Next we will update the model lm1, with these two matrices. In nlme, please search the help tool to know more about functions for different correlation structure classes.\n\nlm2 &lt;- update(lm1, corr = cs1)\nlm3 &lt;- update(lm1, corr= cs2)\n\nNow let’s compare how model fitness differs among models with no correlation structure (lm1), with AR1 correlation structure (lm2), and with compound symmetry structure (lm3). 
We will compare these models by using anova() or by compare_performance() function from the ‘performance’ library.\n\nanovaperformance\n\n\n\nanova(lm1, lm2, lm3)\n\n    Model df       AIC      BIC   logLik   Test  L.Ratio p-value\nlm1     1 23 18.837478 73.62409 13.58126                        \nlm2     2 24 -2.347391 54.82125 25.17370 1 vs 2 23.18487  &lt;.0001\nlm3     3 24 20.837478 78.00612 13.58126                        \n\n\n\n\n\nresult &lt;- compare_performance(lm1, lm2, lm3)\n\nSome of the nested models seem to be identical and probably only vary in\n  their random effects.\n\nprint_md(result)\n\n\nComparison of Model Performance Indices\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nModel\nAIC (weights)\nAICc (weights)\nBIC (weights)\nR2 (cond.)\nR2 (marg.)\nICC\nRMSE\nSigma\n\n\n\n\nlm1\nlme\n-50.5 (&lt;.001)\n-36.0 (&lt;.001)\n9.4 (&lt;.001)\n0.99\n0.37\n0.98\n0.10\n0.13\n\n\nlm2\nlme\n-77.5 (&gt;.999)\n-61.5 (&gt;.999)\n-15.0 (&gt;.999)\n0.97\n0.41\n0.95\n0.15\n0.18\n\n\nlm3\nlme\n-48.5 (&lt;.001)\n-32.5 (&lt;.001)\n14.0 (&lt;.001)\n0.98\n0.37\n0.98\n0.11\n0.14\n\n\n\n\n\n\n\n\nWe prefer to chose model with lower AIC and BIC values. 
In this scenario, we will move forward with lm2 model containing AR1 structure.\nLet’s run a tidy() on lm2 model to look at the estimates for random and fixed effects.\n\ntidy(lm2)\n\nWarning in tidy.lme(lm2): ran_pars not yet implemented for multiple levels of\nnesting\n\n\n# A tibble: 20 × 7\n   effect term               estimate std.error    df statistic  p.value\n   &lt;chr&gt;  &lt;chr&gt;                 &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;\n 1 fixed  (Intercept)          4.24      0.291     64    14.6   5.44e-22\n 2 fixed  variety2             0.906     0.114     12     7.94  4.05e- 6\n 3 fixed  variety3             0.646     0.114     12     5.66  1.05e- 4\n 4 fixed  variety4             0.912     0.114     12     8.00  3.78e- 6\n 5 fixed  factweek2           -0.196     0.0571    64    -3.44  1.04e- 3\n 6 fixed  factweek3           -0.836     0.0755    64   -11.1   1.60e-16\n 7 fixed  factweek4           -1.16      0.0867    64   -13.3   4.00e-20\n 8 fixed  factweek5           -1.54      0.0943    64   -16.3   1.57e-24\n 9 fixed  variety2:factweek2   0.0280    0.0807    64     0.347 7.30e- 1\n10 fixed  variety3:factweek2   0.382     0.0807    64     4.73  1.26e- 5\n11 fixed  variety4:factweek2  -0.0140    0.0807    64    -0.174 8.63e- 1\n12 fixed  variety2:factweek3   0.282     0.107     64     2.64  1.03e- 2\n13 fixed  variety3:factweek3   0.662     0.107     64     6.20  4.55e- 8\n14 fixed  variety4:factweek3   0.388     0.107     64     3.64  5.55e- 4\n15 fixed  variety2:factweek4   0.228     0.123     64     1.86  6.77e- 2\n16 fixed  variety3:factweek4   0.744     0.123     64     6.06  7.86e- 8\n17 fixed  variety4:factweek4   0.390     0.123     64     3.18  2.28e- 3\n18 fixed  variety2:factweek5   0.402     0.133     64     3.01  3.70e- 3\n19 fixed  variety3:factweek5   0.672     0.133     64     5.04  4.11e- 6\n20 fixed  variety4:factweek5   0.222     0.133     64     1.66  1.01e- 1\n\n\n\n\n12.1.3 Check Model 
Assumptions\n\ncheck_model(lm2, check = c('normality', 'linearity'))\n\n\n\n\n\n\n\n\n\n\n12.1.4 Inference\nThe ANOVA table suggests a highly significant effect of the variety, week, and variety x week interaction effect.\n\nanova(lm2, type = \"marginal\")\n\n                 numDF denDF   F-value p-value\n(Intercept)          1    64 212.10509  &lt;.0001\nvariety              3    12  28.28895  &lt;.0001\nfactweek             4    64  74.79758  &lt;.0001\nvariety:factweek    12    64   7.03546  &lt;.0001\n\n\nWe can estimate the marginal means for variety and week effect and their interaction using emmeans() function.\n\nmean_1 &lt;- emmeans(lm2, ~ variety)\n\nNOTE: Results may be misleading due to involvement in interactions\n\nmean_1\n\n variety emmean    SE df lower.CL upper.CL\n 1         3.50 0.288  4     2.70     4.29\n 2         4.59 0.288  4     3.79     5.39\n 3         4.63 0.288  4     3.84     5.43\n 4         4.61 0.288  4     3.81     5.40\n\nResults are averaged over the levels of: factweek \nDegrees-of-freedom method: containment \nConfidence level used: 0.95 \n\nmean_2 &lt;- emmeans(lm2, ~ variety*factweek)\nmean_2\n\n variety factweek emmean    SE df lower.CL upper.CL\n 1       1          4.24 0.291  4     3.43     5.05\n 2       1          5.15 0.291  4     4.34     5.96\n 3       1          4.89 0.291  4     4.08     5.70\n 4       1          5.15 0.291  4     4.35     5.96\n 1       2          4.05 0.291  4     3.24     4.85\n 2       2          4.98 0.291  4     4.17     5.79\n 3       2          5.07 0.291  4     4.27     5.88\n 4       2          4.94 0.291  4     4.14     5.75\n 1       3          3.41 0.291  4     2.60     4.21\n 2       3          4.59 0.291  4     3.79     5.40\n 3       3          4.71 0.291  4     3.91     5.52\n 4       3          4.71 0.291  4     3.90     5.51\n 1       4          3.09 0.291  4     2.28     3.89\n 2       4          4.22 0.291  4     3.41     5.03\n 3       4          4.48 0.291  4     3.67     
5.28\n 4       4          4.39 0.291  4     3.58     5.20\n 1       5          2.70 0.291  4     1.89     3.51\n 2       5          4.01 0.291  4     3.20     4.82\n 3       5          4.02 0.291  4     3.21     4.83\n 4       5          3.83 0.291  4     3.03     4.64\n\nDegrees-of-freedom method: containment \nConfidence level used: 0.95 \n\n\n\n\n\n\n\n\nTime variable\n\n\n\nHere is a quick step to make sure your fitting model correctly: make sure to have two time variables in your data one being numeric (e.g. ‘day’ as number) and other being factor/character(e.g. ‘day_factor’ as a factor/character). Where, numeric variable is used for fitting correlation matrix and factor/character variable used in model statement to evaluate the time variable effect on response variable.",
+    "text": "12.1 RCBD Repeated Measures\nThe example shown below contains data from a sorghum trial laid out as a randomized complete block design (5 blocks) with variety (4 varieties) treatment effect. The response variable ‘y’ is the leaf area index assessed in five consecutive weeks on each plot.\nWe need to have time as numeric and factor variable. In the model, to assess the week effect, week was used as a factor (factweek). For the correlation matrix, week needs to be numeric (week).\n\ndat &lt;- agriTutorial::sorghum %&gt;%   \n  mutate(week = as.numeric(factweek),\n         block = as.character(varblock)) \n\n\nTable of variables in the data set\n\n\nblock\nblocking unit\n\n\nReplicate\nreplication unit\n\n\nWeek\nTime points when data was collected\n\n\nvariety\ntreatment factor, 4 levels\n\n\ny\nyield (lbs)\n\n\n\n\n12.1.1 Data Integrity Checks\nLet’s do preliminary data check including evaluating data structure, distribution of treatments, number of missing values, and distribution of response variable.\n\nstr(dat)\n\n'data.frame':   100 obs. 
of  9 variables:\n $ y        : num  5 4.84 4.02 3.75 3.13 4.42 4.3 3.67 3.23 2.83 ...\n $ variety  : Factor w/ 4 levels \"1\",\"2\",\"3\",\"4\": 1 1 1 1 1 1 1 1 1 1 ...\n $ Replicate: Factor w/ 5 levels \"1\",\"2\",\"3\",\"4\",..: 1 1 1 1 1 2 2 2 2 2 ...\n $ factweek : Factor w/ 5 levels \"1\",\"2\",\"3\",\"4\",..: 1 2 3 4 5 1 2 3 4 5 ...\n $ factplot : Factor w/ 20 levels \"1\",\"2\",\"3\",\"4\",..: 1 1 1 1 1 2 2 2 2 2 ...\n $ varweek  : int  1 2 3 4 5 1 2 3 4 5 ...\n $ varblock : int  1 1 1 1 1 2 2 2 2 2 ...\n $ week     : num  1 2 3 4 5 1 2 3 4 5 ...\n $ block    : chr  \"1\" \"1\" \"1\" \"1\" ...\n\n\nIn this data, variety, factplot, and factweek are factor variables, block is a character variable, and y and week are numeric.\n\ntable(dat$variety, dat$block)\n\n   \n    1 2 3 4 5\n  1 5 5 5 5 5\n  2 5 5 5 5 5\n  3 5 5 5 5 5\n  4 5 5 5 5 5\n\n\nThe cross tabulation shows an equal number of observations for each variety in each block.\n\nggplot(data = dat, aes(y = y, x = factweek, fill = variety)) +\n  geom_boxplot() +  \n  #scale_fill_brewer(palette=\"Dark2\") +\n  scale_fill_viridis_d(option = \"F\") +\n    theme_bw()\n\n\n\n\n\n\n\n\nIt looks like variety ‘1’ has the lowest yield and shows a steeper reduction in yield over the weeks than the other varieties. One last step before we fit the model is to look at the distribution of the response variable.\n\nhist(dat$y, main = \"\", xlab = \"yield\")\n\n\n\n\n\n\n\n\n\n\nFigure 12.1: Histogram of the dependent variable.\n\n\n\n\n\n\n12.1.2 Model Building\nLet’s fit the basic model first using lme() from the nlme package.\n\nlm1 &lt;- lme(y ~ variety + factweek + variety:factweek, random = ~1|block/factplot,\n              data = dat,\n              na.action = na.exclude)\n\nThe model fitted above doesn’t account for the repeated measures effect. 
To account for the variation caused by repeated measurements, we can model the correlation among responses for a given subject, which is the plot (factor variable) in this case.\nAdding this correlation structure keeps the plots independent of one another while allowing AR1 or compound symmetry correlations among the responses within a given subject. Here the time variable is week, and it must be numeric.\n\ncs1 &lt;- corAR1(form = ~ week|block/factplot,  value = 0.2, fixed = FALSE)\ncs2 &lt;- corCompSymm(form = ~ week|block/factplot,  value = 0.2, fixed = FALSE)\n\nIn the code chunk above, we defined two correlation structures: AR1 and compound symmetry. Next we will update the model lm1 with these two structures. See the nlme help pages to learn more about the functions for other correlation structure classes.\n\nlm2 &lt;- update(lm1, corr = cs1)\nlm3 &lt;- update(lm1, corr= cs2)\n\nNow let’s compare how model fit differs among models with no correlation structure (lm1), with AR1 correlation structure (lm2), and with compound symmetry structure (lm3). 
We will compare these models by using anova() or by compare_performance() function from the ‘performance’ library.\n\nanovaperformance\n\n\n\nanova(lm1, lm2, lm3)\n\n    Model df       AIC      BIC   logLik   Test  L.Ratio p-value\nlm1     1 23 18.837478 73.62409 13.58126                        \nlm2     2 24 -2.347391 54.82125 25.17370 1 vs 2 23.18487  &lt;.0001\nlm3     3 24 20.837478 78.00612 13.58126                        \n\n\n\n\n\nresult &lt;- compare_performance(lm1, lm2, lm3)\n\nSome of the nested models seem to be identical and probably only vary in\n  their random effects.\n\nprint_md(result)\n\n\nComparison of Model Performance Indices\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nName\nModel\nAIC (weights)\nAICc (weights)\nBIC (weights)\nR2 (cond.)\nR2 (marg.)\nICC\nRMSE\nSigma\n\n\n\n\nlm1\nlme\n-50.5 (&lt;.001)\n-36.0 (&lt;.001)\n9.4 (&lt;.001)\n0.99\n0.37\n0.98\n0.10\n0.13\n\n\nlm2\nlme\n-77.5 (&gt;.999)\n-61.5 (&gt;.999)\n-15.0 (&gt;.999)\n0.97\n0.41\n0.95\n0.15\n0.18\n\n\nlm3\nlme\n-48.5 (&lt;.001)\n-32.5 (&lt;.001)\n14.0 (&lt;.001)\n0.98\n0.37\n0.98\n0.11\n0.14\n\n\n\n\n\n\n\n\nWe prefer to chose model with lower AIC and BIC values. 
In this scenario, we will move forward with lm2 model containing AR1 structure.\nLet’s run a tidy() on lm2 model to look at the estimates for random and fixed effects.\n\ntidy(lm2)\n\nWarning in tidy.lme(lm2): ran_pars not yet implemented for multiple levels of\nnesting\n\n\n# A tibble: 20 × 7\n   effect term               estimate std.error    df statistic  p.value\n   &lt;chr&gt;  &lt;chr&gt;                 &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;\n 1 fixed  (Intercept)          4.24      0.291     64    14.6   5.44e-22\n 2 fixed  variety2             0.906     0.114     12     7.94  4.05e- 6\n 3 fixed  variety3             0.646     0.114     12     5.66  1.05e- 4\n 4 fixed  variety4             0.912     0.114     12     8.00  3.78e- 6\n 5 fixed  factweek2           -0.196     0.0571    64    -3.44  1.04e- 3\n 6 fixed  factweek3           -0.836     0.0755    64   -11.1   1.60e-16\n 7 fixed  factweek4           -1.16      0.0867    64   -13.3   4.00e-20\n 8 fixed  factweek5           -1.54      0.0943    64   -16.3   1.57e-24\n 9 fixed  variety2:factweek2   0.0280    0.0807    64     0.347 7.30e- 1\n10 fixed  variety3:factweek2   0.382     0.0807    64     4.73  1.26e- 5\n11 fixed  variety4:factweek2  -0.0140    0.0807    64    -0.174 8.63e- 1\n12 fixed  variety2:factweek3   0.282     0.107     64     2.64  1.03e- 2\n13 fixed  variety3:factweek3   0.662     0.107     64     6.20  4.55e- 8\n14 fixed  variety4:factweek3   0.388     0.107     64     3.64  5.55e- 4\n15 fixed  variety2:factweek4   0.228     0.123     64     1.86  6.77e- 2\n16 fixed  variety3:factweek4   0.744     0.123     64     6.06  7.86e- 8\n17 fixed  variety4:factweek4   0.390     0.123     64     3.18  2.28e- 3\n18 fixed  variety2:factweek5   0.402     0.133     64     3.01  3.70e- 3\n19 fixed  variety3:factweek5   0.672     0.133     64     5.04  4.11e- 6\n20 fixed  variety4:factweek5   0.222     0.133     64     1.66  1.01e- 1\n\n\n\n\n12.1.3 Check Model 
Assumptions\n\ncheck_model(lm2, check = c('normality', 'linearity'))\n\n\n\n\n\n\n\n\n\n\n12.1.4 Inference\nThe ANOVA table suggests a highly significant effect of the variety, week, and variety x week interaction effect.\n\nanova(lm2, type = \"marginal\")\n\n                 numDF denDF   F-value p-value\n(Intercept)          1    64 212.10509  &lt;.0001\nvariety              3    12  28.28895  &lt;.0001\nfactweek             4    64  74.79758  &lt;.0001\nvariety:factweek    12    64   7.03546  &lt;.0001\n\n\nWe can estimate the marginal means for variety and week effect and their interaction using emmeans() function.\n\nmean_1 &lt;- emmeans(lm2, ~ variety)\n\nNOTE: Results may be misleading due to involvement in interactions\n\nmean_1\n\n variety emmean    SE df lower.CL upper.CL\n 1         3.50 0.288  4     2.70     4.29\n 2         4.59 0.288  4     3.79     5.39\n 3         4.63 0.288  4     3.84     5.43\n 4         4.61 0.288  4     3.81     5.40\n\nResults are averaged over the levels of: factweek \nDegrees-of-freedom method: containment \nConfidence level used: 0.95 \n\nmean_2 &lt;- emmeans(lm2, ~ variety*factweek)\nmean_2\n\n variety factweek emmean    SE df lower.CL upper.CL\n 1       1          4.24 0.291  4     3.43     5.05\n 2       1          5.15 0.291  4     4.34     5.96\n 3       1          4.89 0.291  4     4.08     5.70\n 4       1          5.15 0.291  4     4.35     5.96\n 1       2          4.05 0.291  4     3.24     4.85\n 2       2          4.98 0.291  4     4.17     5.79\n 3       2          5.07 0.291  4     4.27     5.88\n 4       2          4.94 0.291  4     4.14     5.75\n 1       3          3.41 0.291  4     2.60     4.21\n 2       3          4.59 0.291  4     3.79     5.40\n 3       3          4.71 0.291  4     3.91     5.52\n 4       3          4.71 0.291  4     3.90     5.51\n 1       4          3.09 0.291  4     2.28     3.89\n 2       4          4.22 0.291  4     3.41     5.03\n 3       4          4.48 0.291  4     3.67     
5.28\n 4       4          4.39 0.291  4     3.58     5.20\n 1       5          2.70 0.291  4     1.89     3.51\n 2       5          4.01 0.291  4     3.20     4.82\n 3       5          4.02 0.291  4     3.21     4.83\n 4       5          3.83 0.291  4     3.03     4.64\n\nDegrees-of-freedom method: containment \nConfidence level used: 0.95 \n\n\n\n\n\n\n\n\nTime variable\n\n\n\nHere is a quick check that you are fitting the model correctly: make sure your data has two time variables, one numeric (e.g. ‘day’ as a number) and the other a factor/character (e.g. ‘day_factor’). The numeric variable is used to fit the correlation structure, and the factor/character variable is used in the model statement to evaluate the effect of time on the response variable.",
     "crumbs": [
       "<span class='chapter-number'>11</span>  <span class='chapter-title'>Repeated Measures</span>"
     ]
@@ -549,11 +549,11 @@
   {
     "objectID": "chapters/additional-resources.html",
     "href": "chapters/additional-resources.html",
-    "title": "14  Additional Resources",
+    "title": "15  Additional Resources",
     "section": "",
-    "text": "14.1 Further Reading",
+    "text": "15.1 Further Reading",
     "crumbs": [
-      "<span class='chapter-number'>14</span>  <span class='chapter-title'>Additional Resources</span>"
+      "<span class='chapter-number'>15</span>  <span class='chapter-title'>Additional Resources</span>"
     ]
   },
   {
@@ -579,21 +579,21 @@
   {
     "objectID": "chapters/additional-resources.html#further-reading",
     "href": "chapters/additional-resources.html#further-reading",
-    "title": "14  Additional Resources",
+    "title": "15  Additional Resources",
     "section": "",
-    "text": "lme4 vignette for fitting linear mixed models\nMixed-Effects Models in S and S-PLUS thee book for nlme, by José C. Pinheiro and Douglas M. Bates. We used this book extensively for developing this guide. Sadly, it’s both out of print and we could not find a free copy online. However, there are affordable used copies available.\nMixed Effects Models and Extensions in Ecology with R by Alain F. Zuur, Elena N. Ieno, Neil Walker, Anatoly A. Saveliev, and Graham M. Smith.",
+    "text": "lme4 vignette for fitting linear mixed models\nMixed-Effects Models in S and S-PLUS the book for nlme, by José C. Pinheiro and Douglas M. Bates. We used this book extensively for developing this guide. Sadly, it’s both out of print and we could not find a free copy online. However, there are affordable used copies available.\nMixed Effects Models and Extensions in Ecology with R by Alain F. Zuur, Elena N. Ieno, Neil Walker, Anatoly A. Saveliev, and Graham M. Smith.\nANOVA and Mixed Models by Lukas Meier",
     "crumbs": [
-      "<span class='chapter-number'>14</span>  <span class='chapter-title'>Additional Resources</span>"
+      "<span class='chapter-number'>15</span>  <span class='chapter-title'>Additional Resources</span>"
     ]
   },
   {
     "objectID": "chapters/additional-resources.html#other-resources",
     "href": "chapters/additional-resources.html#other-resources",
-    "title": "14  Additional Resources",
-    "section": "14.2 Other Resources",
-    "text": "14.2 Other Resources\n\nEasy Stats a collection of R packages to assist in statistical modelling, with a big focus on linear models.\nMixed Model CRAN Task View a curated list of R packages relevant to mixed modelling. This is a great place to start\nR-SIG-mixed-models mailing list for help and discussion of mixed-model-related questions, course announcements, etc\nGrammar of Experimental Designs by Emi Tanaka. This has a great description of basic principles of experimental design.",
+    "title": "15  Additional Resources",
+    "section": "15.2 Other Resources",
+    "text": "15.2 Other Resources\n\nEasy Stats a collection of R packages to assist in statistical modelling, with a big focus on linear models.\nMixed Model CRAN Task View a curated list of R packages relevant to mixed modelling. This is a great place to start\nR-SIG-mixed-models mailing list for help and discussion of mixed-model-related questions, course announcements, etc\nGrammar of Experimental Designs by Emi Tanaka. This has a great description of basic principles of experimental design.",
     "crumbs": [
-      "<span class='chapter-number'>14</span>  <span class='chapter-title'>Additional Resources</span>"
+      "<span class='chapter-number'>15</span>  <span class='chapter-title'>Additional Resources</span>"
     ]
   },
   {
@@ -775,7 +775,7 @@
     "href": "chapters/latin-design.html",
     "title": "10  Latin Square Design",
     "section": "",
-    "text": "10.1 Background\nLatin square design In the Latin Square design, two blocking factors are arranged across the row and the column of the square. This allows blocking of two nuisance factors across rows and columns to reduce even more experimental error. The requirement of Latin square design is that all t treatments appears only once in each row and column and number of replications is equal to number of treatments.\nAdvantages of Latin square design are: 1. The design is particularly appropriate for comparing t treatment means in the presence of two sources of extraneous variation, each measured at t levels. 2. The analysis is quite simple.\nDisadvantage: 1. A Latin square can be constructed for any value of t, however, it is best suited for comparing t treatments when 5≤t≤10.\nStatistical model for a response in Latin square design is:\n\\(Y_{ijk} = \\mu + \\alpha_i + \\beta_j +  \\gamma_k + \\epsilon_{ijk}\\)\nwhere, \\(\\mu\\) is the experiment mean, \\(\\alpha_i's\\) are treatment effects, \\(\\beta\\) and \\(\\gamma\\) are the row- and column specific effects.\nAssumptions of this design includes normality and independent distribution of error (\\(\\epsilon_{ijk}\\)) terms. And there is no interaction between two blocking (rows & columns) factors and treatments.",
+    "text": "10.1 Background\nIn the Latin Square design, two blocking factors are arranged across the rows and the columns of the square. This allows blocking of two nuisance factors across rows and columns to further reduce experimental error. The requirement of the Latin square design is that each of the t treatments appears exactly once in each row and each column, so the number of replications equals the number of treatments.\nAdvantages of Latin square design are:\nDisadvantages:\nStatistical model for a response in Latin square design is:\n\\(Y_{ijk} = \\mu + \\alpha_i + \\beta_j +  \\gamma_k + \\epsilon_{ijk}\\)\nwhere \\(\\mu\\) is the experiment mean, \\(\\alpha_i\\) is the treatment effect, and \\(\\beta_j\\) and \\(\\gamma_k\\) are the row- and column-specific effects.\nThe assumptions of this design include normally and independently distributed error (\\(\\epsilon_{ijk}\\)) terms and no interaction between the two blocking factors (rows & columns) and the treatments.",
     "crumbs": [
       "Experiment designs",
       "<span class='chapter-number'>10</span>  <span class='chapter-title'>Latin Square Design</span>"
@@ -786,7 +786,7 @@
     "href": "chapters/latin-design.html#example-analysis",
     "title": "10  Latin Square Design",
     "section": "10.2 Example Analysis",
-    "text": "10.2 Example Analysis\nLet’s start the analysis firstly by loading the required libraries:\n\nlme4nlme\n\n\n\nlibrary(lme4); library(lmerTest); library(emmeans); library(performance)\nlibrary(dplyr); library(broom.mixed); library(agridat); library(desplot)\n\n\n\n\nlibrary(nlme); library(broom.mixed); library(emmeans); library(performance)\nlibrary(dplyr); library(agridat); library(desplot)\n\n\n\n\nThe data used in this example is extracted from the agridat package. In this experiment, 5 treatments (A = Dusted before rains. B = Dusted after rains. C = Dusted once each week. D = Drifting, once each week. E = Not dusted) were tested to control stem rust in wheat.\n\ndat &lt;- agridat::goulden.latin\n\n\nTable of variables in the data set\n\n\ntrt\ntreatment factor, 5 levels\n\n\nrow\nrow position for each plot\n\n\ncol\ncolumn position for each plot\n\n\nyield\nwheat yield\n\n\n\n\n10.2.1 Data integrity checks\nFirstly, let’s verify the class of variables in the dataset using str() function in base R\n\nstr(dat)\n\n'data.frame':   25 obs. of  4 variables:\n $ trt  : Factor w/ 5 levels \"A\",\"B\",\"C\",\"D\",..: 2 3 4 5 1 4 1 3 2 5 ...\n $ yield: num  4.9 9.3 7.6 5.3 9.3 6.4 4 15.4 7.6 6.3 ...\n $ row  : int  5 4 3 2 1 5 4 3 2 1 ...\n $ col  : int  1 1 1 1 1 2 2 2 2 2 ...\n\n\nHere yield and trt are classified as numeric and factor variables, respectively, as needed. 
But we need to change ‘row’ and ‘col’ from integer t factor/character.\n\ndat1 &lt;- dat |&gt; \n        mutate(row = as.factor(row),\n               col = as.factor(col))\n\nNext, to verify if the data meets the assumption of the Latin square design let’s plot the field layout for this experiment.\n\ndesplot::desplot(data = dat, flip = TRUE,\n        form = yield ~ row + col, \n        out1 = row, out1.gpar=list(col=\"black\", lwd=3),\n        out2 = col, out2.gpar=list(col=\"black\", lwd=3),\n        text = trt, cex = 1, shorten = \"no\",\n        main = \"Field layout\", \n        show.key = FALSE)\n\n\n\n\n\n\n\n\nThis looks great! Here we can see that there are equal number of treatments, rows, and columns. Treatments were randomized in such a way that one treatment doesn’t appear more than once in each row and column.\nNext step is to check if there are any missing values in response variable.\n\napply(dat, 2, function(x) sum(is.na(x)))\n\n  trt yield   row   col \n    0     0     0     0 \n\n\nAnd we do not have any missing values in the data.\nBefore fitting the model, let’s create a histogram of response variable to see if there are extreme values.\n\n\n\n\n\n\nHistogram of the dependent variable.\n\n\n\n\nhist(dat$yield, main = \"\", xlab = \"yield\")\n\n\n\n10.2.2 Model fitting\nHere we will fit a model to evaluate the impact of fungicide treatments on wheat yield with trt as a fixed effect and row & col as a random effect.\n\nlme4nlme\n\n\n\nm1_a &lt;- lmer(yield ~ trt + (1|row) + (1|col),\n           data = dat1,\n           na.action = na.exclude)\ntidy(m1_a) \n\n# A tibble: 8 × 8\n  effect   group    term            estimate std.error statistic    df   p.value\n  &lt;chr&gt;    &lt;chr&gt;    &lt;chr&gt;              &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;     &lt;dbl&gt;\n1 fixed    &lt;NA&gt;     (Intercept)        6.84      0.942     7.26   11.9   1.03e-5\n2 fixed    &lt;NA&gt;     trtB              -0.380     0.967    -0.393  12.0 
  7.01e-1\n3 fixed    &lt;NA&gt;     trtC               6.28      0.967     6.50   12.0   2.96e-5\n4 fixed    &lt;NA&gt;     trtD               1.12      0.967     1.16   12.0   2.69e-1\n5 fixed    &lt;NA&gt;     trtE              -1.92      0.967    -1.99   12.0   7.04e-2\n6 ran_pars row      sd__(Intercept)    1.37     NA        NA      NA    NA      \n7 ran_pars col      sd__(Intercept)    0.483    NA        NA      NA    NA      \n8 ran_pars Residual sd__Observation    1.53     NA        NA      NA    NA      \n\n\n\n\n\ndat$dummy &lt;- factor(1)\nm1_b &lt;- lme(yield ~ trt,\n          random =list(~1|row, ~1|col),\n                  #list(dummy = pdBlocked(list(\n                   #               pdIdent(~row - 1),\n                    #              pdIdent(~col - 1)))),\n          data = dat, \n          na.action = na.exclude)\n\nsummary(m1_b)\n\nLinear mixed-effects model fit by REML\n  Data: dat \n       AIC      BIC    logLik\n  106.0974 114.0633 -45.04872\n\nRandom effects:\n Formula: ~1 | row\n        (Intercept)\nStdDev:    1.344469\n\n Formula: ~1 | col %in% row\n        (Intercept) Residual\nStdDev:    1.494696 0.628399\n\nFixed effects:  yield ~ trt \n            Value Std.Error DF   t-value p-value\n(Intercept)  6.84 0.9419764 16  7.261328  0.0000\ntrtB        -0.38 1.0254756 16 -0.370560  0.7158\ntrtC         6.28 1.0254756 16  6.123987  0.0000\ntrtD         1.12 1.0254756 16  1.092176  0.2909\ntrtE        -1.92 1.0254756 16 -1.872302  0.0796\n Correlation: \n     (Intr) trtB   trtC   trtD  \ntrtB -0.544                     \ntrtC -0.544  0.500              \ntrtD -0.544  0.500  0.500       \ntrtE -0.544  0.500  0.500  0.500\n\nStandardized Within-Group Residuals:\n       Min         Q1        Med         Q3        Max \n-0.5686726 -0.2469684 -0.1061146  0.2349101  0.7617205 \n\nNumber of Observations: 25\nNumber of Groups: \n         row col %in% row \n           5           25 \n\n#VarCorr(m1_b)\n\n\n\n\n\n\n10.2.3 Check Model 
Assumptions\n\nlme4nlme\n\n\n\ncheck_model(m1_a, check = c(\"linearity\", \"normality\"))\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\ncheck_model(m1_b, check = c(\"linearity\", \"normality\"))\n\n\n\n\n\n\n\n\n\n\n\n\n\n10.2.4 Inference\nWe can look look at the analysis of variance for treatment effect on yield using anova() function.\n\nlme4nlme\n\n\n\nanova(m1_a, type = \"1\")\n\nType I Analysis of Variance Table with Satterthwaite's method\n    Sum Sq Mean Sq NumDF DenDF F value    Pr(&gt;F)    \ntrt 196.61  49.152     4    12  21.032 2.366e-05 ***\n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n\n\n\n\nanova(m1_b, type = \"sequential\")\n\n            numDF denDF   F-value p-value\n(Intercept)     1    16 132.38123  &lt;.0001\ntrt             4    16  18.69608  &lt;.0001\n\n\n\n\n\nHere we observed a significant impact on fungicide treatment on crop yield. Let’s have a look at the estimated marginal means of wheat yield with each treatment using emmeans() function.\n\nlme4nlme\n\n\n\nemmeans(m1_a, ~ trt)\n\n trt emmean    SE   df lower.CL upper.CL\n A     6.84 0.942 11.9     4.79     8.89\n B     6.46 0.942 11.9     4.41     8.51\n C    13.12 0.942 11.9    11.07    15.17\n D     7.96 0.942 11.9     5.91    10.01\n E     4.92 0.942 11.9     2.87     6.97\n\nDegrees-of-freedom method: kenward-roger \nConfidence level used: 0.95 \n\n\n\n\n\nemmeans(m1_b, ~ trt)\n\n trt emmean    SE df lower.CL upper.CL\n A     6.84 0.942  4     4.22     9.46\n B     6.46 0.942  4     3.84     9.08\n C    13.12 0.942  4    10.50    15.74\n D     7.96 0.942  4     5.34    10.58\n E     4.92 0.942  4     2.30     7.54\n\nDegrees-of-freedom method: containment \nConfidence level used: 0.95",
+    "text": "10.2 Example Analysis\nLet’s start the analysis firstly by loading the required libraries:\n\nlme4nlme\n\n\n\nlibrary(lme4); library(lmerTest); library(emmeans); library(performance)\nlibrary(dplyr); library(broom.mixed); library(agridat); library(desplot)\n\n\n\n\nlibrary(nlme); library(broom.mixed); library(emmeans); library(performance)\nlibrary(dplyr); library(agridat); library(desplot)\n\n\n\n\nThe data used in this example is extracted from the agridat package. In this experiment, 5 treatments (A = Dusted before rains. B = Dusted after rains. C = Dusted once each week. D = Drifting, once each week. E = Not dusted) were tested to control stem rust in wheat.\n\ndat &lt;- agridat::goulden.latin\n\n\nTable of variables in the data set\n\n\ntrt\ntreatment factor, 5 levels\n\n\nrow\nrow position for each plot\n\n\ncol\ncolumn position for each plot\n\n\nyield\nwheat yield\n\n\n\n\n10.2.1 Data integrity checks\nFirstly, let’s verify the class of variables in the dataset using str() function in base R\n\nstr(dat)\n\n'data.frame':   25 obs. of  4 variables:\n $ trt  : Factor w/ 5 levels \"A\",\"B\",\"C\",\"D\",..: 2 3 4 5 1 4 1 3 2 5 ...\n $ yield: num  4.9 9.3 7.6 5.3 9.3 6.4 4 15.4 7.6 6.3 ...\n $ row  : int  5 4 3 2 1 5 4 3 2 1 ...\n $ col  : int  1 1 1 1 1 2 2 2 2 2 ...\n\n\nHere yield and trt are classified as numeric and factor variables, respectively, as needed. But we need to change ‘row’ and ‘col’ from integer t factor/character.\n\ndat1 &lt;- dat |&gt; \n        mutate(row = as.factor(row),\n               col = as.factor(col))\n\nNext, to verify if the data meets the assumption of the Latin square design let’s plot the field layout for this experiment.\n\n\n\n\n\n\n\n\n\nThis looks great! Here we can see that there are equal number (5) of treatments, rows, and columns. 
Treatments were randomized in such a way that one treatment doesn’t appear more than once in each row and column.\nNext step is to check if there are any missing values in response variable.\n\napply(dat, 2, function(x) sum(is.na(x)))\n\n  trt yield   row   col \n    0     0     0     0 \n\n\nNo missing values detected in this data set.\nBefore fitting the model, let’s create a histogram of response variable to see if there are extreme values.\n\n\n\n\n\n\nHistogram of the dependent variable.\n\n\n\n\nhist(dat$yield, main = \"\", xlab = \"yield\")\n\n\n\n10.2.2 Model fitting\nHere we will fit a model to evaluate the impact of fungicide treatments on wheat yield with trt as a fixed effect and row & col as a random effect.\nVarCorr(m1_b)\n\nlme4nlme\n\n\n\nm1_a &lt;- lmer(yield ~ trt + (1|row) + (1|col),\n           data = dat1,\n           na.action = na.exclude)\nsummary(m1_a) \n\nLinear mixed model fit by REML. t-tests use Satterthwaite's method [\nlmerModLmerTest]\nFormula: yield ~ trt + (1 | row) + (1 | col)\n   Data: dat1\n\nREML criterion at convergence: 89.8\n\nScaled residuals: \n    Min      1Q  Median      3Q     Max \n-1.3994 -0.5383 -0.1928  0.5220  1.8429 \n\nRandom effects:\n Groups   Name        Variance Std.Dev.\n row      (Intercept) 1.8660   1.3660  \n col      (Intercept) 0.2336   0.4833  \n Residual             2.3370   1.5287  \nNumber of obs: 25, groups:  row, 5; col, 5\n\nFixed effects:\n            Estimate Std. Error      df t value Pr(&gt;|t|)    \n(Intercept)   6.8400     0.9420 11.9446   7.261 1.03e-05 ***\ntrtB         -0.3800     0.9669 12.0000  -0.393   0.7012    \ntrtC          6.2800     0.9669 12.0000   6.495 2.96e-05 ***\ntrtD          1.1200     0.9669 12.0000   1.158   0.2692    \ntrtE         -1.9200     0.9669 12.0000  -1.986   0.0704 .  \n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1\n\nCorrelation of Fixed Effects:\n     (Intr) trtB   trtC   trtD  \ntrtB -0.513                     \ntrtC -0.513  0.500              \ntrtD -0.513  0.500  0.500       \ntrtE -0.513  0.500  0.500  0.500\n\n\n\n\n\nm1_b &lt;- lme(yield ~ trt,\n          random =list(~1|row, ~1|col),\n          data = dat, \n          na.action = na.exclude)\n\nsummary(m1_b)\n\nLinear mixed-effects model fit by REML\n  Data: dat \n       AIC      BIC    logLik\n  106.0974 114.0633 -45.04872\n\nRandom effects:\n Formula: ~1 | row\n        (Intercept)\nStdDev:    1.344469\n\n Formula: ~1 | col %in% row\n        (Intercept) Residual\nStdDev:    1.494696 0.628399\n\nFixed effects:  yield ~ trt \n            Value Std.Error DF   t-value p-value\n(Intercept)  6.84 0.9419764 16  7.261328  0.0000\ntrtB        -0.38 1.0254756 16 -0.370560  0.7158\ntrtC         6.28 1.0254756 16  6.123987  0.0000\ntrtD         1.12 1.0254756 16  1.092176  0.2909\ntrtE        -1.92 1.0254756 16 -1.872302  0.0796\n Correlation: \n     (Intr) trtB   trtC   trtD  \ntrtB -0.544                     \ntrtC -0.544  0.500              \ntrtD -0.544  0.500  0.500       \ntrtE -0.544  0.500  0.500  0.500\n\nStandardized Within-Group Residuals:\n       Min         Q1        Med         Q3        Max \n-0.5686726 -0.2469684 -0.1061146  0.2349101  0.7617205 \n\nNumber of Observations: 25\nNumber of Groups: \n         row col %in% row \n           5           25 \n\n\n\n\n\n\n\n10.2.3 Check Model Assumptions\nThis step involves inspecting the model residuals using the check_model() function from the “performance” package.\n\nlme4nlme\n\n\n\ncheck_model(m1_a, check = c(\"linearity\", \"normality\"))\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\ncheck_model(m1_b, check = c(\"linearity\", \"normality\"))\n\n\n\n\n\n\n\n\n\n\n\nThese visuals suggest that the assumptions of the linear model have been met.\n\n\n10.2.4 Inference\nWe can now proceed to the variance partitioning. 
In this case, we will use anova() with type = \"1\" or type = \"sequential\" for lmer() and lme() models, respectively.\n\nlme4nlme\n\n\n\nanova(m1_a, type = \"1\")\n\nType I Analysis of Variance Table with Satterthwaite's method\n    Sum Sq Mean Sq NumDF DenDF F value    Pr(&gt;F)    \ntrt 196.61  49.152     4    12  21.032 2.366e-05 ***\n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n\n\n\n\nanova(m1_b, type = \"sequential\")\n\n            numDF denDF   F-value p-value\n(Intercept)     1    16 132.38123  &lt;.0001\ntrt             4    16  18.69608  &lt;.0001\n\n\n\n\n\nBoth models detected a significant effect of the dusting treatment on wheat yield. Let’s have a look at the estimated marginal means of wheat yield for each treatment using the emmeans() function.\n\nlme4nlme\n\n\n\nemmeans(m1_a, ~ trt)\n\n trt emmean    SE   df lower.CL upper.CL\n A     6.84 0.942 11.9     4.79     8.89\n B     6.46 0.942 11.9     4.41     8.51\n C    13.12 0.942 11.9    11.07    15.17\n D     7.96 0.942 11.9     5.91    10.01\n E     4.92 0.942 11.9     2.87     6.97\n\nDegrees-of-freedom method: kenward-roger \nConfidence level used: 0.95 \n\n\n\n\n\nemmeans(m1_b, ~ trt)\n\n trt emmean    SE df lower.CL upper.CL\n A     6.84 0.942  4     4.22     9.46\n B     6.46 0.942  4     3.84     9.08\n C    13.12 0.942  4    10.50    15.74\n D     7.96 0.942  4     5.34    10.58\n E     4.92 0.942  4     2.30     7.54\n\nDegrees-of-freedom method: containment \nConfidence level used: 0.95 \n\n\n\n\n\nWe see that wheat yield was highest with treatment ‘C’ (dusted once each week) compared to the other treatments applied in this study, which suggests that treatment ‘C’ was the most effective in controlling stem rust in wheat.",
     "crumbs": [
       "Experiment designs",
       "<span class='chapter-number'>10</span>  <span class='chapter-title'>Latin Square Design</span>"
@@ -797,7 +797,7 @@
     "href": "chapters/latin-design.html#background",
     "title": "10  Latin Square Design",
     "section": "",
-    "text": "Any additional extraneous sources of variability tend to inflate the error term, making it more difficult to detect differences among the treatment means.\nThe effect of each treatment on the response must be approximately the same across rows and columns.",
+    "text": "The design is particularly appropriate for comparing t treatment means in the presence of two sources of extraneous variation, each measured at t levels.\nThe analysis is quite simple.\n\n\n\nA Latin square can be constructed for any value of t; however, it is best suited for comparing t treatments when 5 ≤ t ≤ 10.\nAny additional extraneous sources of variability tend to inflate the error term, making it more difficult to detect differences among the treatment means.\nThe effect of each treatment on the response must be approximately the same across the rows and columns.",
     "crumbs": [
       "Experiment designs",
       "<span class='chapter-number'>10</span>  <span class='chapter-title'>Latin Square Design</span>"
@@ -953,5 +953,16 @@
     "crumbs": [
       "<span class='chapter-number'>12</span>  <span class='chapter-title'>Marginal Means and Contrasts</span>"
     ]
+  },
+  {
+    "objectID": "chapters/incomplete-block-design.html#examples-analyses",
+    "href": "chapters/incomplete-block-design.html#examples-analyses",
+    "title": "9  Incomplete Block Design",
+    "section": "9.2 Examples Analyses",
+    "text": "9.2 Examples Analyses\n\n9.2.1 Balanced Incomplete Block Design\nWe will demonstrate an example data set designed in a balanced incomplete block design. First, load the libraries required for analysis and estimation.\n\nlme4nlme\n\n\n\nlibrary(lme4); library(lmerTest); library(emmeans)\nlibrary(dplyr); library(broom.mixed); library(performance)\n\n\n\n\nlibrary(nlme); library(broom.mixed); library(emmeans)\nlibrary(dplyr); library(performance)\n\n\n\n\nThe data used for this example analysis was extracted from the agridat package. This example is comprised of soybean balanced incomplete block experiment.\n\ndat &lt;- agridat::weiss.incblock\n\n\nTable of variables in the data set\n\n\nblock\nblocking unit\n\n\ngen\ngenotype (variety) factor\n\n\nrow\nrow position for each plot\n\n\ncol\ncolumn position for each plot\n\n\nyield\ngrain yield in bu/ac\n\n\n\n\n\n\n\n\n\n\n\n\n\n9.2.1.1 Data integrity checks\nWe will start inspecting the data set firstly by looking at the class of each variable:\n\nstr(dat)\n\n'data.frame':   186 obs. of  5 variables:\n $ block: Factor w/ 31 levels \"B01\",\"B02\",\"B03\",..: 1 2 3 4 5 6 7 8 9 10 ...\n $ gen  : Factor w/ 31 levels \"G01\",\"G02\",\"G03\",..: 24 15 20 18 20 5 22 1 9 14 ...\n $ yield: num  29.8 24.2 30.5 20 35.2 25 23.6 23.6 29.3 25.5 ...\n $ row  : int  42 36 30 24 18 12 6 42 36 30 ...\n $ col  : int  1 1 1 1 1 1 1 2 2 2 ...\n\n\nThe variables we need for the model are block, genand yield. The block and gen are classified as factor variables and yield is numeric. Therefore, we do not need to change class of any of the required variables.\nNext, let’s check the independent variables. 
We can do this by counting the observations for each level of gen and of block.\n\nagg_tbl &lt;- dat %&gt;% group_by(gen) %&gt;% \n  summarise(total_count=n(),\n            .groups = 'drop')\nagg_tbl\n\n# A tibble: 31 × 2\n   gen   total_count\n   &lt;fct&gt;       &lt;int&gt;\n 1 G01             6\n 2 G02             6\n 3 G03             6\n 4 G04             6\n 5 G05             6\n 6 G06             6\n 7 G07             6\n 8 G08             6\n 9 G09             6\n10 G10             6\n# ℹ 21 more rows\n\n\n\nagg_df &lt;- aggregate(dat$gen, by=list(dat$block), FUN=length)\nagg_df\n\n   Group.1 x\n1      B01 6\n2      B02 6\n3      B03 6\n4      B04 6\n5      B05 6\n6      B06 6\n7      B07 6\n8      B08 6\n9      B09 6\n10     B10 6\n11     B11 6\n12     B12 6\n13     B13 6\n14     B14 6\n15     B15 6\n16     B16 6\n17     B17 6\n18     B18 6\n19     B19 6\n20     B20 6\n21     B21 6\n22     B22 6\n23     B23 6\n24     B24 6\n25     B25 6\n26     B26 6\n27     B27 6\n28     B28 6\n29     B29 6\n30     B30 6\n31     B31 6\n\n\nThere are 31 varieties (levels of gen) and 31 blocks, and the design is balanced: each variety is observed 6 times and each block contains 6 plots.\nWe can count the missing values in each variable of this data set to evaluate the extent of missing data:\n\napply(dat, 2, function(x) sum(is.na(x)))\n\nblock   gen yield   row   col \n    0     0     0     0     0 \n\n\nNo missing data!\nLast, let’s plot a histogram of the dependent variable. This is a quick check before analysis to see if there is any strong deviation in values.\n\n\n\n\n\n\n\n\n\nFigure 9.1: Histogram of the dependent variable.\n\n\n\n\n\nhist(dat$yield, main = \"\", xlab = \"yield\")\n\nResponse variable values fall within the expected range, with a few extreme values on the right tail. 
This data set is ready for analysis!\n\n\n9.2.1.2 Model Building\nWe will be evaluating the response of yield as affected by gen (fixed effect) and block (random effect).\n\n\nPlease note that incomplete block effect can be analyzed as a fixed (intra-block analysis) or a random (inter-block analysis) effect. When we consider block as a random effect, the mean values of a block also contain information about the treatment effects.\n\nlme4nlme\n\n\n\nmodel_icbd &lt;- lmer(yield ~ gen + (1|block),\n                   data = dat, \n                   na.action = na.exclude)\ntidy(model_icbd)\n\n# A tibble: 33 × 8\n   effect group term        estimate std.error statistic    df  p.value\n   &lt;chr&gt;  &lt;chr&gt; &lt;chr&gt;          &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;    &lt;dbl&gt;\n 1 fixed  &lt;NA&gt;  (Intercept)  24.6        0.922   26.7     153. 2.30e-59\n 2 fixed  &lt;NA&gt;  genG02        2.40       1.17     2.06    129. 4.17e- 2\n 3 fixed  &lt;NA&gt;  genG03        8.04       1.17     6.88    129. 2.31e-10\n 4 fixed  &lt;NA&gt;  genG04        2.37       1.17     2.03    129. 4.42e- 2\n 5 fixed  &lt;NA&gt;  genG05        1.60       1.17     1.37    129. 1.73e- 1\n 6 fixed  &lt;NA&gt;  genG06        7.39       1.17     6.32    129. 3.82e- 9\n 7 fixed  &lt;NA&gt;  genG07       -0.419      1.17    -0.359   129. 7.20e- 1\n 8 fixed  &lt;NA&gt;  genG08        3.04       1.17     2.60    129. 1.04e- 2\n 9 fixed  &lt;NA&gt;  genG09        4.84       1.17     4.14    129. 6.22e- 5\n10 fixed  &lt;NA&gt;  genG10       -0.0429     1.17    -0.0367  129. 
9.71e- 1\n# ℹ 23 more rows\n\n\n\n\n\nmodel_icbd1 &lt;- lme(yield ~ gen,\n                  random = ~ 1|block,\n                  data = dat, \n                  na.action = na.exclude)\ntidy(model_icbd1)\n\n# A tibble: 33 × 8\n   effect group term        estimate std.error    df statistic  p.value\n   &lt;chr&gt;  &lt;chr&gt; &lt;chr&gt;          &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;\n 1 fixed  &lt;NA&gt;  (Intercept)  24.6        0.922   125   26.7    2.10e-53\n 2 fixed  &lt;NA&gt;  genG02        2.40       1.17    125    2.06   4.18e- 2\n 3 fixed  &lt;NA&gt;  genG03        8.04       1.17    125    6.88   2.54e-10\n 4 fixed  &lt;NA&gt;  genG04        2.37       1.17    125    2.03   4.43e- 2\n 5 fixed  &lt;NA&gt;  genG05        1.60       1.17    125    1.37   1.73e- 1\n 6 fixed  &lt;NA&gt;  genG06        7.39       1.17    125    6.32   4.11e- 9\n 7 fixed  &lt;NA&gt;  genG07       -0.419      1.17    125   -0.359  7.20e- 1\n 8 fixed  &lt;NA&gt;  genG08        3.04       1.17    125    2.60   1.04e- 2\n 9 fixed  &lt;NA&gt;  genG09        4.84       1.17    125    4.14   6.33e- 5\n10 fixed  &lt;NA&gt;  genG10       -0.0429     1.17    125   -0.0367 9.71e- 1\n# ℹ 23 more rows\n\n\n\n\n\n\n\n9.2.1.3 Check Model Assumptions\nLet’s verify the assumption of linear mixed models including normal distribution and constant variance of residuals.\n\nlme4nlme\n\n\n\ncheck_model(model_icbd, check = c('normality', 'linearity'))\n\n\n\n\n\n\n\n\n\n\n\ncheck_model(model_icbd1, check = c('normality', 'linearity'))\n\n\n\n\n\n\n\n\n\n\n\n\n\nHere we observed a right skewness in residuals, this can be resolved by using data transformation e.g. log transformation of response variable. 
Please refer to chapter to read more about data transformation.\n\n\n9.2.1.4 Inference\nWe can extract information about ANOVA using anova().\n\nlme4nlme\n\n\n\nanova(model_icbd, type = \"1\")\n\nType I Analysis of Variance Table with Satterthwaite's method\n    Sum Sq Mean Sq NumDF  DenDF F value    Pr(&gt;F)    \ngen 1901.1  63.369    30 129.06  17.675 &lt; 2.2e-16 ***\n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n\n\n\n\nanova(model_icbd1, type = \"sequential\")\n\n            numDF denDF  F-value p-value\n(Intercept)     1   125 4042.016  &lt;.0001\ngen            30   125   17.675  &lt;.0001\n\n\n\n\n\nLet’s look at the estimated marginal means of yield for each variety (gen).\n\nlme4nlme\n\n\n\nemmeans(model_icbd, ~ gen)\n\n gen emmean    SE  df lower.CL upper.CL\n G01   24.6 0.923 153     22.7     26.4\n G02   27.0 0.923 153     25.2     28.8\n G03   32.6 0.923 153     30.8     34.4\n G04   26.9 0.923 153     25.1     28.8\n G05   26.2 0.923 153     24.4     28.0\n G06   32.0 0.923 153     30.1     33.8\n G07   24.2 0.923 153     22.3     26.0\n G08   27.6 0.923 153     25.8     29.4\n G09   29.4 0.923 153     27.6     31.2\n G10   24.5 0.923 153     22.7     26.4\n G11   27.1 0.923 153     25.2     28.9\n G12   29.3 0.923 153     27.4     31.1\n G13   29.9 0.923 153     28.1     31.8\n G14   24.2 0.923 153     22.4     26.1\n G15   26.1 0.923 153     24.3     27.9\n G16   25.9 0.923 153     24.1     27.8\n G17   19.7 0.923 153     17.9     21.5\n G18   25.7 0.923 153     23.9     27.5\n G19   29.0 0.923 153     27.2     30.9\n G20   33.2 0.923 153     31.3     35.0\n G21   31.1 0.923 153     29.3     32.9\n G22   25.2 0.923 153     23.3     27.0\n G23   29.8 0.923 153     28.0     31.6\n G24   33.6 0.923 153     31.8     35.5\n G25   27.0 0.923 153     25.2     28.8\n G26   27.1 0.923 153     25.3     29.0\n G27   23.8 0.923 153     22.0     25.6\n G28   26.5 0.923 153     24.6     28.3\n G29   24.8 0.923 153     22.9     26.6\n 
G30   36.2 0.923 153     34.4     38.0\n G31   27.1 0.923 153     25.3     28.9\n\nDegrees-of-freedom method: kenward-roger \nConfidence level used: 0.95 \n\n\n\n\n\nemmeans(model_icbd1, ~ gen)\n\n gen emmean    SE df lower.CL upper.CL\n G01   24.6 0.922 30     22.7     26.5\n G02   27.0 0.922 30     25.1     28.9\n G03   32.6 0.922 30     30.7     34.5\n G04   26.9 0.922 30     25.1     28.8\n G05   26.2 0.922 30     24.3     28.1\n G06   32.0 0.922 30     30.1     33.8\n G07   24.2 0.922 30     22.3     26.0\n G08   27.6 0.922 30     25.7     29.5\n G09   29.4 0.922 30     27.5     31.3\n G10   24.5 0.922 30     22.6     26.4\n G11   27.1 0.922 30     25.2     28.9\n G12   29.3 0.922 30     27.4     31.1\n G13   29.9 0.922 30     28.1     31.8\n G14   24.2 0.922 30     22.4     26.1\n G15   26.1 0.922 30     24.2     28.0\n G16   25.9 0.922 30     24.0     27.8\n G17   19.7 0.922 30     17.8     21.6\n G18   25.7 0.922 30     23.8     27.6\n G19   29.0 0.922 30     27.2     30.9\n G20   33.2 0.922 30     31.3     35.0\n G21   31.1 0.922 30     29.2     33.0\n G22   25.2 0.922 30     23.3     27.1\n G23   29.8 0.922 30     27.9     31.7\n G24   33.6 0.922 30     31.8     35.5\n G25   27.0 0.922 30     25.1     28.9\n G26   27.1 0.922 30     25.3     29.0\n G27   23.8 0.922 30     21.9     25.7\n G28   26.5 0.922 30     24.6     28.4\n G29   24.8 0.922 30     22.9     26.6\n G30   36.2 0.922 30     34.3     38.1\n G31   27.1 0.922 30     25.2     29.0\n\nDegrees-of-freedom method: containment \nConfidence level used: 0.95 \n\n\n\n\n\n\n\n\n9.2.2 Partially Balanced IBD (Alpha Lattice Design)\nThe statistical model for partially balanced design includes:\n\\[y_{ij(l)} = \\mu + \\alpha_i + \\beta_{i(l)} + \\tau_j + \\epsilon_{ij(l)}\\]\nWhere:\n\\(\\mu\\) = overall experimental mean\n\\(\\alpha\\) = replicate effect (random)\n\\(\\beta\\) = incomplete block effect (random)\n\\(\\tau\\) = treatment effect (fixed)\n\\(\\epsilon_{ij(l)}\\) = intra-block residual\nThe 
data used in this example is published in Cyclic and Computer Generated Designs (John and Williams 1995). The trial was laid out in an alpha lattice design. This trial data had 24 genotypes (“gen”), 6 incomplete blocks, each replicated 3 times.\nLet’s start analyzing this example first by loading the required libraries for linear mixed models:\n\nlme4nlme\n\n\n\nlibrary(lme4); library(lmerTest); library(emmeans)\nlibrary(dplyr); library(broom.mixed); library(performance)\n\n\n\n\nlibrary(nlme); library(broom.mixed); library(emmeans)\nlibrary(dplyr); library(performance)\n\n\n\n\n\ndata1 &lt;- agridat::john.alpha\n\n\nTable of variables in the data set\n\n\nblock\nincomplete blocking unit\n\n\ngen\ngenotype (variety) factor\n\n\nrow\nrow position for each plot\n\n\ncol\ncolumn position for each plot\n\n\nyield\ngrain yield in tonnes/ha\n\n\n\n\n\n\n\n\n\n\n\n\n\n9.2.2.1 Data integrity checks\nLet’s look into the structure of the data first to verify the class of the variables.\n\nstr(data1)\n\n'data.frame':   72 obs. of  7 variables:\n $ plot : int  1 2 3 4 5 6 7 8 9 10 ...\n $ rep  : Factor w/ 3 levels \"R1\",\"R2\",\"R3\": 1 1 1 1 1 1 1 1 1 1 ...\n $ block: Factor w/ 6 levels \"B1\",\"B2\",\"B3\",..: 1 1 1 1 2 2 2 2 3 3 ...\n $ gen  : Factor w/ 24 levels \"G01\",\"G02\",\"G03\",..: 11 4 5 22 21 10 20 2 23 14 ...\n $ yield: num  4.12 4.45 5.88 4.58 4.65 ...\n $ row  : int  1 2 3 4 5 6 7 8 9 10 ...\n $ col  : int  1 1 1 1 1 1 1 1 1 1 ...\n\n\nNext step is to evaluate the independent variables. 
First, check the number of observations per treatment (each treatment should be replicated 3 times).\n\nagg_tbl &lt;- data1 %&gt;% group_by(gen) %&gt;% \n  summarise(total_count=n(),\n            .groups = 'drop')\nagg_tbl\n\n# A tibble: 24 × 2\n   gen   total_count\n   &lt;fct&gt;       &lt;int&gt;\n 1 G01             3\n 2 G02             3\n 3 G03             3\n 4 G04             3\n 5 G05             3\n 6 G06             3\n 7 G07             3\n 8 G08             3\n 9 G09             3\n10 G10             3\n# ℹ 14 more rows\n\n\nThis looks balanced, as expected.\nAlso, let’s have a look at the number of observations per block.\n\nagg_blk &lt;- aggregate(data1$gen, by=list(data1$block), FUN=length)\nagg_blk\n\n  Group.1  x\n1      B1 12\n2      B2 12\n3      B3 12\n4      B4 12\n5      B5 12\n6      B6 12\n\n\nEach block label has 12 observations (4 plots in each of the 3 replicates), so every incomplete block contains the same number of treatments.\nLastly, before fitting the model, it’s a good idea to look at the distribution of the dependent variable, yield.\n\n\n\n\n\n\n\n\n\nFigure 9.2: Histogram of the dependent variable.\n\n\n\n\n\nhist(data1$yield, main = \"\", xlab = \"yield\")\n\nThe response variable seems to follow a normal distribution curve, with fewer values at the extreme lower and higher ends.\n\n\n9.2.2.2 Model Building\n\nlme4nlme\n\n\n\nmod_alpha &lt;- lmer(yield ~ gen + (1|rep/block),\n                   data = data1, \n                   na.action = na.exclude)\ntidy(mod_alpha)\n\n# A tibble: 27 × 8\n   effect group term        estimate std.error statistic    df     p.value\n   &lt;chr&gt;  &lt;chr&gt; &lt;chr&gt;          &lt;dbl&gt;     &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;       &lt;dbl&gt;\n 1 fixed  &lt;NA&gt;  (Intercept)   5.11       0.276    18.5    6.19 0.00000118 \n 2 fixed  &lt;NA&gt;  genG02       -0.629      0.269    -2.34  38.2  0.0248     \n 3 fixed  &lt;NA&gt;  genG03       -1.61       0.268    -6.00  37.7  0.000000590\n 4 fixed  &lt;NA&gt;  
genG04       -0.618      0.268    -2.30  37.7  0.0269     \n 5 fixed  &lt;NA&gt;  genG05       -0.0705     0.258    -0.274 34.8  0.786      \n 6 fixed  &lt;NA&gt;  genG06       -0.571      0.268    -2.13  37.7  0.0398     \n 7 fixed  &lt;NA&gt;  genG07       -0.997      0.258    -3.87  34.8  0.000457   \n 8 fixed  &lt;NA&gt;  genG08       -0.580      0.268    -2.16  37.7  0.0370     \n 9 fixed  &lt;NA&gt;  genG09       -1.61       0.258    -6.21  35.3  0.000000390\n10 fixed  &lt;NA&gt;  genG10       -0.735      0.259    -2.83  35.9  0.00754    \n# ℹ 17 more rows\n\n\n\n\n\nmod_alpha1 &lt;- lme(yield ~ gen,\n                  random = ~ 1|rep/block,\n                  data = data1, \n                  na.action = na.exclude)\ntidy(mod_alpha1)\n\nWarning in tidy.lme(mod_alpha1): ran_pars not yet implemented for multiple\nlevels of nesting\n\n\n# A tibble: 24 × 7\n   effect term        estimate std.error    df statistic  p.value\n   &lt;chr&gt;  &lt;chr&gt;          &lt;dbl&gt;     &lt;dbl&gt; &lt;dbl&gt;     &lt;dbl&gt;    &lt;dbl&gt;\n 1 fixed  (Intercept)   5.11       0.276    31    18.5   2.63e-18\n 2 fixed  genG02       -0.629      0.269    31    -2.34  2.61e- 2\n 3 fixed  genG03       -1.61       0.268    31    -6.00  1.23e- 6\n 4 fixed  genG04       -0.618      0.268    31    -2.30  2.81e- 2\n 5 fixed  genG05       -0.0705     0.258    31    -0.274 7.86e- 1\n 6 fixed  genG06       -0.571      0.268    31    -2.13  4.12e- 2\n 7 fixed  genG07       -0.997      0.258    31    -3.87  5.23e- 4\n 8 fixed  genG08       -0.580      0.268    31    -2.16  3.84e- 2\n 9 fixed  genG09       -1.61       0.258    31    -6.21  6.71e- 7\n10 fixed  genG10       -0.735      0.259    31    -2.83  8.05e- 3\n# ℹ 14 more rows\n\n\n\n\n\n\n\n9.2.2.3 Check Model Assumptions\nLet’s verify the assumption of linear mixed models including normal distribution and constant variance of residuals.\n\nlme4nlme\n\n\n\ncheck_model(mod_alpha, check = c('normality', 
'linearity'))\n\n\n\n\n\n\n\n\n\n\n\ncheck_model(mod_alpha1, check = c('normality', 'linearity'))\n\n\n\n\n\n\n\n#check_model(model_lme, check = c('normality', 'linearity'))\n\n\n\n\n\n\n9.2.2.4 Inference\nLet’s generate the ANOVA table using anova() for the lmer and lme models, respectively.\n\nlme4nlme\n\n\n\nanova(mod_alpha, type = \"1\")\n\nType I Analysis of Variance Table with Satterthwaite's method\n    Sum Sq Mean Sq NumDF  DenDF F value    Pr(&gt;F)    \ngen 10.679 0.46429    23 34.902  5.4478 4.229e-06 ***\n---\nSignif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n\n\n\n\n\nanova(mod_alpha1, type = \"sequential\")\n\n            numDF denDF  F-value p-value\n(Intercept)     1    31 470.9507  &lt;.0001\ngen            23    31   5.4478  &lt;.0001\n\n\n\n\n\nLet’s look at the estimated marginal means of yield for each variety (gen).\n\nlme4nlme\n\n\n\nemmeans(mod_alpha, ~ gen)\n\n gen emmean    SE   df lower.CL upper.CL\n G01   5.11 0.279 6.20     4.43     5.78\n G02   4.48 0.279 6.20     3.80     5.15\n G03   3.50 0.279 6.20     2.82     4.18\n G04   4.49 0.279 6.20     3.81     5.17\n G05   5.04 0.278 6.19     4.36     5.71\n G06   4.54 0.278 6.19     3.86     5.21\n G07   4.11 0.279 6.20     3.43     4.79\n G08   4.53 0.279 6.20     3.85     5.20\n G09   3.50 0.278 6.19     2.83     4.18\n G10   4.37 0.279 6.20     3.70     5.05\n G11   4.28 0.279 6.20     3.61     4.96\n G12   4.76 0.279 6.20     4.08     5.43\n G13   4.76 0.278 6.19     4.08     5.43\n G14   4.78 0.278 6.19     4.10     5.45\n G15   4.97 0.278 6.19     4.29     5.65\n G16   4.73 0.279 6.20     4.05     5.41\n G17   4.60 0.278 6.19     3.93     5.28\n G18   4.36 0.279 6.20     3.69     5.04\n G19   4.84 0.278 6.19     4.16     5.52\n G20   4.04 0.278 6.19     3.36     4.72\n G21   4.80 0.278 6.19     4.12     5.47\n G22   4.53 0.278 6.19     3.85     5.20\n G23   4.25 0.278 6.19     3.58     4.93\n G24   4.15 0.279 6.20     3.48     4.83\n\nDegrees-of-freedom method: kenward-roger 
\nConfidence level used: 0.95 \n\n\n\n\n\nemmeans(mod_alpha1, ~ gen)\n\n gen emmean    SE df lower.CL upper.CL\n G01   5.11 0.276  2     3.92     6.30\n G02   4.48 0.276  2     3.29     5.67\n G03   3.50 0.276  2     2.31     4.69\n G04   4.49 0.276  2     3.30     5.68\n G05   5.04 0.276  2     3.85     6.22\n G06   4.54 0.276  2     3.35     5.72\n G07   4.11 0.276  2     2.92     5.30\n G08   4.53 0.276  2     3.34     5.72\n G09   3.50 0.276  2     2.31     4.69\n G10   4.37 0.276  2     3.19     5.56\n G11   4.28 0.276  2     3.10     5.47\n G12   4.76 0.276  2     3.57     5.94\n G13   4.76 0.276  2     3.57     5.95\n G14   4.78 0.276  2     3.59     5.96\n G15   4.97 0.276  2     3.78     6.16\n G16   4.73 0.276  2     3.54     5.92\n G17   4.60 0.276  2     3.42     5.79\n G18   4.36 0.276  2     3.17     5.55\n G19   4.84 0.276  2     3.65     6.03\n G20   4.04 0.276  2     2.85     5.23\n G21   4.80 0.276  2     3.61     5.98\n G22   4.53 0.276  2     3.34     5.72\n G23   4.25 0.276  2     3.06     5.44\n G24   4.15 0.276  2     2.97     5.34\n\nDegrees-of-freedom method: containment \nConfidence level used: 0.95 \n\n\n\n\n\n\n\n\n\nJohn, JA, and ER Williams. 1995. Cyclic and Computer Generated Designs. 2nd ed. New York: Chapman; Hall/CRC Press. https://doi.org/10.1201/b15075.\n\n\nPatterson, H. D., and E. R. Williams. 1976. “A New Class of Resolvable Incomplete Block Designs.” Biometrika 63 (1): 83–92. https://doi.org/10.2307/2335087.\n\n\nYates, F. 1936. “A New Method of Arranging Variety Trials Involving a Large Number of Varieties.” J Agric Sci 26: 424–55.",
+    "crumbs": [
+      "Experiment designs",
+      "<span class='chapter-number'>9</span>  <span class='chapter-title'>Incomplete Block Design</span>"
+    ]
   }
 ]
\ No newline at end of file