-
Notifications
You must be signed in to change notification settings - Fork 0
/
ancova presentation Audrey.Rmd
218 lines (191 loc) · 8.59 KB
/
ancova presentation Audrey.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
---
title: "Replication of Ancova studies"
author: "Audrey Yeo Te-ying"
date: "`r date()`"
output: ioslides_presentation
---
```{r, eval=TRUE, echo=FALSE, include = FALSE}
#Setting up libraries
library(readxl)
library(foreign)
library(readxl)
library(readstata13)
library(car) # for levene's test
library(effects)# for adjusted means
library(ggplot2) # for graphs
library(multcomp) #for post hoc test
library(pastecs)# for descriptive statistics
library(reshape)
library(effects)# for effect command
library(ggplot2)
library(beeswarm)
```
```{r setup, echo=FALSE, eval=TRUE, include= FALSE}
knitr::opts_chunk$set(echo = FALSE)
#--------Viagra data----------
libido<-c(3,2,5,2,2,2,7,2,4,7,5,3,4,4,7,5,4,9,2,6,3,4,4,4,6,4,6,2,8,5)
partnerLibido<-c(4,1,5,1,2,2,7,4,5,5,3,1,2,2,6,4,2,1,3,5,4,3,3,2,0,1,3,0,1,0)
dose<-c(rep(1,9),rep(2,8), rep(3,13))
dose<-factor(dose, levels = c(1:3), labels = c("Placebo", "Low Dose", "High Dose"))
viagraData<-data.frame(dose, libido, partnerLibido)
```
# Ancova
## Ancova is not Anova
<p align="left">
<b> Definitions </b>
</p>
<p align="left">
- The rationale for ANCOVA is two folds:
1. Reduce within-group error variance
2. Eliminate confounders
- An ANOVA is a type of linear regression model, an ANCOVA is an ANOVA, which accounts for potential confounders which are usually continuous variables
- An ANCOVA model is the analysis of covariance where the variance of the outcome is explained by discrete variable (Independant variable) and covariates which are continuous variables.
</p>
## Motivation
<b> Libido Study </b>
<p align="left">
- Initial one-way ANOVA show that there is no evidence of significant relationship between dose and libido, motivates to evaluate its underlying covariances to answer :
- What is the variance explained by three doses of Viagra on Libido?
- And what is the variance explained by Partner Libido on Libido.
</p>
```{r, include = TRUE, echo = TRUE, eval = TRUE, tidy = TRUE}
first_mod <- aov( libido ~dose, data=viagraData)
summary(first_mod)
```
## Explore Data
<p align="left">
- Boxplot shows: Does different doses produce different outcomes
</p>
```{r, echo= FALSE, include = TRUE, eval = TRUE, tidy = TRUE}
#viagraData
par(mfrow=c(1,2))
beeswarm(libido~dose, data= viagraData, main = "Libido", pch = 16, col = rainbow(8))
boxplot(libido~dose, data= viagraData, outline = FALSE, main = "Libido") # add = TRUE sometimes work to superimpose both plots
beeswarm(partnerLibido~dose, data= viagraData, pch = 16, col = rainbow(8), main = "Partner Libido")
boxplot(partnerLibido ~ dose, data = viagraData, outline = FALSE, main = "Partner Libido", add = TRUE) # add = TRUE sometimes work to superimpose both plots
```
## Explore Data : Descriptives
<p align="left">
- Descriptive Statistics
</p>
```{r, include = TRUE, echo = TRUE, eval = TRUE, tidy = TRUE}
by(viagraData$libido, viagraData$dose, stat.desc)
by(viagraData$partnerLibido, viagraData$dose, stat.desc)
```
## Explore Data : Graphical analysis for homogeneity of regression slopes
<p align="left">
- Say we want to know the overall relationship between Viagra and Partner libido, ignoring which dose groups the data points belong to, we therefore assume that the relationship between dose group and partner libido is constant across all doses. If the latter is not true, then the overall relationship between Viagra and Partner libido is inaccurate.
- there is homogeneity of regression slopes for Placebo and Lower doses
- there is no homogeneity of regression slopes for Placebo and High dose
- That's ok because there are situations where you might actually expect regression slopes to differ across groups and that this is, in itself, an interesting hypothesis.
</p>
## Graphical Anaylsis for homogeneity of regression slopes
```{r, include = TRUE, echo = FALSE, eval = TRUE, tidy = TRUE}
ggplot(viagraData, aes(partnerLibido, libido, colour = dose)) +
geom_jitter(aes(colour = factor(dose))) +
geom_smooth(method = lm)
```
## Assumption 1 : Homogeneity of variance
<p align="left">
- Levene's Test: Are the variances between doses on libido very similar ?
- A good double-check of Levene’s test is to look at the highest and lowest variances. For our three groups we have standard deviations of 1.79 (placebo), 1.46 (low dose) and 2.12 (high dose)
</p>
```{r, include = TRUE, echo = TRUE, eval = TRUE, tidy = TRUE}
leveneTest(viagraData$libido, viagraData$dose, center = median)
```
## Assumption 2 : Check that the covariate and any independent variables are independent
<p align="left">
- We run an ANOVA to find out if the covariance independant to each of the three doses?
- The summary results show that on low and high doses, there is no evidence of significant association of these doses to partner libido.
</p>
```{r, eval = TRUE, echo = TRUE}
checkIndependenceModel<-aov(partnerLibido ~ dose, data = viagraData)
summary(checkIndependenceModel)
summary.lm(checkIndependenceModel)
```
## Running the ANCOVA
<p align="left">
</p>
```{r, include = TRUE, echo = TRUE, eval = TRUE, tidy = TRUE}
ancova_mod <- aov(libido ~dose + partnerLibido, data= viagraData)
summary(ancova_mod)
```
## Interpreting the main effects of ANCOVA
<p align="left">
- We run the ANCOVA model and compute the variance per dose group
- The command aov will calculate the ANCOVA, though it is also used with ANOVA
- The Anova command shows the same output of the summary.lm command. The former specifies the Type III sum of squares used to compute the analyses variances
- The Anova results of the viagraModel here show that there is evidence of significant effect of partner libido on libido.
</p>
## Main effects of ANCOVA
```{r, include = TRUE, echo = FALSE, eval = TRUE, tidy = TRUE}
contrasts(viagraData$dose)<-cbind(c(-2,1,1), c(0,-1,1))
viagraModel<-aov(libido~ partnerLibido + dose, data = viagraData)
summary.lm(viagraModel)
Anova(viagraModel, type="III")
#
# object<-effect(partnerlibido, viagraModel, se=TRUE)
# summary(object)
# object$se
#
# adjustedMeans<-effect("dose", viagraModel, se=TRUE)
# summary(adjustedMeans)
# adjustedMeans$se
```
## Main effects of ANCOVA II
```{r, include = TRUE, echo = FALSE, eval = TRUE, tidy = TRUE}
Anova(viagraModel, type="III")
#
# object<-effect(partnerlibido, viagraModel, se=TRUE)
# summary(object)
# object$se
#
# adjustedMeans<-effect("dose", viagraModel, se=TRUE)
# summary(adjustedMeans)
# adjustedMeans$se
```
## Residuals vs Fitted : Testing for Homogeneity of Variance
<p align="left">
- The residuals for fitted value are similar to zero
</p>
```{r, include = TRUE, echo = FALSE, eval = TRUE, tidy= TRUE, fig.width = 6, fig.height = 5}
plot(viagraModel, which = 1)
```
## QQ plots
<p align="left">
- Residuals are normally distributed
- estimate in each case is the difference between the adjusted group means
</p>
```{r, include = TRUE, echo = FALSE, eval = TRUE, tidy =TRUE, fig.width = 4, fig.height = 5}
plot(viagraModel, which = 2)
```
## Post Hoc Tests
<p align="left">
- A Tukey-Ascombe test is performed via a general linear hypotheses function from library(multcomp).
- It is used for a pairwise comparison t test between doses for between groups differences
- In other words: we want to test differences between the adjusted means.
- In this case, the differences between Higher dose and Placebo can be compared with evidence of significance.
- In this method, we are restricted to the Tukey and the Dunnett method.
</p>
```{r, include = TRUE, echo = TRUE, eval = TRUE}
postHocs<-glht(viagraModel, linfct = mcp(dose = "Tukey"))
summary(postHocs)
confint(postHocs)
```
## Going back to Basic Assumptions : Homogeneity of Regression Slopes
<p align="left">
- "The significant between interaction effect and dose show that the assumption is not tenable..
- "Although this finding is not surprising given the pattern of relationships, it does raise concern about the main analysis. This example illustrates why it is important to test assumptions and not to just blindly accept the results of an analysis"
</p>
```{r, include = TRUE, echo = TRUE, eval = TRUE}
hoRS<-aov(libido ~ partnerLibido*dose, data = viagraData)
hoRS<-update(viagraModel, .~. + partnerLibido:dose)
Anova(hoRS, type="III")
```
## Summary and Conclusion
<p align="left">
- ANOVA is not ANCOVA because the latter accounts for analysis of variance of the covariate or covariates
- Interpretation of ANCOVA depends on the order in the aov(...) function
- Basic Assumptions are important because for example the overall conclusion of the relationship between Viagra and Libido must be based on an independance assumption between covariates and dose of Viagra
- Thus the conclusion that Partner libido has a significant influence on Libido under three condition of Viagra dose, is inaccurate.
</p>