Commit d26f6f6

New spurious correlation example
1 parent 935f992 commit d26f6f6

4 files changed: +12 -4 lines changed

images/lyme_and_fried_chicken.png

75.8 KB

images/lyme_and_fried_chicken_map.png

727 KB

modules/Statistics/Statistics.Rmd

Lines changed: 11 additions & 3 deletions
@@ -216,11 +216,12 @@ corrplot(cor_mat)
 
 ## Correlation does not imply causation
 
-```{r, fig.alt="The End", out.width = "90%", echo = FALSE, fig.align='center'}
-knitr::include_graphics("https://www.opencasestudies.org/ocs-bp-co2-emissions/img/causation.png")
+```{r, fig.alt="Simpson's paradox!", out.width = "90%", echo = FALSE, fig.align='center'}
+knitr::include_graphics(here::here("images/lyme_and_fried_chicken_map.png"))
+knitr::include_graphics(here::here("images/lyme_and_fried_chicken.png"))
 ```
 
-[source](http://tylervigen.com/spurious-correlations)
+[source](http://doi.org/10.1007/s10393-020-01472-1)
 
 
 # T-test
@@ -308,6 +309,13 @@ See [here](https://www.nature.com/articles/nbt1209-1135) for more about multiple
 - `chisq.test()` -- Chi-squared test
 - `aov()` -- Analysis of Variance (ANOVA)
 
+## Summary
+
+- Use `cor()` to calculate correlation between two vectors, `cor.test()` can give more information.
+- `corrplot()` is nice for a quick visualization!
+- `t.test()` one sample test to test the difference in mean of a single vector from zero (one input)
+- `t.test()` two sample test to test the difference in means between two vectors (two inputs)
+
 ## Lab Part 1
 
 🏠 [Class Website](https://jhudatascience.org/intro_to_r/)
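
As a rough illustration of the functions named in the new Summary section, the sketch below runs `cor()`, `cor.test()`, and both forms of `t.test()` on simulated vectors. The data and the variable names (`x`, `y`, `ct`, `one_sample`, `two_sample`) are invented for this example and are not part of the course materials.

```r
# Simulated data for illustration only (not from the course datasets)
set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50)

# cor() returns a single number (Pearson correlation by default)
r <- cor(x, y)

# cor.test() returns the same estimate plus a confidence interval and p-value
ct <- cor.test(x, y)
ct$estimate
ct$conf.int
ct$p.value

# One input: tests whether the mean of x differs from zero
one_sample <- t.test(x)

# Two inputs: tests whether the means of x and y differ (Welch test by default)
two_sample <- t.test(x, y)
one_sample$p.value
two_sample$p.value
```

`corrplot()` is left out of this sketch because it needs the corrplot package and a correlation matrix, such as the `cor_mat` object used earlier in the module.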

modules/Statistics/lab/Statistics_Lab.Rmd

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ library(broom)
 
 ```
 
-3. Compute the correlation (with `cor`) between the `1980`, `1990`, `2000`, and `2010` mortality variables. (No need to save this in an object. Just display the result to the screen.) Use `select()` function to first subset the data frame to keep the four columns only. To use a column name in `select()` that starts with a number, surround it with backticks (\`). How does this change when we use the `use = "complete.obs"` argument?
+3. Compute the correlation (with `cor`) between the `1980`, `1990`, `2000`, and `2010` mortality variables. (No need to save this in an object. Just display the result to the screen.) Use `select()` function to first subset the data frame to keep the four columns only. To use a column name in `select()` that starts with a number, surround it with backticks. How does this change when we use the `use = "complete.obs"` argument?
 
 ```{r}
 
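
The backtick quoting and the `use = "complete.obs"` argument mentioned in question 3 can be sketched on a small made-up table. The `mort` tibble and its values below are hypothetical and only mimic the lab's year-named columns. The example assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data; column names start with a digit, so they need backticks
mort <- tibble(
  `1980` = c(10, 12, NA, 15, 20),
  `1990` = c(9, 11, 13, 14, 18),
  `2000` = c(8, NA, 12, 13, 17),
  `2010` = c(7, 9, 11, 12, 15)
)

# Backticks let select() refer to names that are not valid R identifiers
sub <- mort %>% select(`1980`, `1990`, `2000`, `2010`)

# Default: a correlation between two different columns is NA
# whenever either column contains a missing value
cor_default <- cor(sub)

# complete.obs: rows with any NA are dropped before computing correlations
cor_complete <- cor(sub, use = "complete.obs")

cor_default
cor_complete
```

In this toy table only three rows are complete, so the second matrix is computed from those three rows alone.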
0 commit comments