diff --git a/content/notes/gapminder/index.Rmd b/content/notes/gapminder/index.Rmd index f6739410..7af7744f 100644 --- a/content/notes/gapminder/index.Rmd +++ b/content/notes/gapminder/index.Rmd @@ -177,25 +177,27 @@ ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = conti Why use `facet_grid()` here instead of `facet_wrap()`? Good question! Let's reframe it and instead ask, what is the difference between `facet_grid()` and `facet_wrap()`?^[Example drawn from [this StackOverflow thread](https://stackoverflow.com/questions/20457905/whats-the-difference-between-facet-wrap-and-facet-grid-in-ggplot2).] -The answer below refers to the case when you have 2 arguments in `facet_grid()` or `facet_wrap()`. `facet_grid(x ~ y)` will display $x \times y$ plots even if some plots are empty. For example: +The answer below refers to the case when you have 2 arguments in `facet_grid()` or `facet_wrap()`. `facet_grid(rows = vars(x), cols = vars(y))` will display $y \times x$ plots even if some plots are empty. For example: ```{r facet-grid} -ggplot(mpg, aes(displ, hwy)) + +library(palmerpenguins) + +ggplot(data = penguins, aes(x = bill_length_mm, y = body_mass_g)) + geom_point() + - facet_grid(rows = vars(cyl), cols = vars(class)) + facet_grid(rows = vars(species), cols = vars(island)) ``` -There are 4 distinct `cyl` and 7 distinct `class` values. This plot displays $4 \times 7 = 28$ plots, even if some are empty (because some classes do not have corresponding cylinder values, like rows with `class = "midsize"` doesn't have any corresponding `cyl = 5` value ). +There are 3 distinct `species` and `island` values. This plot displays $3 \times 3 = 9$ plots, even if some are empty (for example, Chinstrap penguins were not observed on Biscoe Island). -`facet_wrap(facets = vars(cyl, class))` displays only the plots having actual values. +`facet_wrap(facets = vars(species, island))` displays only the plots having actual values. 
```{r facet-wrap} -ggplot(mpg, aes(displ, hwy)) + +ggplot(data = penguins, aes(x = bill_length_mm, y = body_mass_g)) + geom_point() + - facet_wrap(facets = vars(cyl, class)) + facet_wrap(facets = vars(species, island)) ``` -There are 19 plots displayed now, one for every combination of `cyl` and `class`. So for this exercise, I would use `facet_wrap()` because we are faceting on a single variable. If we faceted on multiple variables, `facet_grid()` may be more appropriate. +There are 5 plots displayed now, one for every combination of `species` and `island`. So for this exercise, I would use `facet_wrap()` because we are faceting on a single variable. If we faceted on multiple variables, `facet_grid()` may be more appropriate. {{< /spoiler >}} diff --git a/content/notes/gapminder/index.md b/content/notes/gapminder/index.md index 6cf7440e..2196df66 100644 --- a/content/notes/gapminder/index.md +++ b/content/notes/gapminder/index.md @@ -240,31 +240,41 @@ ggplot(data = gapminder, mapping = aes(x = gdpPercap, y = lifeExp, color = conti Why use `facet_grid()` here instead of `facet_wrap()`? Good question! Let's reframe it and instead ask, what is the difference between `facet_grid()` and `facet_wrap()`?^[Example drawn from [this StackOverflow thread](https://stackoverflow.com/questions/20457905/whats-the-difference-between-facet-wrap-and-facet-grid-in-ggplot2).] -The answer below refers to the case when you have 2 arguments in `facet_grid()` or `facet_wrap()`. `facet_grid(x ~ y)` will display $x \times y$ plots even if some plots are empty. For example: +The answer below refers to the case when you have 2 arguments in `facet_grid()` or `facet_wrap()`. `facet_grid(rows = vars(x), cols = vars(y))` will display $y \times x$ plots even if some plots are empty. 
For example: ```r -ggplot(mpg, aes(displ, hwy)) + +library(palmerpenguins) + +ggplot(data = penguins, aes(x = bill_length_mm, y = body_mass_g)) + geom_point() + - facet_grid(rows = vars(cyl), cols = vars(class)) + facet_grid(rows = vars(species), cols = vars(island)) +``` + +``` +## Warning: Removed 2 rows containing missing values (geom_point). ``` -There are 4 distinct `cyl` and 7 distinct `class` values. This plot displays $4 \times 7 = 28$ plots, even if some are empty (because some classes do not have corresponding cylinder values, like rows with `class = "midsize"` doesn't have any corresponding `cyl = 5` value ). +There are 3 distinct `species` and `island` values. This plot displays $3 \times 3 = 9$ plots, even if some are empty (for example, Chinstrap penguins were not observed on Biscoe Island). -`facet_wrap(facets = vars(cyl, class))` displays only the plots having actual values. +`facet_wrap(facets = vars(species, island))` displays only the plots having actual values. ```r -ggplot(mpg, aes(displ, hwy)) + +ggplot(data = penguins, aes(x = bill_length_mm, y = body_mass_g)) + geom_point() + - facet_wrap(facets = vars(cyl, class)) + facet_wrap(facets = vars(species, island)) +``` + +``` +## Warning: Removed 2 rows containing missing values (geom_point). ``` -There are 19 plots displayed now, one for every combination of `cyl` and `class`. So for this exercise, I would use `facet_wrap()` because we are faceting on a single variable. If we faceted on multiple variables, `facet_grid()` may be more appropriate. +There are 5 plots displayed now, one for every combination of `species` and `island`. So for this exercise, I would use `facet_wrap()` because we are faceting on a single variable. If we faceted on multiple variables, `facet_grid()` may be more appropriate. 
{{< /spoiler >}} @@ -309,89 +319,95 @@ devtools::session_info() ## collate en_US.UTF-8 ## ctype en_US.UTF-8 ## tz America/Chicago -## date 2021-10-11 +## date 2022-01-06 ## ## ─ Packages ─────────────────────────────────────────────────────────────────── -## package * version date lib source -## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) -## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) -## blogdown 1.4 2021-07-23 [1] CRAN (R 4.1.0) -## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.0) -## broom 0.7.9 2021-07-27 [1] CRAN (R 4.1.0) -## bslib 0.2.5.1 2021-05-18 [1] CRAN (R 4.1.0) -## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) -## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0) -## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) -## cli 3.0.1 2021-07-17 [1] CRAN (R 4.1.0) -## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) -## crayon 1.4.1 2021-02-08 [1] CRAN (R 4.1.0) -## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) -## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) -## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0) -## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0) -## digest 0.6.27 2020-10-24 [1] CRAN (R 4.1.0) -## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) -## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) -## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) -## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) -## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) -## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) -## fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) -## generics 0.1.0 2020-10-31 [1] CRAN (R 4.1.0) -## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) -## glue 1.4.2 2020-08-27 [1] CRAN (R 4.1.0) -## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) -## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0) -## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.0) -## hms 1.1.0 2021-05-17 [1] CRAN (R 4.1.0) -## htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0) -## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) -## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0) -## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) -## knitr 1.33 2021-04-24 [1] CRAN (R 
4.1.0) -## lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.1.0) -## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) -## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) -## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) -## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) -## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) -## pillar 1.6.2 2021-07-29 [1] CRAN (R 4.1.0) -## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0) -## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) -## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0) -## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) -## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0) -## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0) -## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) -## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) -## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) -## readr * 2.0.1 2021-08-10 [1] CRAN (R 4.1.0) -## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) -## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0) -## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) -## rlang 0.4.11 2021-04-30 [1] CRAN (R 4.1.0) -## rmarkdown 2.10 2021-08-06 [1] CRAN (R 4.1.0) -## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0) -## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) -## rvest 1.0.1 2021-07-26 [1] CRAN (R 4.1.0) -## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0) -## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) -## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) -## stringi 1.7.3 2021-07-16 [1] CRAN (R 4.1.0) -## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) -## testthat 3.0.4 2021-07-01 [1] CRAN (R 4.1.0) -## tibble * 3.1.3 2021-07-23 [1] CRAN (R 4.1.0) -## tidyr * 1.1.3 2021-03-03 [1] CRAN (R 4.1.0) -## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) -## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0) -## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0) -## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0) -## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) -## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) -## withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) -## xfun 0.25 2021-08-06 [1] CRAN (R 4.1.0) -## xml2 1.3.2 2020-04-23 
[1] CRAN (R 4.1.0) -## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) +## package * version date lib source +## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) +## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) +## blogdown 1.7 2021-12-19 [1] CRAN (R 4.1.0) +## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.0) +## broom 0.7.9 2021-07-27 [1] CRAN (R 4.1.0) +## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.0) +## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) +## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0) +## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) +## cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.0) +## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.0) +## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) +## crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.0) +## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) +## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) +## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0) +## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0) +## digest 0.6.28 2021-09-23 [1] CRAN (R 4.1.0) +## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) +## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) +## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) +## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) +## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0) +## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) +## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) +## fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) +## gapminder * 0.3.0 2017-10-31 [1] CRAN (R 4.1.0) +## generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.0) +## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) +## glue 1.5.0 2021-11-07 [1] CRAN (R 4.1.0) +## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) +## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0) +## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.0) +## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) +## hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0) +## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0) +## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) +## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0) +## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) +## knitr 1.33 2021-04-24 [1] CRAN (R 4.1.0) 
+## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0) +## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0) +## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) +## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) +## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) +## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) +## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) +## palmerpenguins * 0.1.0 2020-07-23 [1] CRAN (R 4.1.0) +## pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0) +## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0) +## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) +## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0) +## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) +## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0) +## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0) +## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) +## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) +## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) +## readr * 2.0.2 2021-09-27 [1] CRAN (R 4.1.0) +## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) +## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0) +## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) +## rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0) +## rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.0) +## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0) +## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) +## rvest 1.0.1 2021-07-26 [1] CRAN (R 4.1.0) +## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0) +## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) +## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) +## stringi 1.7.5 2021-10-04 [1] CRAN (R 4.1.0) +## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) +## testthat 3.0.4 2021-07-01 [1] CRAN (R 4.1.0) +## tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.1.0) +## tidyr * 1.1.4 2021-09-27 [1] CRAN (R 4.1.0) +## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) +## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0) +## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0) +## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0) +## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) +## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) +## withr 2.4.2 
2021-04-18 [1] CRAN (R 4.1.0) +## xfun 0.29 2021-12-14 [1] CRAN (R 4.1.0) +## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) +## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) ## ## [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library ``` diff --git a/content/notes/gapminder/index_files/figure-html/facet-grid-1.png b/content/notes/gapminder/index_files/figure-html/facet-grid-1.png index 04f96756..36ad8b32 100644 Binary files a/content/notes/gapminder/index_files/figure-html/facet-grid-1.png and b/content/notes/gapminder/index_files/figure-html/facet-grid-1.png differ diff --git a/content/notes/gapminder/index_files/figure-html/facet-wrap-1.png b/content/notes/gapminder/index_files/figure-html/facet-wrap-1.png index 50638f82..7adcaf28 100644 Binary files a/content/notes/gapminder/index_files/figure-html/facet-wrap-1.png and b/content/notes/gapminder/index_files/figure-html/facet-wrap-1.png differ diff --git a/content/notes/grammar-of-graphics/index.Rmd b/content/notes/grammar-of-graphics/index.Rmd index 42c9ccec..06dc9681 100644 --- a/content/notes/grammar-of-graphics/index.Rmd +++ b/content/notes/grammar-of-graphics/index.Rmd @@ -15,7 +15,7 @@ menu: --- ```{r setup, include = FALSE} -knitr::opts_chunk$set(cache = TRUE) +knitr::opts_chunk$set(cache = TRUE, warning = FALSE, message = FALSE) options(digits = 3) ``` @@ -28,6 +28,7 @@ This page is a summary of [*A Layered Grammar of Graphics*](http://www-tandfonli ```{r packages, cache = FALSE, message = FALSE} library(tidyverse) library(knitr) +library(palmerpenguins) ``` Google defines a **grammar** as "the whole system and structure of a language or of languages in general, usually taken as consisting of syntax and morphology (including inflections) and sometimes also phonology and semantics".^[[Google](https://www.google.com/search?q=grammar)] Others consider a grammar to be "the fundamental principles or rules of an art or science".^[[Wickham, Hadley. (2010) "A Layered Grammar of Graphics". 
*Journal of Computational and Graphical Statistics*, 19(1).](http://www.jstor.org.proxy.uchicago.edu/stable/25651297)] Applied to visualizations, a **grammar of graphics** is a grammar used to describe and create a wide range of statistical graphics.^[[Wilkinson, Leland. (2005). *The Grammar of Graphics*. (UChicago authentication required)](http://link.springer.com.proxy.uchicago.edu/book/10.1007%2F0-387-28695-0)] @@ -81,21 +82,21 @@ tibble( **Data** defines the source of the information to be visualized, but is independent from the other elements. So a layered graphic can be built which utilizes different data sources while keeping the other components the same. -For our running example, let's use the `mpg` dataset in the `ggplot2` package.^[Run `?mpg` in the console to get more information about this dataset.] +For our running example, let's use the `penguins` dataset in the [`palmerpenguins`](https://allisonhorst.github.io/palmerpenguins/) package.^[Run `?penguins` in the console to get more information about this dataset.] -```{r mpg} -head(x = mpg) %>% +```{r penguins} +head(x = penguins) %>% kable(caption = "Dataset of automobiles") ``` -**Mapping** defines how the variables are applied to the plot. So if we were graphing information from `mpg`, we might map a car's engine displacement to the $x$ position and highway mileage to the $y$ position. +**Mapping** defines how the variables are applied to the plot. So if we were graphing information from `penguins`, we might map a penguin's flipper length to the $x$ position and body mass to the $y$ position. 
```{r mapping} -mpg %>% - select(displ, hwy) %>% +penguins %>% + select(flipper_length_mm, body_mass_g) %>% rename( - x = displ, - y = hwy + x = flipper_length_mm, + y = body_mass_g ) ``` @@ -106,15 +107,15 @@ A **statistical transformation** (*stat*) transforms the data, generally by summ A stat is a function that takes in a dataset as the input and returns a dataset as the output; a stat can add new variables to the original dataset, or create an entirely new dataset. So instead of graphing this data in its raw form: ```{r stat_raw} -mpg %>% - select(cyl) +penguins %>% + select(island) ``` You would transform it to: ```{r stat_transform} -mpg %>% - count(cyl) +penguins %>% + count(island) ``` {{% callout note %}} @@ -134,37 +135,37 @@ Sometimes you don't need to make a statistical transformation. For example, in a Each geom can only display certain **aesthetics** or visual attributes of the geom. For example, a point geom has position, color, shape, and size aesthetics. ```{r geom_point} -ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) + geom_point() + ggtitle("A point geom with position and color aesthetics") ``` * Position defines where each point is drawn on the plot -* Color defines the color of each point. Here the color is determined by the class of the car (observation) +* Color defines the color of each point. Here the color is determined by the species of the penguin (observation) Whereas a bar geom has position, height, width, and fill color. ```{r geom_bar} -ggplot(data = mpg, aes(x = cyl)) + +ggplot(data = penguins, aes(x = island)) + geom_bar() + ggtitle("A bar geom with position and height aesthetics") ``` * Position determines the starting location (origin) of each bar -* Height determines how tall to draw the bar. Here the height is based on the number of observations in the dataset for each possible number of cylinders. 
+* Height determines how tall to draw the bar. Here the height is based on the number of observations in the dataset for each island. ## Position adjustment Sometimes with dense data we need to adjust the position of elements on the plot, otherwise data points might obscure one another. Bar plots frequently **stack** or **dodge** the bars to avoid overlap: ```{r position_dodge} -count(x = mpg, class, cyl) %>% - ggplot(mapping = aes(x = cyl, y = n, fill = class)) + +count(x = penguins, species, island) %>% + ggplot(mapping = aes(x = island, y = n, fill = species)) + geom_bar(stat = "identity") + ggtitle(label = "A stacked bar chart") -count(x = mpg, class, cyl) %>% - ggplot(mapping = aes(x = cyl, y = n, fill = class)) + +count(x = penguins, species, island) %>% + ggplot(mapping = aes(x = island, y = n, fill = species)) + geom_bar(stat = "identity", position = "dodge") + ggtitle(label = "A dodged bar chart") ``` @@ -172,11 +173,11 @@ count(x = mpg, class, cyl) %>% Sometimes scatterplots with few unique $x$ and $y$ values are **jittered** (random noise is added) to reduce overplotting. ```{r position} -ggplot(data = mpg, mapping = aes(x = cyl, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = island, y = body_mass_g)) + geom_point() + ggtitle("A point geom with obscured data points") -ggplot(data = mpg, mapping = aes(x = cyl, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = island, y = body_mass_g)) + geom_jitter() + ggtitle("A point geom with jittered data points") ``` @@ -186,21 +187,21 @@ ggplot(data = mpg, mapping = aes(x = cyl, y = hwy)) + A **scale** controls how data is mapped to aesthetic attributes, so we need one scale for every aesthetic property employed in a layer. 
For example, this graph defines a scale for color: ```{r scale_color} -ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) + geom_point() + guides(color = guide_legend(override.aes = list(size = 4))) ``` -Note that the scale is consistent - every point for a compact car is drawn in tan, whereas SUVs are drawn in pink. The scale can be changed to use a different color palette: +Note that the scale is consistent - every point for an Adélie penguin is drawn in red, whereas Chinstrap penguins are drawn in green. The scale can be changed to use a different color palette: ```{r scale_color_palette} -ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) + geom_point() + guides(color = guide_legend(override.aes = list(size = 4))) + scale_color_brewer(palette = "Dark2") ``` -Now we are using a different palette, but the scale is still consistent: all compact cars utilize the same color, whereas SUVs use a new color **but each SUV still uses the same, consistent color**. +Now we are using a different palette, but the scale is still consistent: all Adélie penguins utilize the same color, whereas Chinstrap penguins use a new color **but each Chinstrap penguin still uses the same, consistent color**. ## Coordinate system @@ -236,21 +237,21 @@ p + **Faceting** can be used to split the data up into subsets of the entire dataset. This is a powerful tool when investigating whether patterns are the same or different across conditions, and allows the subsets to be visualized on the same plot (known as **conditioned** or **trellis** plots). The faceting specification describes which variables should be used to split up the data, and how they should be arranged. 
```{r facet} -ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) + geom_point() + - facet_wrap(facets = vars(class)) + facet_wrap(facets = vars(species)) ``` ## Defaults Rather than explicitly declaring each component of a layered graphic (which will use more code and introduces opportunities for errors), we can establish intelligent defaults for specific geoms and scales. For instance, whenever we want to use a bar geom, we can default to using a stat that counts the number of observations in each group of our variable in the $x$ position. -Consider the following scenario: you wish to generate a scatterplot visualizing the relationship between engine displacement size and highway fuel efficiency. With no defaults, the code to generate this graph is: +Consider the following scenario: you wish to generate a scatterplot visualizing the relationship between flipper length and body mass. With no defaults, the code to generate this graph is: ```{r default} ggplot() + layer( - data = mpg, mapping = aes(x = displ, y = hwy), + data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g), geom = "point", stat = "identity", position = "identity" ) + scale_x_continuous() + @@ -262,8 +263,8 @@ The above code: * Creates a new plot object (`ggplot`) * Adds a layer (`layer`) - * Specifies the data (`mpg`) - * Maps engine displacement to the $x$ position and highway mileage to the $y$ position (`mapping`) + * Specifies the data (`penguins`) + * Maps flipper length to the $x$ position and body mass to the $y$ position (`mapping`) * Uses the point geometric transformation (`geom = "point"`) * Implements an identity transformation and position (`stat = "identity"` and `position = "identity"`) * Establishes two continuous position scales (`scale_x_continuous` and `scale_y_continuous`) @@ -282,36 +283,36 @@ Using these defaults, we can rewrite the above code as: ```{r default2} ggplot() + - 
geom_point(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) ``` This generates the exact same plot, but uses fewer lines of code. Because multiple layers can use the same components (data, mapping, etc.), we can also specify that information in the `ggplot()` function rather than in the `layer()` function: ```{r default3} -ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) + geom_point() ``` And as we will learn, function arguments in R use specific ordering, so we can omit the explicit call to `data` and `mapping`: ```{r default4} -ggplot(mpg, aes(displ, hwy)) + +ggplot(penguins, aes(flipper_length_mm, body_mass_g)) + geom_point() ``` With this specification, it is easy to build the graphic up with additional layers, without modifying the original code: ```{r default5} -ggplot(mpg, aes(displ, hwy)) + +ggplot(penguins, aes(flipper_length_mm, body_mass_g)) + geom_point() + geom_smooth() ``` -Because we called `aes(displ, hwy)` within the `ggplot()` function, it is automatically passed along to both `geom_point()` and `geom_smooth()`. If we fail to do this, we get an error: +Because we called `aes(flipper_length_mm, body_mass_g)` within the `ggplot()` function, it is automatically passed along to both `geom_point()` and `geom_smooth()`. 
If we fail to do this, we get an error: ```{r default6, error = TRUE} -ggplot(mpg) + - geom_point(aes(displ, hwy)) + +ggplot(penguins) + + geom_point(aes(flipper_length_mm, body_mass_g)) + geom_smooth() ``` diff --git a/content/notes/grammar-of-graphics/index.md b/content/notes/grammar-of-graphics/index.md index 1719da17..7eee6747 100644 --- a/content/notes/grammar-of-graphics/index.md +++ b/content/notes/grammar-of-graphics/index.md @@ -26,6 +26,7 @@ This page is a summary of [*A Layered Grammar of Graphics*](http://www-tandfonli ```r library(tidyverse) library(knitr) +library(palmerpenguins) ``` Google defines a **grammar** as "the whole system and structure of a language or of languages in general, usually taken as consisting of syntax and morphology (including inflections) and sometimes also phonology and semantics".^[[Google](https://www.google.com/search?q=grammar)] Others consider a grammar to be "the fundamental principles or rules of an art or science".^[[Wickham, Hadley. (2010) "A Layered Grammar of Graphics". *Journal of Computational and Graphical Statistics*, 19(1).](http://www.jstor.org.proxy.uchicago.edu/stable/25651297)] Applied to visualizations, a **grammar of graphics** is a grammar used to describe and create a wide range of statistical graphics.^[[Wilkinson, Leland. (2005). *The Grammar of Graphics*. (UChicago authentication required)](http://link.springer.com.proxy.uchicago.edu/book/10.1007%2F0-387-28695-0)] @@ -59,22 +60,17 @@ The **layered grammar of graphics** approach is implemented in [`ggplot2`](https Layers are typically related to one another and share many common features. For instance, multiple layers can be built using the same underlying data. 
An example would be a scatterplot overlayed with a smoothed regression line to summarize the relationship between two variables: - -``` -## `geom_smooth()` using formula 'y ~ x' -``` - ## Data and mapping **Data** defines the source of the information to be visualized, but is independent from the other elements. So a layered graphic can be built which utilizes different data sources while keeping the other components the same. -For our running example, let's use the `mpg` dataset in the `ggplot2` package.^[Run `?mpg` in the console to get more information about this dataset.] +For our running example, let's use the `penguins` dataset in the [`palmerpenguins`](https://allisonhorst.github.io/palmerpenguins/) package.^[Run `?penguins` in the console to get more information about this dataset.] ```r -head(x = mpg) %>% +head(x = penguins) %>% kable(caption = "Dataset of automobiles") ``` @@ -82,42 +78,42 @@ head(x = mpg) %>% Table: Table 1: Dataset of automobiles -|manufacturer |model | displ| year| cyl|trans |drv | cty| hwy|fl |class | -|:------------|:-----|-----:|----:|---:|:----------|:---|---:|---:|:--|:-------| -|audi |a4 | 1.8| 1999| 4|auto(l5) |f | 18| 29|p |compact | -|audi |a4 | 1.8| 1999| 4|manual(m5) |f | 21| 29|p |compact | -|audi |a4 | 2.0| 2008| 4|manual(m6) |f | 20| 31|p |compact | -|audi |a4 | 2.0| 2008| 4|auto(av) |f | 21| 30|p |compact | -|audi |a4 | 2.8| 1999| 6|auto(l5) |f | 16| 26|p |compact | -|audi |a4 | 2.8| 1999| 6|manual(m5) |f | 18| 26|p |compact | +|species |island | bill_length_mm| bill_depth_mm| flipper_length_mm| body_mass_g|sex | year| +|:-------|:---------|--------------:|-------------:|-----------------:|-----------:|:------|----:| +|Adelie |Torgersen | 39.1| 18.7| 181| 3750|male | 2007| +|Adelie |Torgersen | 39.5| 17.4| 186| 3800|female | 2007| +|Adelie |Torgersen | 40.3| 18.0| 195| 3250|female | 2007| +|Adelie |Torgersen | NA| NA| NA| NA|NA | 2007| +|Adelie |Torgersen | 36.7| 19.3| 193| 3450|female | 2007| +|Adelie |Torgersen | 39.3| 
20.6| 190| 3650|male | 2007| -**Mapping** defines how the variables are applied to the plot. So if we were graphing information from `mpg`, we might map a car's engine displacement to the $x$ position and highway mileage to the $y$ position. +**Mapping** defines how the variables are applied to the plot. So if we were graphing information from `penguins`, we might map a penguin's flipper length to the $x$ position and body mass to the $y$ position. ```r -mpg %>% - select(displ, hwy) %>% +penguins %>% + select(flipper_length_mm, body_mass_g) %>% rename( - x = displ, - y = hwy + x = flipper_length_mm, + y = body_mass_g ) ``` ``` -## # A tibble: 234 x 2 +## # A tibble: 344 × 2 ## x y -## -## 1 1.8 29 -## 2 1.8 29 -## 3 2 31 -## 4 2 30 -## 5 2.8 26 -## 6 2.8 26 -## 7 3.1 27 -## 8 1.8 26 -## 9 1.8 25 -## 10 2 28 -## # … with 224 more rows +## +## 1 181 3750 +## 2 186 3800 +## 3 195 3250 +## 4 NA NA +## 5 193 3450 +## 6 190 3650 +## 7 181 3625 +## 8 195 4675 +## 9 193 3475 +## 10 190 4250 +## # … with 334 more rows ``` ## Statistical transformation @@ -128,43 +124,42 @@ A stat is a function that takes in a dataset as the input and returns a dataset ```r -mpg %>% - select(cyl) +penguins %>% + select(island) ``` ``` -## # A tibble: 234 x 1 -## cyl -## -## 1 4 -## 2 4 -## 3 4 -## 4 4 -## 5 6 -## 6 6 -## 7 6 -## 8 4 -## 9 4 -## 10 4 -## # … with 224 more rows +## # A tibble: 344 × 1 +## island +## +## 1 Torgersen +## 2 Torgersen +## 3 Torgersen +## 4 Torgersen +## 5 Torgersen +## 6 Torgersen +## 7 Torgersen +## 8 Torgersen +## 9 Torgersen +## 10 Torgersen +## # … with 334 more rows ``` You would transform it to: ```r -mpg %>% - count(cyl) +penguins %>% + count(island) ``` ``` -## # A tibble: 4 x 2 -## cyl n -## -## 1 4 81 -## 2 5 4 -## 3 6 79 -## 4 8 70 +## # A tibble: 3 × 2 +## island n +## +## 1 Biscoe 168 +## 2 Dream 124 +## 3 Torgersen 52 ``` {{% callout note %}} @@ -185,7 +180,7 @@ Each geom can only display certain **aesthetics** or visual attributes of the ge ```r 
-ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) + geom_point() + ggtitle("A point geom with position and color aesthetics") ``` @@ -193,13 +188,13 @@ ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + * Position defines where each point is drawn on the plot -* Color defines the color of each point. Here the color is determined by the class of the car (observation) +* Color defines the color of each point. Here the color is determined by the species of the penguin (observation) Whereas a bar geom has position, height, width, and fill color. ```r -ggplot(data = mpg, aes(x = cyl)) + +ggplot(data = penguins, aes(x = island)) + geom_bar() + ggtitle("A bar geom with position and height aesthetics") ``` @@ -207,7 +202,7 @@ ggplot(data = mpg, aes(x = cyl)) + * Position determines the starting location (origin) of each bar -* Height determines how tall to draw the bar. Here the height is based on the number of observations in the dataset for each possible number of cylinders. +* Height determines how tall to draw the bar. Here the height is based on the number of observations in the dataset for each island. 
## Position adjustment @@ -215,8 +210,8 @@ Sometimes with dense data we need to adjust the position of elements on the plot ```r -count(x = mpg, class, cyl) %>% - ggplot(mapping = aes(x = cyl, y = n, fill = class)) + +count(x = penguins, species, island) %>% + ggplot(mapping = aes(x = island, y = n, fill = species)) + geom_bar(stat = "identity") + ggtitle(label = "A stacked bar chart") ``` @@ -224,8 +219,8 @@ count(x = mpg, class, cyl) %>% ```r -count(x = mpg, class, cyl) %>% - ggplot(mapping = aes(x = cyl, y = n, fill = class)) + +count(x = penguins, species, island) %>% + ggplot(mapping = aes(x = island, y = n, fill = species)) + geom_bar(stat = "identity", position = "dodge") + ggtitle(label = "A dodged bar chart") ``` @@ -236,7 +231,7 @@ Sometimes scatterplots with few unique $x$ and $y$ values are **jittered** (rand ```r -ggplot(data = mpg, mapping = aes(x = cyl, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = island, y = body_mass_g)) + geom_point() + ggtitle("A point geom with obscured data points") ``` @@ -244,7 +239,7 @@ ggplot(data = mpg, mapping = aes(x = cyl, y = hwy)) + ```r -ggplot(data = mpg, mapping = aes(x = cyl, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = island, y = body_mass_g)) + geom_jitter() + ggtitle("A point geom with jittered data points") ``` @@ -257,18 +252,18 @@ A **scale** controls how data is mapped to aesthetic attributes, so we need one ```r -ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) + geom_point() + guides(color = guide_legend(override.aes = list(size = 4))) ``` -Note that the scale is consistent - every point for a compact car is drawn in tan, whereas SUVs are drawn in pink. 
The scale can be changed to use a different color palette: +Note that the scale is consistent - every point for an Adélie penguin is drawn in red, whereas Chinstrap penguins are drawn in green. The scale can be changed to use a different color palette: ```r -ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) + geom_point() + guides(color = guide_legend(override.aes = list(size = 4))) + scale_color_brewer(palette = "Dark2") @@ -276,7 +271,7 @@ ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) + -Now we are using a different palette, but the scale is still consistent: all compact cars utilize the same color, whereas SUVs use a new color **but each SUV still uses the same, consistent color**. +Now we are using a different palette, but the scale is still consistent: all Adélie penguins utilize the same color, whereas Chinstrap penguins use a new color **but each Chinstrap penguin still uses the same, consistent color**. ## Coordinate system @@ -322,9 +317,9 @@ p + ```r -ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) + geom_point() + - facet_wrap(facets = vars(class)) + facet_wrap(facets = vars(species)) ``` @@ -333,13 +328,13 @@ ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + Rather than explicitly declaring each component of a layered graphic (which will use more code and introduces opportunities for errors), we can establish intelligent defaults for specific geoms and scales. For instance, whenever we want to use a bar geom, we can default to using a stat that counts the number of observations in each group of our variable in the $x$ position. -Consider the following scenario: you wish to generate a scatterplot visualizing the relationship between engine displacement size and highway fuel efficiency. 
With no defaults, the code to generate this graph is: +Consider the following scenario: you wish to generate a scatterplot visualizing the relationship between flipper length and body mass. With no defaults, the code to generate this graph is: ```r ggplot() + layer( - data = mpg, mapping = aes(x = displ, y = hwy), + data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g), geom = "point", stat = "identity", position = "identity" ) + scale_x_continuous() + @@ -353,8 +348,8 @@ The above code: * Creates a new plot object (`ggplot`) * Adds a layer (`layer`) - * Specifies the data (`mpg`) - * Maps engine displacement to the $x$ position and highway mileage to the $y$ position (`mapping`) + * Specifies the data (`penguins`) + * Maps flipper length to the $x$ position and body mass to the $y$ position (`mapping`) * Uses the point geometric transformation (`geom = "point"`) * Implements an identity transformation and position (`stat = "identity"` and `position = "identity"`) * Establishes two continuous position scales (`scale_x_continuous` and `scale_y_continuous`) @@ -374,7 +369,7 @@ Using these defaults, we can rewrite the above code as: ```r ggplot() + - geom_point(data = mpg, mapping = aes(x = displ, y = hwy)) + geom_point(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) ``` @@ -383,7 +378,7 @@ This generates the exact same plot, but uses fewer lines of code. 
Because multip ```r -ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) + +ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) + geom_point() ``` @@ -393,7 +388,7 @@ And as we will learn, function arguments in R use specific ordering, so we can o ```r -ggplot(mpg, aes(displ, hwy)) + +ggplot(penguins, aes(flipper_length_mm, body_mass_g)) + geom_point() ``` @@ -403,30 +398,22 @@ With this specification, it is easy to build the graphic up with additional laye ```r -ggplot(mpg, aes(displ, hwy)) + +ggplot(penguins, aes(flipper_length_mm, body_mass_g)) + geom_point() + geom_smooth() ``` -``` -## `geom_smooth()` using method = 'loess' and formula 'y ~ x' -``` - -Because we called `aes(displ, hwy)` within the `ggplot()` function, it is automatically passed along to both `geom_point()` and `geom_smooth()`. If we fail to do this, we get an error: +Because we called `aes(flipper_length_mm, body_mass_g)` within the `ggplot()` function, it is automatically passed along to both `geom_point()` and `geom_smooth()`. 
If we fail to do this, we get an error: ```r -ggplot(mpg) + - geom_point(aes(displ, hwy)) + +ggplot(penguins) + + geom_point(aes(flipper_length_mm, body_mass_g)) + geom_smooth() ``` -``` -## `geom_smooth()` using method = 'loess' and formula 'y ~ x' -``` - ``` ## Error: stat_smooth requires the following missing aesthetics: x and y ``` @@ -452,93 +439,99 @@ devtools::session_info() ## collate en_US.UTF-8 ## ctype en_US.UTF-8 ## tz America/Chicago -## date 2021-09-01 +## date 2022-01-06 ## ## ─ Packages ─────────────────────────────────────────────────────────────────── -## package * version date lib source -## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) -## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) -## blogdown 1.4 2021-07-23 [1] CRAN (R 4.1.0) -## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.0) -## broom 0.7.9 2021-07-27 [1] CRAN (R 4.1.0) -## bslib 0.2.5.1 2021-05-18 [1] CRAN (R 4.1.0) -## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) -## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0) -## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) -## cli 3.0.1 2021-07-17 [1] CRAN (R 4.1.0) -## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.0) -## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) -## crayon 1.4.1 2021-02-08 [1] CRAN (R 4.1.0) -## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) -## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) -## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0) -## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0) -## digest 0.6.27 2020-10-24 [1] CRAN (R 4.1.0) -## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) -## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) -## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) -## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) -## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0) -## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) -## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) -## fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) -## generics 0.1.0 2020-10-31 [1] CRAN (R 4.1.0) -## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) -## glue 1.4.2 2020-08-27 [1] CRAN (R 4.1.0) 
-## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) -## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0) -## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.0) -## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) -## hms 1.1.0 2021-05-17 [1] CRAN (R 4.1.0) -## htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0) -## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) -## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0) -## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) -## knitr * 1.33 2021-04-24 [1] CRAN (R 4.1.0) -## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0) -## lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.1.0) -## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) -## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) -## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) -## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) -## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) -## pillar 1.6.2 2021-07-29 [1] CRAN (R 4.1.0) -## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0) -## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) -## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0) -## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) -## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0) -## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0) -## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) -## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) -## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) -## readr * 2.0.1 2021-08-10 [1] CRAN (R 4.1.0) -## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) -## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0) -## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) -## rlang 0.4.11 2021-04-30 [1] CRAN (R 4.1.0) -## rmarkdown 2.10 2021-08-06 [1] CRAN (R 4.1.0) -## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0) -## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) -## rvest 1.0.1 2021-07-26 [1] CRAN (R 4.1.0) -## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0) -## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) -## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) -## stringi 1.7.3 2021-07-16 [1] CRAN (R 4.1.0) -## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) -## testthat 3.0.4 2021-07-01 [1] CRAN 
(R 4.1.0) -## tibble * 3.1.3 2021-07-23 [1] CRAN (R 4.1.0) -## tidyr * 1.1.3 2021-03-03 [1] CRAN (R 4.1.0) -## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) -## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0) -## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0) -## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0) -## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) -## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) -## withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) -## xfun 0.25 2021-08-06 [1] CRAN (R 4.1.0) -## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) -## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) +## package * version date lib source +## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) +## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) +## blogdown 1.7 2021-12-19 [1] CRAN (R 4.1.0) +## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.0) +## broom 0.7.9 2021-07-27 [1] CRAN (R 4.1.0) +## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.0) +## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) +## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0) +## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) +## cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.0) +## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.0) +## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) +## crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.0) +## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) +## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) +## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0) +## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0) +## digest 0.6.28 2021-09-23 [1] CRAN (R 4.1.0) +## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) +## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) +## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) +## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) +## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0) +## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) +## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) +## fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) +## generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.0) +## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) +## glue 1.5.0 2021-11-07 [1] CRAN (R 4.1.0) +## 
gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) +## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0) +## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.0) +## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) +## hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0) +## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0) +## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) +## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0) +## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) +## knitr * 1.33 2021-04-24 [1] CRAN (R 4.1.0) +## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0) +## lattice 0.20-44 2021-05-02 [1] CRAN (R 4.1.0) +## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0) +## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) +## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) +## Matrix 1.3-4 2021-06-01 [1] CRAN (R 4.1.0) +## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) +## mgcv 1.8-36 2021-06-01 [1] CRAN (R 4.1.0) +## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) +## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) +## nlme 3.1-152 2021-02-04 [1] CRAN (R 4.1.0) +## palmerpenguins * 0.1.0 2020-07-23 [1] CRAN (R 4.1.0) +## pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0) +## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0) +## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) +## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0) +## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) +## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0) +## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0) +## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) +## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) +## RColorBrewer 1.1-2 2014-12-07 [1] CRAN (R 4.1.0) +## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) +## readr * 2.0.2 2021-09-27 [1] CRAN (R 4.1.0) +## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) +## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0) +## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) +## rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0) +## rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.0) +## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0) +## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) +## rvest 1.0.1 2021-07-26 [1] 
CRAN (R 4.1.0) +## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0) +## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) +## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) +## stringi 1.7.5 2021-10-04 [1] CRAN (R 4.1.0) +## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) +## testthat 3.0.4 2021-07-01 [1] CRAN (R 4.1.0) +## tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.1.0) +## tidyr * 1.1.4 2021-09-27 [1] CRAN (R 4.1.0) +## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) +## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0) +## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0) +## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0) +## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) +## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) +## withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) +## xfun 0.29 2021-12-14 [1] CRAN (R 4.1.0) +## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) +## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) ## ## [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library ``` diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/coord_cart-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/coord_cart-1.png index 5e9225c2..1069f3d0 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/coord_cart-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/coord_cart-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/coord_polar-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/coord_polar-1.png index f24cc138..6644d70c 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/coord_polar-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/coord_polar-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/coord_semi_log-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/coord_semi_log-1.png index 1927ef0d..39221b89 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/coord_semi_log-1.png 
and b/content/notes/grammar-of-graphics/index_files/figure-html/coord_semi_log-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/default-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/default-1.png index 05faa7f0..6f355c94 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/default-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/default-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/default2-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/default2-1.png index 05faa7f0..6f355c94 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/default2-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/default2-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/default3-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/default3-1.png index 05faa7f0..6f355c94 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/default3-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/default3-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/default4-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/default4-1.png index 05faa7f0..6f355c94 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/default4-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/default4-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/default5-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/default5-1.png index 8c353f03..a85ec13f 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/default5-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/default5-1.png differ diff --git 
a/content/notes/grammar-of-graphics/index_files/figure-html/default6-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/default6-1.png index 0080d22c..17881881 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/default6-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/default6-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/facet-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/facet-1.png index bb954dec..d0e92dab 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/facet-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/facet-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/geom_bar-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/geom_bar-1.png index 560f2de9..3333c81c 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/geom_bar-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/geom_bar-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/geom_point-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/geom_point-1.png index 5124f4b1..8e7efa15 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/geom_point-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/geom_point-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/position-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/position-1.png index 0f89c7c0..0616de89 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/position-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/position-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/position-2.png b/content/notes/grammar-of-graphics/index_files/figure-html/position-2.png index 
6dc91e0d..aa3be4cc 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/position-2.png and b/content/notes/grammar-of-graphics/index_files/figure-html/position-2.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-1.png index 0e48e26c..c4cf5aee 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-2.png b/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-2.png index 2db8accd..8b882866 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-2.png and b/content/notes/grammar-of-graphics/index_files/figure-html/position_dodge-2.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/scale_color-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/scale_color-1.png index 74f07a9f..d8a16902 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/scale_color-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/scale_color-1.png differ diff --git a/content/notes/grammar-of-graphics/index_files/figure-html/scale_color_palette-1.png b/content/notes/grammar-of-graphics/index_files/figure-html/scale_color_palette-1.png index 1eda362a..7e736192 100644 Binary files a/content/notes/grammar-of-graphics/index_files/figure-html/scale_color_palette-1.png and b/content/notes/grammar-of-graphics/index_files/figure-html/scale_color_palette-1.png differ diff --git a/content/notes/intro-to-course/index.Rmd b/content/notes/intro-to-course/index.Rmd index 3ce44331..63e97ec5 100644 --- a/content/notes/intro-to-course/index.Rmd +++ b/content/notes/intro-to-course/index.Rmd @@ 
-62,17 +62,18 @@ print("Hello world") One line of code, and it performs a very specific task (print the phrase "Hello world" to the screen). -More typically, your programs will perform statistical and graphical analysis on data of a variety of forms. For example, here I analyze a dataset of automobiles to assess the relationship between engine displacement and highway mileage: +More typically, your programs will perform statistical and graphical analysis on data of a variety of forms. For example, here I analyze a dataset of [adult foraging penguins](https://allisonhorst.github.io/palmerpenguins/) to assess the relationship between flipper length and body mass: -```{r auto-example} +```{r penguins-example, warning = FALSE} # load packages library(tidyverse) +library(palmerpenguins) library(broom) # estimate and print the linear model -lm(hwy ~ displ, data = mpg) %>% +lm(body_mass_g ~ flipper_length_mm, data = penguins) %>% tidy() %>% - mutate(term = c("Intercept", "Engine displacement (in liters)")) %>% + mutate(term = c("Intercept", "Flipper length (millimeters)")) %>% knitr::kable( digits = 2, col.names = c( @@ -82,13 +83,13 @@ lm(hwy ~ displ, data = mpg) %>% ) # visualize the relationship -ggplot(data = mpg, aes(displ, hwy)) + - geom_point(aes(color = class)) + +ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g)) + + geom_point(mapping = aes(color = species)) + geom_smooth(method = "lm", se = FALSE, color = "black", alpha = .25) + labs( - x = "Engine displacement (in liters)", - y = "Highway miles per gallon", - color = "Car type" + x = "Flipper length (in millimeters)", + y = "Body mass (in grams)", + color = "Species" ) + theme_bw(base_size = 16) ``` diff --git a/content/notes/intro-to-course/index.md b/content/notes/intro-to-course/index.md index 2f0f8a5e..1e9ca8a0 100644 --- a/content/notes/intro-to-course/index.md +++ b/content/notes/intro-to-course/index.md @@ -65,18 +65,19 @@ print("Hello world") One line of code, and it performs a very 
specific task (print the phrase "Hello world" to the screen). -More typically, your programs will perform statistical and graphical analysis on data of a variety of forms. For example, here I analyze a dataset of automobiles to assess the relationship between engine displacement and highway mileage: +More typically, your programs will perform statistical and graphical analysis on data of a variety of forms. For example, here I analyze a dataset of [adult foraging penguins](https://allisonhorst.github.io/palmerpenguins/) to assess the relationship between flipper length and body mass: ```r # load packages library(tidyverse) +library(palmerpenguins) library(broom) # estimate and print the linear model -lm(hwy ~ displ, data = mpg) %>% +lm(body_mass_g ~ flipper_length_mm, data = penguins) %>% tidy() %>% - mutate(term = c("Intercept", "Engine displacement (in liters)")) %>% + mutate(term = c("Intercept", "Flipper length (millimeters)")) %>% knitr::kable( digits = 2, col.names = c( @@ -88,20 +89,20 @@ lm(hwy ~ displ, data = mpg) %>% -|Variable | Estimate| Standard Error| T-statistic| P-Value| -|:-------------------------------|--------:|--------------:|-----------:|-------:| -|Intercept | 35.70| 0.72| 49.55| 0| -|Engine displacement (in liters) | -3.53| 0.19| -18.15| 0| +|Variable | Estimate| Standard Error| T-statistic| P-Value| +|:----------------------------|--------:|--------------:|-----------:|-------:| +|Intercept | -5780.83| 305.81| -18.90| 0| +|Flipper length (millimeters) | 49.69| 1.52| 32.72| 0| ```r # visualize the relationship -ggplot(data = mpg, aes(displ, hwy)) + - geom_point(aes(color = class)) + +ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g)) + + geom_point(mapping = aes(color = species)) + geom_smooth(method = "lm", se = FALSE, color = "black", alpha = .25) + labs( - x = "Engine displacement (in liters)", - y = "Highway miles per gallon", - color = "Car type" + x = "Flipper length (in millimeters)", + y = "Body mass (in grams)", + 
color = "Species" ) + theme_bw(base_size = 16) ``` @@ -110,7 +111,7 @@ ggplot(data = mpg, aes(displ, hwy)) + ## `geom_smooth()` using formula 'y ~ x' ``` - + But we will start small to build our way up to there. @@ -380,97 +381,102 @@ devtools::session_info() ## collate en_US.UTF-8 ## ctype en_US.UTF-8 ## tz America/Chicago -## date 2021-09-26 +## date 2022-01-06 ## ## ─ Packages ─────────────────────────────────────────────────────────────────── -## package * version date lib source -## askpass 1.1 2019-01-13 [1] CRAN (R 4.1.0) -## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) -## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) -## blogdown 1.4 2021-07-23 [1] CRAN (R 4.1.0) -## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.0) -## broom * 0.7.9 2021-07-27 [1] CRAN (R 4.1.0) -## bslib 0.2.5.1 2021-05-18 [1] CRAN (R 4.1.0) -## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) -## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0) -## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) -## cli 3.0.1 2021-07-17 [1] CRAN (R 4.1.0) -## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.0) -## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) -## crayon 1.4.1 2021-02-08 [1] CRAN (R 4.1.0) -## curl 4.3.2 2021-06-23 [1] CRAN (R 4.1.0) -## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) -## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) -## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0) -## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0) -## digest 0.6.27 2020-10-24 [1] CRAN (R 4.1.0) -## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) -## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) -## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) -## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) -## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0) -## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) -## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) -## fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) -## generics 0.1.0 2020-10-31 [1] CRAN (R 4.1.0) -## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) -## glue 1.4.2 2020-08-27 [1] CRAN (R 4.1.0) -## gtable 0.3.0 2019-03-25 [1] 
CRAN (R 4.1.0) -## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0) -## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.0) -## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) -## hms 1.1.0 2021-05-17 [1] CRAN (R 4.1.0) -## htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0) -## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) -## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0) -## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) -## knitr 1.33 2021-04-24 [1] CRAN (R 4.1.0) -## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0) -## lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.1.0) -## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) -## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) -## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) -## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) -## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) -## openssl 1.4.4 2021-04-30 [1] CRAN (R 4.1.0) -## pillar 1.6.2 2021-07-29 [1] CRAN (R 4.1.0) -## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0) -## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) -## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0) -## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) -## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0) -## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0) -## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) -## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) -## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) -## readr * 2.0.1 2021-08-10 [1] CRAN (R 4.1.0) -## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) -## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0) -## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) -## rlang 0.4.11 2021-04-30 [1] CRAN (R 4.1.0) -## rmarkdown 2.10 2021-08-06 [1] CRAN (R 4.1.0) -## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0) -## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) -## rtweet * 0.7.0 2020-01-08 [1] CRAN (R 4.1.0) -## rvest 1.0.1 2021-07-26 [1] CRAN (R 4.1.0) -## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0) -## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) -## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) -## stringi 1.7.3 2021-07-16 [1] CRAN (R 4.1.0) -## stringr * 1.4.0 
2019-02-10 [1] CRAN (R 4.1.0) -## testthat 3.0.4 2021-07-01 [1] CRAN (R 4.1.0) -## tibble * 3.1.3 2021-07-23 [1] CRAN (R 4.1.0) -## tidyr * 1.1.3 2021-03-03 [1] CRAN (R 4.1.0) -## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) -## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0) -## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0) -## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0) -## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) -## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) -## withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) -## xfun 0.25 2021-08-06 [1] CRAN (R 4.1.0) -## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) -## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) +## package * version date lib source +## askpass 1.1 2019-01-13 [1] CRAN (R 4.1.0) +## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) +## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) +## blogdown 1.7 2021-12-19 [1] CRAN (R 4.1.0) +## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.0) +## broom * 0.7.9 2021-07-27 [1] CRAN (R 4.1.0) +## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.0) +## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) +## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0) +## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) +## cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.0) +## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.0) +## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) +## crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.0) +## curl 4.3.2 2021-06-23 [1] CRAN (R 4.1.0) +## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) +## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) +## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0) +## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0) +## digest 0.6.28 2021-09-23 [1] CRAN (R 4.1.0) +## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) +## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) +## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) +## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) +## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0) +## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) +## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) +## fs 1.5.0 2020-07-31 [1] 
CRAN (R 4.1.0) +## generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.0) +## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) +## glue 1.5.0 2021-11-07 [1] CRAN (R 4.1.0) +## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) +## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0) +## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.0) +## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) +## hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0) +## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0) +## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) +## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0) +## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) +## knitr 1.33 2021-04-24 [1] CRAN (R 4.1.0) +## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0) +## lattice 0.20-44 2021-05-02 [1] CRAN (R 4.1.0) +## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0) +## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) +## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) +## Matrix 1.3-4 2021-06-01 [1] CRAN (R 4.1.0) +## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) +## mgcv 1.8-36 2021-06-01 [1] CRAN (R 4.1.0) +## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) +## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) +## nlme 3.1-152 2021-02-04 [1] CRAN (R 4.1.0) +## openssl 1.4.4 2021-04-30 [1] CRAN (R 4.1.0) +## palmerpenguins * 0.1.0 2020-07-23 [1] CRAN (R 4.1.0) +## pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0) +## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0) +## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) +## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0) +## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) +## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0) +## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0) +## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) +## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) +## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) +## readr * 2.0.2 2021-09-27 [1] CRAN (R 4.1.0) +## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) +## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0) +## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) +## rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0) +## rmarkdown 2.11 
2021-09-14 [1] CRAN (R 4.1.0) +## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0) +## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) +## rtweet * 0.7.0 2020-01-08 [1] CRAN (R 4.1.0) +## rvest 1.0.1 2021-07-26 [1] CRAN (R 4.1.0) +## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0) +## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) +## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) +## stringi 1.7.5 2021-10-04 [1] CRAN (R 4.1.0) +## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) +## testthat 3.0.4 2021-07-01 [1] CRAN (R 4.1.0) +## tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.1.0) +## tidyr * 1.1.4 2021-09-27 [1] CRAN (R 4.1.0) +## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) +## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0) +## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0) +## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0) +## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) +## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) +## withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) +## xfun 0.29 2021-12-14 [1] CRAN (R 4.1.0) +## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) +## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) ## ## [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library ``` diff --git a/content/notes/intro-to-course/index_files/figure-html/penguins-example-1.png b/content/notes/intro-to-course/index_files/figure-html/penguins-example-1.png new file mode 100644 index 00000000..2ebfe280 Binary files /dev/null and b/content/notes/intro-to-course/index_files/figure-html/penguins-example-1.png differ diff --git a/content/notes/intro-to-course/index_files/figure-html/sesame-good-1.png b/content/notes/intro-to-course/index_files/figure-html/sesame-good-1.png index f67f464c..5475337e 100644 Binary files a/content/notes/intro-to-course/index_files/figure-html/sesame-good-1.png and b/content/notes/intro-to-course/index_files/figure-html/sesame-good-1.png differ diff --git a/content/notes/iteration/index.Rmd b/content/notes/iteration/index.Rmd index 9c8873f0..31ab22fd 100644 --- 
a/content/notes/iteration/index.Rmd +++ b/content/notes/iteration/index.Rmd @@ -21,6 +21,8 @@ knitr::opts_chunk$set(cache = TRUE) ```{r packages, cache = FALSE, message = FALSE} library(tidyverse) library(rcfss) +library(palmerpenguins) + set.seed(1234) theme_set(theme_minimal()) ``` @@ -106,24 +108,24 @@ This is the code that actually performs the desired calculations. It runs multip If you don't preallocate space for the output, each time the `for` loop iterates, it makes a copy of the output and appends the new value at the end. Copying data takes time and memory. If the output is preallocated space, the loop simply fills in the slots with the correct values. -Consider the following task: duplicate the data frame `mpg` 100 times and bind them together into a single data frame. We can accomplish the latter task using `bind_rows()`, and use a `for` loop to create 100 copies of `mpg`. What is the difference if we preallocate space for the output as opposed to just copying and extending the data frame each time? +Consider the following task: duplicate the data frame `palmerpenguins::penguins` 100 times and bind them together into a single data frame. We can accomplish the latter task using `bind_rows()`, and use a `for` loop to create 100 copies of `penguins`. What is the difference if we preallocate space for the output as opposed to just copying and extending the data frame each time? 
```r # no preallocation -mpg_no_preall <- tibble() +penguins_no_preall <- tibble() for(i in 1:100){ - mpg_no_preall <- bind_rows(mpg_no_preall, mpg) + penguins_no_preall <- bind_rows(penguins_no_preall, penguins) } # with preallocation using a list -mpg_preall <- vector(mode = "list", length = 100) +penguins_preall <- vector(mode = "list", length = 100) for(i in 1:100){ - mpg_preall[[i]] <- mpg + penguins_preall[[i]] <- penguins } -mpg_preall <- bind_rows(mpg_preall) +penguins_preall <- bind_rows(penguins_preall) ``` Let's compare the time it takes to complete each of these loops by replicating each example 100 times and measuring how long it takes for the expression to evaluate. @@ -131,30 +133,30 @@ Let's compare the time it takes to complete each of these loops by replicating e ```{r preallocate, echo = FALSE, message = FALSE} library(microbenchmark) -# bind together 100 copies of mpg +# bind together 100 copies of penguins times <- microbenchmark( `No preallocation` = { - mpg_no_preall <- tibble() + penguins_no_preall <- tibble() for (i in 1:100) { - mpg_no_preall <- bind_rows(mpg_no_preall, mpg) + penguins_no_preall <- bind_rows(penguins_no_preall, penguins) } }, `Preallocation` = { - mpg_preall <- vector(mode = "list", length = 100) + penguins_preall <- vector(mode = "list", length = 100) for (i in 1:100) { - mpg_preall[[i]] <- mpg + penguins_preall[[i]] <- penguins } - mpg_preall <- bind_rows(mpg_preall) + penguins_preall <- bind_rows(penguins_preall) } ) autoplot(times) ``` -Here, preallocating space for each data frame prior to binding together cuts the computation time by a factor of 30. +Here, preallocating space for each data frame prior to binding together cuts the computation time by a factor of 10. 
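The same contrast can be sketched in base R without any package dependencies. This is a minimal illustration rather than the benchmark used above: the data frame `df` is a small stand-in for `penguins`, the helper names `grow()` and `prealloc()` are hypothetical, and `rbind()` is swapped in for `bind_rows()`. Timings will vary by machine, so no specific speedup is claimed.

```r
# small stand-in data frame (an assumption; any data frame would do)
df <- data.frame(id = 1:50, value = seq(0.5, 25, by = 0.5))
n_copies <- 100

# no preallocation: every iteration copies all rows bound so far
grow <- function(df, n) {
  out <- df[0, ]
  for (i in seq_len(n)) {
    out <- rbind(out, df)
  }
  out
}

# preallocation: fill the slots of a list, then bind once at the end
prealloc <- function(df, n) {
  pieces <- vector(mode = "list", length = n)
  for (i in seq_len(n)) {
    pieces[[i]] <- df
  }
  do.call(rbind, pieces)
}

grown <- grow(df, n_copies)
preallocated <- prealloc(df, n_copies)

# both approaches yield the same number of rows
nrow(grown) == nrow(preallocated)

# rough timing of each approach; exact numbers depend on the machine
system.time(grow(df, n_copies))
system.time(prealloc(df, n_copies))
```

The preallocated version only touches one list slot per iteration and defers the expensive bind to a single call, which is why it tends to scale better as the number of copies grows.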
## Exercise: write a `for` loop diff --git a/content/notes/iteration/index.md b/content/notes/iteration/index.md index 999308b9..b55c1209 100644 --- a/content/notes/iteration/index.md +++ b/content/notes/iteration/index.md @@ -20,6 +20,8 @@ menu: ```r library(tidyverse) library(rcfss) +library(palmerpenguins) + set.seed(1234) theme_set(theme_minimal()) ``` @@ -178,31 +180,31 @@ This is the code that actually performs the desired calculations. It runs multip If you don't preallocate space for the output, each time the `for` loop iterates, it makes a copy of the output and appends the new value at the end. Copying data takes time and memory. If the output is preallocated space, the loop simply fills in the slots with the correct values. -Consider the following task: duplicate the data frame `mpg` 100 times and bind them together into a single data frame. We can accomplish the latter task using `bind_rows()`, and use a `for` loop to create 100 copies of `mpg`. What is the difference if we preallocate space for the output as opposed to just copying and extending the data frame each time? +Consider the following task: duplicate the data frame `palmerpenguins::penguins` 100 times and bind them together into a single data frame. We can accomplish the latter task using `bind_rows()`, and use a `for` loop to create 100 copies of `penguins`. What is the difference if we preallocate space for the output as opposed to just copying and extending the data frame each time? 
```r # no preallocation -mpg_no_preall <- tibble() +penguins_no_preall <- tibble() for(i in 1:100){ - mpg_no_preall <- bind_rows(mpg_no_preall, mpg) + penguins_no_preall <- bind_rows(penguins_no_preall, penguins) } # with preallocation using a list -mpg_preall <- vector(mode = "list", length = 100) +penguins_preall <- vector(mode = "list", length = 100) for(i in 1:100){ - mpg_preall[[i]] <- mpg + penguins_preall[[i]] <- penguins } -mpg_preall <- bind_rows(mpg_preall) +penguins_preall <- bind_rows(penguins_preall) ``` Let's compare the time it takes to complete each of these loops by replicating each example 100 times and measuring how long it takes for the expression to evaluate. -Here, preallocating space for each data frame prior to binding together cuts the computation time by a factor of 30. +Here, preallocating space for each data frame prior to binding together cuts the computation time by a factor of 10. ## Exercise: write a `for` loop @@ -891,7 +893,7 @@ devtools::session_info() ``` ## ─ Session info ─────────────────────────────────────────────────────────────── ## setting value -## version R version 4.0.4 (2021-02-15) +## version R version 4.1.0 (2021-05-18) ## os macOS Big Sur 10.16 ## system x86_64, darwin17.0 ## ui X11 @@ -899,89 +901,95 @@ devtools::session_info() ## collate en_US.UTF-8 ## ctype en_US.UTF-8 ## tz America/Chicago -## date 2021-05-25 +## date 2022-01-06 ## ## ─ Packages ─────────────────────────────────────────────────────────────────── -## package * version date lib source -## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0) -## backports 1.2.1 2020-12-09 [1] CRAN (R 4.0.2) -## blogdown 1.3 2021-04-14 [1] CRAN (R 4.0.2) -## bookdown 0.22 2021-04-22 [1] CRAN (R 4.0.2) -## broom 0.7.6 2021-04-05 [1] CRAN (R 4.0.4) -## bslib 0.2.5 2021-05-12 [1] CRAN (R 4.0.4) -## cachem 1.0.5 2021-05-15 [1] CRAN (R 4.0.2) -## callr 3.7.0 2021-04-20 [1] CRAN (R 4.0.2) -## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.0.0) -## cli 2.5.0 2021-04-26 [1] CRAN (R 
4.0.2) -## colorspace 2.0-1 2021-05-04 [1] CRAN (R 4.0.2) -## crayon 1.4.1 2021-02-08 [1] CRAN (R 4.0.2) -## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.0.2) -## dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.0.4) -## desc 1.3.0 2021-03-05 [1] CRAN (R 4.0.2) -## devtools 2.4.1 2021-05-05 [1] CRAN (R 4.0.2) -## digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.2) -## dplyr * 1.0.6 2021-05-05 [1] CRAN (R 4.0.2) -## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.0.2) -## evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0) -## fansi 0.4.2 2021-01-15 [1] CRAN (R 4.0.2) -## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.0.2) -## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.0.2) -## fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) -## generics 0.1.0 2020-10-31 [1] CRAN (R 4.0.2) -## ggplot2 * 3.3.3 2020-12-30 [1] CRAN (R 4.0.2) -## glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) -## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.0) -## haven 2.4.1 2021-04-23 [1] CRAN (R 4.0.2) -## here 1.0.1 2020-12-13 [1] CRAN (R 4.0.2) -## hms 1.1.0 2021-05-17 [1] CRAN (R 4.0.4) -## htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2) -## httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2) -## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.0.2) -## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.0.2) -## knitr 1.33 2021-04-24 [1] CRAN (R 4.0.2) -## lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.0.2) -## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.0.2) -## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.2) -## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.0.2) -## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.0.0) -## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.0) -## pillar 1.6.1 2021-05-16 [1] CRAN (R 4.0.4) -## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.0.2) -## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0) -## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.0.2) -## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0) -## processx 3.5.2 2021-04-30 [1] CRAN (R 4.0.2) -## ps 1.6.0 2021-02-28 [1] CRAN (R 4.0.2) -## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.0.0) -## R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.2) -## rcfss * 0.2.1 2020-12-08 [1] local 
-## Rcpp 1.0.6 2021-01-15 [1] CRAN (R 4.0.2) -## readr * 1.4.0 2020-10-05 [1] CRAN (R 4.0.2) -## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.0.0) -## remotes 2.3.0 2021-04-01 [1] CRAN (R 4.0.2) -## reprex 2.0.0 2021-04-02 [1] CRAN (R 4.0.2) -## rlang 0.4.11 2021-04-30 [1] CRAN (R 4.0.2) -## rmarkdown 2.8 2021-05-07 [1] CRAN (R 4.0.2) -## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.0.2) -## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.0.2) -## rvest 1.0.0 2021-03-09 [1] CRAN (R 4.0.2) -## sass 0.4.0 2021-05-12 [1] CRAN (R 4.0.2) -## scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.0) -## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0) -## stringi 1.6.1 2021-05-10 [1] CRAN (R 4.0.2) -## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.0.0) -## testthat 3.0.2 2021-02-14 [1] CRAN (R 4.0.2) -## tibble * 3.1.1 2021-04-18 [1] CRAN (R 4.0.2) -## tidyr * 1.1.3 2021-03-03 [1] CRAN (R 4.0.2) -## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.0.2) -## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.0.2) -## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.0.2) -## utf8 1.2.1 2021-03-12 [1] CRAN (R 4.0.2) -## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.0.2) -## withr 2.4.2 2021-04-18 [1] CRAN (R 4.0.2) -## xfun 0.23 2021-05-15 [1] CRAN (R 4.0.2) -## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.0) -## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0) +## package * version date lib source +## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) +## backports 1.2.1 2020-12-09 [1] CRAN (R 4.1.0) +## blogdown 1.7 2021-12-19 [1] CRAN (R 4.1.0) +## bookdown 0.23 2021-08-13 [1] CRAN (R 4.1.0) +## broom 0.7.9 2021-07-27 [1] CRAN (R 4.1.0) +## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.0) +## cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) +## callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0) +## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0) +## cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.0) +## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.1.0) +## colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) +## crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.0) +## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) +## 
dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) +## desc 1.3.0 2021-03-05 [1] CRAN (R 4.1.0) +## devtools 2.4.2 2021-06-07 [1] CRAN (R 4.1.0) +## digest 0.6.28 2021-09-23 [1] CRAN (R 4.1.0) +## dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) +## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) +## evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) +## fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) +## farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0) +## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) +## forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0) +## fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) +## generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.0) +## ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) +## glue 1.5.0 2021-11-07 [1] CRAN (R 4.1.0) +## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) +## haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0) +## here 1.0.1 2020-12-13 [1] CRAN (R 4.1.0) +## highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) +## hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0) +## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0) +## httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) +## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0) +## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) +## knitr 1.33 2021-04-24 [1] CRAN (R 4.1.0) +## lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0) +## lubridate 1.7.10 2021-02-26 [1] CRAN (R 4.1.0) +## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) +## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) +## microbenchmark * 1.4-7 2019-09-24 [1] CRAN (R 4.1.0) +## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0) +## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) +## palmerpenguins * 0.1.0 2020-07-23 [1] CRAN (R 4.1.0) +## pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0) +## pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.1.0) +## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) +## pkgload 1.2.1 2021-04-06 [1] CRAN (R 4.1.0) +## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) +## processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0) +## ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0) +## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) +## R6 2.5.1 2021-08-19 [1] CRAN 
(R 4.1.0) +## rcfss * 0.2.1 2021-11-15 [1] local +## Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) +## readr * 2.0.2 2021-09-27 [1] CRAN (R 4.1.0) +## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0) +## remotes 2.4.0 2021-06-02 [1] CRAN (R 4.1.0) +## reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) +## rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0) +## rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.0) +## rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0) +## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) +## rvest 1.0.1 2021-07-26 [1] CRAN (R 4.1.0) +## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.0) +## scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) +## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.1.0) +## stringi 1.7.5 2021-10-04 [1] CRAN (R 4.1.0) +## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) +## testthat 3.0.4 2021-07-01 [1] CRAN (R 4.1.0) +## tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.1.0) +## tidyr * 1.1.4 2021-09-27 [1] CRAN (R 4.1.0) +## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) +## tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0) +## tzdb 0.1.2 2021-07-20 [1] CRAN (R 4.1.0) +## usethis 2.0.1 2021-02-10 [1] CRAN (R 4.1.0) +## utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) +## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) +## withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) +## xfun 0.29 2021-12-14 [1] CRAN (R 4.1.0) +## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) +## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) ## -## [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library +## [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library ``` diff --git a/content/notes/iteration/index_files/figure-html/preallocate-1.png b/content/notes/iteration/index_files/figure-html/preallocate-1.png index 4a2891c9..fbc91e41 100644 Binary files a/content/notes/iteration/index_files/figure-html/preallocate-1.png and b/content/notes/iteration/index_files/figure-html/preallocate-1.png differ diff --git a/static/slides/introduction-to-computing-for-the-social-sciences/index.Rmd 
b/static/slides/introduction-to-computing-for-the-social-sciences/index.Rmd index 091fc727..687b5c94 100644 --- a/static/slides/introduction-to-computing-for-the-social-sciences/index.Rmd +++ b/static/slides/introduction-to-computing-for-the-social-sciences/index.Rmd @@ -62,33 +62,31 @@ print("Hello world!") --- -```{r auto-example, eval = FALSE} +```{r penguins-example, eval = FALSE, warning = FALSE, message = FALSE} # load packages library(tidyverse) +library(palmerpenguins) library(broom) # estimate and print the linear model -lm(hwy ~ displ, data = mpg) %>% +lm(body_mass_g ~ flipper_length_mm, data = penguins) %>% tidy() %>% - mutate(term = c("Intercept", "Engine displacement (in liters)")) %>% - knitr::kable(digits = 2, - col.names = c("Variable", "Estimate", "Standard Error", - "T-statistic", "P-Value"), - format = "html") + mutate(term = c("Intercept", "Flipper length (millimeters)")) %>% + knitr::kable(digits = 2, col.names = c("Variable", "Estimate", "Standard Error", + "T-statistic", "P-Value")) # visualize the relationship -ggplot(data = mpg, mapping = aes(displ, hwy)) + - geom_point(mapping = aes(color = class)) + - geom_smooth(method = "lm", se = FALSE, - color = "black", alpha = .25) + - labs(x = "Engine displacement (in liters)", - y = "Highway miles per gallon", - color = "Car type") +ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g)) + + geom_point(mapping = aes(color = species)) + + geom_smooth(method = "lm", se = FALSE, color = "black", alpha = .25) + + labs(x = "Flipper length (in millimeters)", + y = "Body mass (in grams)", + color = "Species") ``` --- -```{r auto-example, fig.height = 6, echo = FALSE, message = FALSE} +```{r penguins-example, fig.height = 6, echo = FALSE, warning = FALSE, message = FALSE} ``` --- diff --git a/static/slides/introduction-to-computing-for-the-social-sciences/index.html b/static/slides/introduction-to-computing-for-the-social-sciences/index.html index 1ad71aac..7de44e54 100644 --- 
a/static/slides/introduction-to-computing-for-the-social-sciences/index.html +++ b/static/slides/introduction-to-computing-for-the-social-sciences/index.html @@ -72,58 +72,34 @@ ```r # load packages library(tidyverse) +library(palmerpenguins) library(broom) # estimate and print the linear model -lm(hwy ~ displ, data = mpg) %>% +lm(body_mass_g ~ flipper_length_mm, data = penguins) %>% tidy() %>% - mutate(term = c("Intercept", "Engine displacement (in liters)")) %>% - knitr::kable(digits = 2, - col.names = c("Variable", "Estimate", "Standard Error", - "T-statistic", "P-Value"), - format = "html") + mutate(term = c("Intercept", "Flipper length (millimeters)")) %>% + knitr::kable(digits = 2, col.names = c("Variable", "Estimate", "Standard Error", + "T-statistic", "P-Value")) # visualize the relationship -ggplot(data = mpg, mapping = aes(displ, hwy)) + - geom_point(mapping = aes(color = class)) + - geom_smooth(method = "lm", se = FALSE, - color = "black", alpha = .25) + - labs(x = "Engine displacement (in liters)", - y = "Highway miles per gallon", - color = "Car type") +ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g)) + + geom_point(mapping = aes(color = species)) + + geom_smooth(method = "lm", se = FALSE, color = "black", alpha = .25) + + labs(x = "Flipper length (in millimeters)", + y = "Body mass (in grams)", + color = "Species") ``` --- -<table> - <thead> - <tr> - <th style="text-align:left;"> Variable </th> - <th style="text-align:right;"> Estimate </th> - <th style="text-align:right;"> Standard Error </th> - <th style="text-align:right;"> T-statistic </th> - <th style="text-align:right;"> P-Value </th> - </tr> - </thead> -<tbody> - <tr> - <td style="text-align:left;"> Intercept </td> - <td style="text-align:right;"> 35.70 </td> - <td style="text-align:right;"> 0.72 </td> - <td style="text-align:right;"> 49.55 </td> - <td style="text-align:right;"> 0 </td> - </tr> - <tr> - <td style="text-align:left;"> Engine displacement (in liters) </td> - 
<td style="text-align:right;"> -3.53 </td> - <td style="text-align:right;"> 0.19 </td> - <td style="text-align:right;"> -18.15 </td> - <td style="text-align:right;"> 0 </td> - </tr> -</tbody> -</table> - -<img src="index_files/figure-html/auto-example-1.png" width="864" style="display: block; margin: auto;" /> + +|Variable | Estimate| Standard Error| T-statistic| P-Value| +|:----------------------------|--------:|--------------:|-----------:|-------:| +|Intercept | -5780.83| 305.81| -18.90| 0| +|Flipper length (millimeters) | 49.69| 1.52| 32.72| 0| + +<img src="index_files/figure-html/penguins-example-1.png" width="864" style="display: block; margin: auto;" /> --- diff --git a/static/slides/introduction-to-computing-for-the-social-sciences/index_files/figure-html/penguins-example-1.png b/static/slides/introduction-to-computing-for-the-social-sciences/index_files/figure-html/penguins-example-1.png new file mode 100644 index 00000000..38500c0f Binary files /dev/null and b/static/slides/introduction-to-computing-for-the-social-sciences/index_files/figure-html/penguins-example-1.png differ diff --git a/static/slides/introduction-to-computing-for-the-social-sciences/index_files/figure-html/sesame-good-1.png b/static/slides/introduction-to-computing-for-the-social-sciences/index_files/figure-html/sesame-good-1.png index 08add97f..d7dd4c60 100644 Binary files a/static/slides/introduction-to-computing-for-the-social-sciences/index_files/figure-html/sesame-good-1.png and b/static/slides/introduction-to-computing-for-the-social-sciences/index_files/figure-html/sesame-good-1.png differ