-
Notifications
You must be signed in to change notification settings - Fork 6
/
Copy pathYOURNAME-lab02.Rmd
103 lines (68 loc) · 3.5 KB
/
YOURNAME-lab02.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
---
title: "YOUR NAME: Lab 02 for 431"
author: "YOUR NAME"
date: "`r Sys.Date()`"
output:
html_document:
toc: yes
code_folding: show
---
```{r setup, include=FALSE, message = FALSE}
knitr::opts_chunk$set(comment=NA)
options(width = 70)
```
## Setup
```{r load_packages, message = FALSE, warning = FALSE}
library(tidyverse)
## always need tidyverse, can include other packages too
## which should always be inserted above the tidyverse
```
## Import the `lab02_counties.csv` data
```{r}
lab02_data <- read_csv("lab02_counties.csv")
```
## First Steps
Make sure you have changed the title and author of this, back in the YAML section at the very top of this file, and make sure you change this file's name by inserting your actual name, rather than just calling it `YOURNAME-lab02.Rmd`. Make the document as attractive as you can. Finally, make sure you have deleted all of the instructions, including these before you submit your final work.
# Question 1
Begin typing your answer to Question 1 here. Use complete sentences to explain what the result of your code means in terms of answering the question we asked.
If you need to build some code to answer a question (as you will in Questions 1-3 and 5-6), create a chunk of code like the one below, which creates a new data frame in R (called a tibble) that contains the 83 counties within the states of New York and New Jersey. Of course, this example doesn't address our question 1, but may be helpful to you anyway.
```{r q1_create_nynj}
nynj_counties <- lab02_data %>%
filter(state == "NY" | state == "NJ")
nrow(nynj_counties)
```
# Question 2
Here's where you should type in your answer to Question 2. You will need to include code again. Here's an approach that tells us which state (of New York or New Jersey) has more counties.
```{r}
nynj_counties %>% count(state)
```
# Question 3
Here's where you should type in your answer to Question 3. You will need to include some code again. Here's an approach that yields the percentage of female residents for Suffolk County in the state of New York, where Dr. Love was born.
```{r q3_metro_suffolk}
nynj_counties %>%
filter(state == "NY") %>%
filter(county_name == "Suffolk County") %>%
select(state, county_name, female_pct)
```
# Question 4
Your answer to Question 4 goes here. Here's a histogram of the natural logarithm of the percentage of county residents who are female across the counties of New York and New Jersey. I don't see any good reason to take a natural logarithm here, and of course, again, that's not what you'll need to be doing, and the "improvements" I've made here are pretty weak. I'm hoping you'll do better.
```{r q3_popdensity_histogram}
ggplot(nynj_counties, aes(x = log(female_pct))) +
geom_histogram(bins = 15, col = "white") +
theme_bw() +
labs(title = "This deserves a better title.",
y = "Number of Counties")
```
# Question 5
Replace this text with your response to Question 5.
# Question 6
Your code (and a brief description of your approach, written in complete sentences) for Question 6 goes here.
# Question 7
Replace this text with your response to Question 7.
# Question 8
Replace this text with your response to Question 8.
# Session Information
Adding a `sessionInfo()` chunk at the end of your document helps with reproducibility. Take a look to see what it produces, and include the command at the end of this Lab, please.
```{r}
sessionInfo()
```