forked from rdpeng/RepData_PeerAssessment1
-
Notifications
You must be signed in to change notification settings - Fork 0
/
PA1_template.Rmd
69 lines (61 loc) · 2.26 KB
/
PA1_template.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
---
title: "Reproducible Research: Peer Assessment 1"
output:
html_document:
keep_md: true
---
## Loading and preprocessing the data
```{r, echo=TRUE}
setwd('C:/Users/kfollmer/Documents/R/Data Science Course/Reproducible Research/RepData_PeerAssessment1')
data <- read.csv('activity.csv')
dat_clean <- data[!is.na(data$steps),]
```
## What is mean total number of steps taken per day?
First, I calculate the sum of steps by the different dates. Then I plot that information on a histogram. Lastly, I calculated some summary statistics
```{r stepSummary}
d1 <- aggregate(dat_clean$steps,list(date=dat_clean$date),sum)
as.numeric(d1$x)
hist(d1$x, xlab = "Total Steps", main = "Histogram of Total Steps Taken by Day Total")
mean(d1$x)
median(d1$x)
```
## What is the average daily activity pattern?
```{r linegraph}
d3 <- cbind(dat_clean$interval,dat_clean$steps)
d4 <- aggregate(d3, list(Interval = d3[,1]),mean)
d4 <- d4[,2:3]
colnames(d4) <- c("interval", "steps")
plot(d4$interval, d4$steps,xlab = "Interval",ylab = "Mean of Steps",type = 'l')
maxInterval <- d4[d4$steps==max(d4$steps),]
maxInterval
```
## Imputing missing values
How many observations are missing steps values?
```{r missing}
nrow(data[is.na(data$steps),])
```
Add the mean value for that interval to account for the missing intervals
```{r edit}
edit <- data[is.na(data$steps),2:3]
editUpdate <- merge(edit,d4,by="interval")
library(data.table)
editUpdate <- data.table(editUpdate)
setcolorder(editUpdate,c("steps","date","interval"))
proxyDat <- rbind(dat_clean,editUpdate, fill=TRUE)
```
## Are there differences in activity patterns between weekdays and weekends?
First, we set up the data
```{r dowLabel}
d5 <- cbind(proxyDat,dow = weekdays(as.Date(proxyDat$date)))
dow <- c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")
label <- c("weekday","weekday","weekday","weekday","weekday","weekend","weekend")
lookup <- cbind(dow,label)
d6 <- merge(d5, lookup, by="dow")
```
Then, I plot using lattice
```{r weekdayPlot}
library(lattice)
d6 <- data.frame(d6)
d6a <- aggregate(steps~interval+label,d6,mean)
xyplot(d6a$steps~d6a$interval|d6a$label, par = c(2,1),type = 'l', xlab = "Interval", ylab = "Average Steps")
```