title

author

date

output

Lab 4

Daniel Lewinsohn and Henry Jones

2023-05-11

html_document

keep_md
true

results <- read.csv("/Users/henryjones/Desktop/Math_CC/MA_417/results_2019.csv")
results2 <- subset(results, neutral == FALSE)
results2$home_win <- 1*(results2$home_score - results2$away_score > 0)
mean(results2$home_win)

## [1] 0.5286236

Exercise 1

Use an exact sampling distribution for p̂= X to construct a 95% confidence interval for p. Hint: Use the binomial distribution in your computations.

library(rootSolve)
y <- sum(results2$home_win)
n <- length(results2$home_win)
a <- .05
f1 <- function(x){pbinom(y-1, n, x) - 1+a/2}
f2 <- function(x){pbinom(y, n, x) - (a/2)}
uniroot(f1, interval=c(0,1))

## $root
## [1] 0.4938007
## 
## $f.root
## [1] 5.920902e-05
## 
## $iter
## [1] 13
## 
## $init.it
## [1] NA
## 
## $estim.prec
## [1] 6.103516e-05

uniroot(f2, interval=c(0,1))

## $root
## [1] 0.5632077
## 
## $f.root
## [1] 5.106583e-05
## 
## $iter
## [1] 13
## 
## $init.it
## [1] NA
## 
## $estim.prec
## [1] 6.103516e-05

Exercise 2

Use a large-sample approximation for the sampling distribution of p̂=X to construct a 95% confidence interval for p.

X_bar follow N(p, sigma2 / n).

c <- qnorm(.975, 0, 1)

x_bar <- mean(results2$home_win)
var_est <- var(results2$home_win)
n <- length(results2$home_win)
lower_interval <- x_bar - ((c * sqrt(var_est)) / sqrt(n))
higher_interval <- x_bar + ((c * sqrt(var_est)) / sqrt(n))
print(lower_interval)

## [1] 0.4944573

print(higher_interval)

## [1] 0.56279

Our low and high bounds for our 95% confidence interval.

Exercise 3

set.seed(8)
X_bars <- numeric(20)

n <- length(results2$home_win)

for(i in 1:1000){
  samp <- sample(results2$home_win, n, replace=T)
  X_bars[i] <- mean(samp)
}

q_one <- quantile(X_bars, .025)
q_two <- quantile(X_bars, .975)

print(q_one)

##      2.5% 
## 0.4957369

print(q_two)

##     97.5% 
## 0.5651949

Exercise 4

We use Bernoulli with sample mean as our distribution to generate samples.

set.seed(8)
p <- mean(results2$home_win) # use sample mean
X_bars <- numeric(20)

n <- length(results2$home_win)

for(i in 1:1000){
  samp <- rbinom(n, 1, p)
  X_bars[i] <- mean(samp)
}

q_one <- quantile(X_bars, .025)
q_two <- quantile(X_bars, .975)

print(q_one)

##      2.5% 
## 0.4957065

print(q_two)

##     97.5% 
## 0.5615104

Exercise 5

# Beta(1,1) is uniform
a <- 1
b <- 1

post_a <- a + sum(results2$home_win)
post_b <- b + length(results2$home_win) - sum(results2$home_win)

c_one <- qbeta(.025, post_a, post_b)
c_two <- qbeta(.975, post_a, post_b)

print(c_one)

## [1] 0.4944136

print(c_two)

## [1] 0.5625631

Use one with informative prior that has about expected value equal to sample mean. Beta(10, 10)

a <- 10
b <- 10

post_a <- a + sum(results2$home_win)
post_b <- b + length(results2$home_win) - sum(results2$home_win)

c_one <- qbeta(.025, post_a, post_b)
c_two <- qbeta(.975, post_a, post_b)

print(c_one)

## [1] 0.4941689

print(c_two)

## [1] 0.5615911

Exercise 6

Large sample approximation CI: [0.4944573, 0.56279]

Non-parametric bootstrap CI: [0.4957369, 0.5651949]

Parametric bootstrap CI: [0.4957065, 0.5615104]

Bayesian with uninformative prior CI: [0.4944136, 0.5625631]

Bayesian with informative prior CI: [0.4941689, 0.5615911]

There are not any huge differences between the confidence/credible intervals from all of the methods. This likely due to the large sample size. There is not a way to know which procedure is best for this problem, as we do not know the true value of p, because this is real data. With small n, the large sample and bootstraps would not work very well. However, the Bayesian approaches could still do well, especially with an informative prior. The exact solution would still work well with small n.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lab4.md

lab4.md

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Exercise 6

Files

lab4.md

Latest commit

History

lab4.md

File metadata and controls

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Exercise 6