forked from avehtari/BDA_R_demos
---
title: "Bayesian data analysis demo 6.3"
author: "Aki Vehtari, Markus Paasiniemi"
date: "`r format(Sys.Date())`"
output:
  html_document:
    theme: readable
    code_download: true
---
## Posterior predictive checking demo
Light speed example with a poorly chosen test statistic.

ggplot2 is used for plotting and tidyr for manipulating data frames; latex2exp formats the axis label and rprojroot locates the data file.
```{r setup, message=FALSE, error=FALSE, warning=FALSE}
library(ggplot2)
theme_set(theme_minimal())
library(tidyr)
library(latex2exp)
library(rprojroot)
root <- has_file(".BDA_R_demos_root")$make_fix_file()
```
Data
```{r }
y <- read.table(root("demos_ch6","light.txt"))$V1
```
Sufficient statistics
```{r }
n <- length(y)
s <- sd(y)
my <- mean(y)
```
Replicated data
```{r }
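# 1000 replicated data sets of size n (one per column): Student's t draws
# with n-1 degrees of freedom, scaled by s*sqrt(1+1/n) and shifted by my
sampt <- replicate(1000, rt(n, n-1)*sqrt(1+1/n)*s+my) %>%
  as.data.frame()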
```
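
As a sketch of where the draws in the chunk above come from: assuming a normal model with the usual noninformative prior (the model is not stated in this file, so this is an assumption), the posterior predictive distribution of a new observation is a shifted and scaled Student's t,

$$
y^{\mathrm{rep}} \mid y \sim t_{n-1}\!\left(\bar{y},\ \left(1 + \tfrac{1}{n}\right) s^2\right),
\qquad \text{equivalently} \qquad
y^{\mathrm{rep}} = \bar{y} + s \sqrt{1 + \tfrac{1}{n}}\; t, \quad t \sim t_{n-1}.
$$

Here $\bar{y}$ corresponds to `my`, $s$ to `s`, and the $t$ draws to `rt(n, n-1)`.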
The test statistic here is the sample variance. This is a poor choice
because the model has a variance parameter, and the posterior of that
parameter has already been updated with the same data that are now used in the check.
```{r }
sampt_vars <- data.frame(x = sapply(sampt, var))
```
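
For contrast, a statistic that is not tied directly to a model parameter, such as the smallest observation, could be evaluated on replicates in the same way. The following is only a minimal sketch: `rep_stat` is a hypothetical helper that is not part of the original demo, and `y_demo` is simulated stand-in data, not the light speed measurements.

```{r }
# Hypothetical helper (not in the original demo): apply a test statistic
# to each replicated data set, stored as the columns of a data frame.
rep_stat <- function(replicates, stat) {
  vapply(replicates, stat, numeric(1))
}

# Stand-in data for illustration only (simulated, NOT the light speed data)
set.seed(1)
y_demo <- rnorm(20, mean = 25, sd = 10)
n_demo <- length(y_demo)

# 200 replicated data sets, generated with the same recipe as sampt
rep_demo <- as.data.frame(replicate(
  200,
  rt(n_demo, n_demo - 1) * sqrt(1 + 1 / n_demo) * sd(y_demo) + mean(y_demo)
))

# Smallest value in each replicate, to be compared with min(y_demo)
min_rep <- rep_stat(rep_demo, min)
```

With the objects defined in this document, the corresponding call would be `rep_stat(sampt, min)`, to be compared with `min(y)`.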
Plot the test statistic for the data and for the replicates.
The vertical line marks the value for the original data, and
the histogram shows the values for the replicated data.
```{r }
title1 <- 'Light speed example with poorly chosen test statistic
Pr(T(yrep,theta) <= T(y,theta)|y)=0.42'
ggplot(data = sampt_vars) +
  geom_histogram(aes(x = x), fill = 'steelblue',
                 color = 'black', binwidth = 6) +
  geom_vline(aes(xintercept = x), data = data.frame(x = s^2),
             color = 'red') +
  labs(x = TeX('Variance of \\mathit{y} and \\mathit{y}^{\\mathrm{rep}}'),
       y = '', title = title1) +
  scale_y_continuous(breaks = NULL)
```
```
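
The probability in the plot title is written as a fixed number. It can also be estimated from the replicates as the fraction of replicated statistics that are at or below the observed one. A minimal sketch, where `ppc_pvalue` is a hypothetical helper name that does not appear in the original demo:

```{r }
# Hypothetical helper (not in the original demo): Monte Carlo estimate of
# Pr(T(yrep) <= T(y) | y) from a vector of replicated test statistics.
ppc_pvalue <- function(t_rep, t_obs) {
  mean(t_rep <= t_obs)
}

# Toy check: two of the four values are at or below 2.5
p_toy <- ppc_pvalue(c(1, 2, 3, 4), 2.5)
```

In this document the call would be `ppc_pvalue(sampt_vars$x, s^2)`. Because the replicates are random draws, the estimate changes slightly between runs and can be compared with the value quoted in the title.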