
\documentclass[a4paper,11pt,english]{article}

\usepackage{babel}
%\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{times}
\usepackage{amsmath}
\usepackage{microtype}
\usepackage{url}
\urlstyle{same}

\usepackage[bookmarks=false]{hyperref}
\hypersetup{%
  bookmarksopen=true,
  bookmarksnumbered=true,
  pdftitle={Bayesian data analysis},
  pdfsubject={Comments},
  pdfauthor={Aki Vehtari},
  pdfkeywords={Bayesian probability theory, Bayesian inference, Bayesian data analysis},
  pdfstartview={FitH -32768},
    colorlinks=true,
    linkcolor=black,
    citecolor=black,
    filecolor=black,
    urlcolor=black,
}


% if not draft, smaller printable area makes the paper more readable
\topmargin -4mm
\oddsidemargin 0mm
\textheight 225mm
\textwidth 160mm

%\parskip=\baselineskip

\DeclareMathOperator{\E}{E}
\DeclareMathOperator{\Var}{Var}
\DeclareMathOperator{\var}{var}
\DeclareMathOperator{\Sd}{Sd}
\DeclareMathOperator{\sd}{sd}
\DeclareMathOperator{\Bin}{Bin}
\DeclareMathOperator{\Beta}{Beta}
\DeclareMathOperator{\Invchi2}{Inv-\chi^2}
\DeclareMathOperator{\NInvchi2}{N-Inv-\chi^2}
\DeclareMathOperator{\logit}{logit}
\DeclareMathOperator{\N}{N}
\DeclareMathOperator{\U}{U}
\DeclareMathOperator{\tr}{tr}
%\DeclareMathOperator{\Pr}{Pr}
\DeclareMathOperator{\trace}{trace}

\pagestyle{empty}

\begin{document}
\thispagestyle{empty}

\section*{Bayesian data analysis -- reading instructions 3} 
\smallskip
{\bf Aki Vehtari}
\smallskip

\subsection*{Chapter 3}

Outline of Chapter 3
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item 3.1 Marginalisation
\item 3.2 Normal distribution with a noninformative prior (very important)
\item 3.3 Normal distribution with a conjugate prior (very important)
\item 3.4 Multinomial model (can be skipped)
\item 3.5 Multivariate normal with known variance (needed later)
\item 3.6 Multivariate normal with unknown variance (glance through)
\item 3.7 Bioassay example (very important, related to one of the exercises)
\item 3.8 Summary (summary)
\end{list}

The normal model is used extensively as a building block of the models
in later chapters, so it is important to learn it now.
The bioassay example is a good example for illustrating many important
concepts, and it is used in several exercises during the course.

Demos (\url{https://github.com/avehtari/BDA_R_demos} and \url{https://github.com/avehtari/BDA_py_demos})
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item demo3\_1: visualise joint density and marginal densities of
  posterior of normal distribution with unknown mean and variance
\item demo3\_2: visualise factored sampling and corresponding
  marginal and conditional density
\item demo3\_3: visualise marginal distribution of $\mu$ as a mixture of normals
\item demo3\_4: visualise sampling from the posterior predictive distribution
\item demo3\_5: visualise Newcomb's data
\item demo3\_6: visualise posterior in bioassay example
\end{list}

Find all the terms and symbols listed below. When reading the chapter,
write down questions about things that are unclear to you or that you
think might be unclear to others.  See also the additional comments below.
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item marginal distribution/density
\item conditional distribution/density
\item joint distribution/density
\item nuisance parameter
\item mixture
\item normal distribution with a noninformative prior
\item normal distribution with a conjugate prior
\item sample variance
\item sufficient statistics
\item $\mu$, $\sigma^2$, $\bar{y}$, $s^2$
\item a simple normal integral
\item $\Invchi2$
\item factored density
\item $t_{n-1}$
\item degrees of freedom
\item posterior predictive distribution
\item to draw
\item $\NInvchi2$
\item variance matrix $\Sigma$
\item nonconjugate model
\item generalized linear model
\item exchangeable
\item binomial model
\item logistic transformation
\item density at a grid
\end{list}

\subsection*{Conjugate prior for Gaussian distribution}

BDA3 p. 67 mentions that the conjugate prior for the Gaussian
distribution has to have the product form $p(\sigma^2)p(\mu|\sigma^2)$.
The book refers to (3.2) and the discussion that follows it. As an
additional hint, it is useful to think about how the terms $(n-1)s^2$ and
$n(\bar{y}-\mu)^2$ in (3.2) relate to equations (3.3) and (3.4).
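
As a sketch of the hint (written from the standard normal-model results
rather than copied from the book, so check the notation against BDA3),
the likelihood as a function of $(\mu,\sigma^2)$ can be written as
\begin{equation*}
  p(y|\mu,\sigma^2) \propto \sigma^{-n}
  \exp\left(-\frac{1}{2\sigma^2}\left[(n-1)s^2+n(\bar{y}-\mu)^2\right]\right).
\end{equation*}
The term $n(\bar{y}-\mu)^2$ gives a normal shape in $\mu$ whose variance
is proportional to $\sigma^2$, and the remaining factor,
$\sigma^{-n}\exp\left(-(n-1)s^2/(2\sigma^2)\right)$, has the functional
form of a scaled $\Invchi2$ density in $\sigma^2$. A prior built from the
same two pieces, here with illustrative hyperparameters $\mu_0$,
$\kappa_0$, $\nu_0$ and $\sigma_0^2$,
\begin{align*}
  \mu|\sigma^2 &\sim \N(\mu_0,\sigma^2/\kappa_0), \\
  \sigma^2 &\sim \Invchi2(\nu_0,\sigma_0^2),
\end{align*}
multiplies with the likelihood into a posterior of the same form. This
suggests why the dependence of $p(\mu|\sigma^2)$ on $\sigma^2$ cannot be
dropped if the prior is to stay conjugate.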

\subsection*{Trace of square matrix}

The trace of a square matrix, denoted $\trace$, $\tr A$, $\trace(A)$, or
$\tr(A)$, is the sum of its diagonal elements. The derivation of equation
3.11 uses the cyclic property $\tr(ABC) = \tr(CAB) = \tr(BCA)$.
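
As a sketch of where the property enters (the symbol $S_0$ for the
sum-of-squares matrix is an assumption about notation), note that a
scalar equals its own trace, so for observations $y_1,\ldots,y_n$
\begin{align*}
  \sum_{i=1}^n (y_i-\mu)^T\Sigma^{-1}(y_i-\mu)
  &= \sum_{i=1}^n \tr\left((y_i-\mu)^T\Sigma^{-1}(y_i-\mu)\right) \\
  &= \sum_{i=1}^n \tr\left(\Sigma^{-1}(y_i-\mu)(y_i-\mu)^T\right) \\
  &= \tr(\Sigma^{-1}S_0), \qquad
  S_0=\sum_{i=1}^n (y_i-\mu)(y_i-\mu)^T.
\end{align*}
The second step is the cyclic property $\tr(ABC)=\tr(BCA)$ with
$A=(y_i-\mu)^T$, $B=\Sigma^{-1}$ and $C=(y_i-\mu)$, and the last step
uses the linearity of the trace.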

\subsection*{History and naming of distributions}

See \emph{Earliest Known Uses of Some of the Words of Mathematics}
\url{http://jeff560.tripod.com/mathword.html}.

\subsection*{The number of required Monte Carlo draws}

This will be discussed in Chapter 10. Meanwhile, for example, 1000 draws is sufficient.
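
As a rough justification (a sketch that assumes independent draws, which
is a simplifying assumption made here), if
$\theta^{(1)},\ldots,\theta^{(S)}$ are independent draws from the
posterior, the Monte Carlo standard error of the estimate of
$\E(\theta|y)$ is
\begin{equation*}
  \frac{\sd(\theta|y)}{\sqrt{S}}, \qquad
  \text{which for } S=1000 \text{ is about } 0.03\,\sd(\theta|y).
\end{equation*}
Under this assumption, the error from using a finite number of draws is
small compared to the posterior uncertainty itself.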

\subsection*{Bioassay}

The bioassay example is an example of a very common statistical inference
task that is typical in, for example, medicine, pharmacology, health care,
cognitive science, genomics, and industrial processes.

The example is from Racine et al.\ (1986) (see the reference list at the
end of BDA3). A Swiss company classified chemicals into toxicity
categories defined by authorities (such as the EU). The toxicity
classification is based on the lethal dose 50\% (LD50), which is the
amount of a chemical that kills 50\% of the subjects. The smaller the
LD50, the more lethal the chemical. The original paper mentions the
``1983 Swiss poison Regulation'', which defines the following categories
for chemicals given orally to rats (mg/ml):\\
\begin{tabular}{c c}
Class & LD50 (mg/ml) \\
1 & $<5$ \\
2 & 5--50 \\
3 & 50--500 \\
4 & 500--2000 \\
5 & 2000--5000
\end{tabular}\\
To reduce the number of rats needed in the experiments, the company
started to use Bayesian methods. The paper mentions that, at the time,
using just 20 rats to determine the classification was considered very few.
%
The book gives LD50 in log(g/ml). When the result from demo3\_6 is
transformed to the mg/ml scale, we see that the mean LD50 is about 900 and
$p(500<\text{LD50}<2000)\approx 0.99$. Thus, the tested chemical can be
classified as toxicity category 4.
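
As a sketch of the computation behind these numbers (assuming the
logistic dose--response model of the bioassay example with intercept
$\alpha$ and slope $\beta$, and assuming that the book's log is the
natural logarithm), the model is
\begin{align*}
  y_i|\theta_i &\sim \Bin(n_i,\theta_i), \\
  \logit(\theta_i) &= \alpha+\beta x_i,
\end{align*}
where $x_i$ is the dose in log(g/ml). LD50 is the dose at which
$\theta=0.5$, that is, $\logit(0.5)=0=\alpha+\beta x$, so
\begin{equation*}
  x_{\text{LD50}} = -\alpha/\beta, \qquad
  \text{LD50 in mg/ml} = 1000\exp(-\alpha/\beta).
\end{equation*}
For a hypothetical value $x_{\text{LD50}}=-0.1$ this gives
$1000\exp(-0.1)\approx 905$ mg/ml. Applying the transformation to each
posterior draw of $(\alpha,\beta)$ gives draws of LD50 in mg/ml, from
which the mean and the class probabilities can be estimated.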

Note that chemical testing is moving away from rats and other animals
towards, for example, human cells grown on chips, tissue models, and
human blood cells. The human-cell-based approaches are also more
accurate at predicting the effects on humans.

The $\logit$ transformation can be justified information-theoretically
when a binomial likelihood is used.
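
A related observation (a sketch, and not necessarily the
information-theoretic argument itself): with
$\logit(\theta)=\log\left(\theta/(1-\theta)\right)$, the binomial
likelihood can be written as
\begin{equation*}
  p(y|\theta)\propto\theta^y(1-\theta)^{n-y}
  =(1-\theta)^n\exp\left(y\,\logit(\theta)\right),
\end{equation*}
so $\logit(\theta)$ is the natural parameter of the binomial distribution
in its exponential family form, and it maps $\theta\in(0,1)$ to the whole
real line.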

The example code in demo3\_6 can be helpful in exercises related to the
bioassay example.

\subsection*{Bayesian vs. frequentist statements in two group comparisons}

Frank Harrell's recommendations on how to state results in two-group
comparisons are excellent:
\url{http://www.fharrell.com/2017/10/bayesian-vs-frequentist-statements.html}.

\end{document}

%%% Local Variables: 
%%% TeX-PDF-mode: t
%%% TeX-master: t
%%% End: