extra chapter notes 8 and 13
avehtari committed Dec 12, 2019
1 parent eca25b9 commit 4584582
Showing 4 changed files with 261 additions and 0 deletions.
Binary file added chapter_notes/BDA_notes_ch13.pdf
Binary file not shown.
138 changes: 138 additions & 0 deletions chapter_notes/BDA_notes_ch13.tex
@@ -0,0 +1,138 @@
\documentclass[a4paper,11pt,english]{article}

\usepackage{babel}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{times}
\usepackage{amsmath}
\usepackage{microtype}
\usepackage{url}
\urlstyle{same}

\usepackage[bookmarks=false]{hyperref}
\hypersetup{%
bookmarksopen=true,
bookmarksnumbered=true,
pdftitle={Bayesian data analysis},
pdfsubject={Comments},
pdfauthor={Aki Vehtari},
pdfkeywords={Bayesian probability theory, Bayesian inference, Bayesian data analysis},
pdfstartview={FitH -32768}
}


% if not draft, smaller printable area makes the paper more readable
\topmargin -4mm
\oddsidemargin 0mm
\textheight 225mm
\textwidth 160mm

%\parskip=\baselineskip
\def\eff{\mathrm{rep}}

\DeclareMathOperator{\E}{E}
\DeclareMathOperator{\Var}{Var}
\DeclareMathOperator{\var}{var}
\DeclareMathOperator{\Sd}{Sd}
\DeclareMathOperator{\sd}{sd}
\DeclareMathOperator{\Bin}{Bin}
\DeclareMathOperator{\Beta}{Beta}
\DeclareMathOperator{\Invchi2}{Inv-\chi^2}
\DeclareMathOperator{\NInvchi2}{N-Inv-\chi^2}
\DeclareMathOperator{\logit}{logit}
\DeclareMathOperator{\N}{N}
\DeclareMathOperator{\U}{U}
\DeclareMathOperator{\tr}{tr}
%\DeclareMathOperator{\Pr}{Pr}
\DeclareMathOperator{\trace}{trace}
\DeclareMathOperator{\rep}{\mathrm{rep}}

\pagestyle{empty}

\begin{document}
\thispagestyle{empty}

\section*{Bayesian data analysis -- reading instructions 13}
\smallskip
{\bf Aki Vehtari}
\smallskip

\subsection*{Chapter 13: Modal and distributional approximations}

Chapter 4 presented the normal approximation to the posterior at the
mode (also known as the Laplace approximation). Chapter 13 discusses
modal and distributional approximations in more detail.
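
As a reminder, the normal approximation at the mode can be sketched as
follows. The notation is an assumption made for this note ($\hat\theta$
for the posterior mode, $I(\theta)$ for the observed information) and
may differ in detail from the book:
\begin{align*}
  p(\theta \mid y) &\approx \N\bigl(\theta \mid \hat\theta, [I(\hat\theta)]^{-1}\bigr), &
  I(\theta) &= -\frac{d^2}{d\theta^2} \log p(\theta \mid y).
\end{align*}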

Outline of Chapter 13
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item[13.1] Finding posterior modes
\begin{itemize}
\item[-] Newton's method is very fast if the distribution is close to
normal and the computation of the second derivatives is fast.
\item[-] Stan uses limited-memory Broyden-Fletcher-Goldfarb-Shanno
(L-BFGS), a quasi-Newton method that needs only the first
derivatives (provided by Stan's autodiff). L-BFGS is known for good
performance on a wide variety of functions.
\end{itemize}
\item[13.2] Boundary-avoiding priors for modal summaries
\begin{itemize}
\item[-] Although full integration is preferred, sometimes optimization
of some parameters may be sufficient and faster, and then
boundary-avoiding priors may be useful.
\end{itemize}
\item[13.3] Normal and related mixture approximations
\begin{itemize}
\item[-] Discusses how the normal approximation can be used to
approximate integrals of a smooth function times the posterior.
\item[-] Discusses mixture and $t$ approximations.
\end{itemize}
\item[13.4] Finding marginal posterior modes using EM
\begin{itemize}
\item[-] Expectation maximization (EM) is less important in the era of
efficient probabilistic programming frameworks, but it can
sometimes be useful for extra efficiency.
\end{itemize}
\item[13.5] Conditional and marginal posterior approximations
\begin{itemize}
\item[-] Even in the era of efficient probabilistic programming, the
methods discussed in this section can produce very large speedups
for a large set of commonly used models. These methods are an
important part of the popular INLA software and are also coming to
Stan to speed up latent Gaussian variable models.
\end{itemize}
\item[13.6] Example: hierarchical normal model
\item[13.7] Variational inference
\begin{itemize}
\item[-] Variational inference (VI) is very popular in machine
learning, and this section presents it in terms of BDA. Automatic
differentiation variational inference (ADVI) in Stan was developed
after BDA3 was published.
\end{itemize}
\item[13.8] Expectation propagation
\begin{itemize}
\item[-] Practical, efficient computation for expectation propagation
(EP) is applicable to a more limited set of models than post-BDA3
black-box VI, but for those models EP provides a better posterior
approximation. Variants of EP can be used to parallelize
any Bayesian computation for hierarchical models.
\end{itemize}
\item[13.9] Other approximations
\begin{itemize}
\item[-] Brief mentions of INLA (which uses the methods discussed in
13.5), CCD (a deterministic adaptive quadrature approach), and ABC
(inference when you can only sample from the generative model).
\end{itemize}
\item[13.10] Unknown normalizing factors
\begin{itemize}
\item[-] Often the normalizing factor is not needed, but it can be
estimated using importance, bridge or path sampling.
\end{itemize}
\end{list}
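
As a sketch of the mode finding in 13.1, write
$L(\theta) = \log p(\theta \mid y)$ (notation assumed for this note).
Newton's method then iterates
\begin{equation*}
  \theta^{t} = \theta^{t-1} - [L''(\theta^{t-1})]^{-1} L'(\theta^{t-1}),
\end{equation*}
where $L'$ is the gradient and $L''$ is the matrix of second
derivatives. If the log posterior is exactly quadratic (a normal
posterior), a single step reaches the mode from any starting point,
which is why the method is fast for close-to-normal distributions.
Quasi-Newton methods such as L-BFGS replace $L''$ with an approximation
built from successive gradients, so only first derivatives are needed.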


\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End:
Binary file added chapter_notes/BDA_notes_ch8.pdf
Binary file not shown.
123 changes: 123 additions & 0 deletions chapter_notes/BDA_notes_ch8.tex
Original file line number Diff line number Diff line change
@@ -0,0 +1,123 @@
\documentclass[a4paper,11pt,english]{article}

\usepackage{babel}
\usepackage[latin1]{inputenc}
\usepackage[T1]{fontenc}
% \usepackage[T1,mtbold,lucidacal,mtplusscr,subscriptcorrection]{mathtime}
\usepackage{times}
\usepackage{amsmath}
\usepackage{microtype}
\usepackage{url}
\urlstyle{same}

\usepackage[bookmarks=false]{hyperref}
\hypersetup{%
bookmarksopen=true,
bookmarksnumbered=true,
pdftitle={Bayesian data analysis},
pdfsubject={Comments},
pdfauthor={Aki Vehtari},
pdfkeywords={Bayesian probability theory, Bayesian inference, Bayesian data analysis},
pdfstartview={FitH -32768}
}


% if not draft, smaller printable area makes the paper more readable
\topmargin -4mm
\oddsidemargin 0mm
\textheight 225mm
\textwidth 160mm

%\parskip=\baselineskip
\def\eff{\mathrm{rep}}

\DeclareMathOperator{\E}{E}
\DeclareMathOperator{\Var}{Var}
\DeclareMathOperator{\var}{var}
\DeclareMathOperator{\Sd}{Sd}
\DeclareMathOperator{\sd}{sd}
\DeclareMathOperator{\Bin}{Bin}
\DeclareMathOperator{\Beta}{Beta}
\DeclareMathOperator{\Invchi2}{Inv-\chi^2}
\DeclareMathOperator{\NInvchi2}{N-Inv-\chi^2}
\DeclareMathOperator{\logit}{logit}
\DeclareMathOperator{\N}{N}
\DeclareMathOperator{\U}{U}
\DeclareMathOperator{\tr}{tr}
%\DeclareMathOperator{\Pr}{Pr}
\DeclareMathOperator{\trace}{trace}
\DeclareMathOperator{\rep}{\mathrm{rep}}

\pagestyle{empty}

\begin{document}
\thispagestyle{empty}

\section*{Bayesian data analysis -- reading instructions 8}
\smallskip
{\bf Aki Vehtari}
\smallskip

\subsection*{Chapter 8}

In the earlier chapters it was assumed that the data collection is
ignorable. Chapter 8 explains when data collection can be ignorable
and when we also need to model the data collection.
We don't have time to go through Chapter 8 in the BDA course at Aalto,
but it is highly recommended that you read it at the end of the course
or after it. The most important parts are 8.1, 8.5, pp 220--222 of 8.6,
and 8.7; you can get back to the other sections later.

Outline of Chapter 8 (* denotes the most important parts)
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item 8.1 Bayesian inference requires a model for data collection (*)
\item 8.2 Data-collection models and ignorability
\item 8.3 Sample surveys
\item 8.4 Designed experiments
\item 8.5 Sensitivity and the role of randomization (*)
\item 8.6 Observational studies (* pp 220--222)
\item 8.7 Censoring and truncation (*)
\end{list}

The most important terms in the chapter
\begin{list}{$\bullet$}{\parsep=0pt\itemsep=2pt}
\item observed data
\item complete data
\item missing data
\item stability assumption
\item data model
\item inclusion model
\item complete data likelihood
\item observed data likelihood
\item finite-population and superpopulation inference
\item ignorability
\item ignorable designs
\item propensity score
\item sample surveys
\item random sampling of a finite population
\item stratified sampling
\item cluster sampling
\item designed experiments
\item complete randomization
\item randomized blocks and Latin squares
\item sequential designs
\item randomization given covariates
\item observational studies
\item censoring
\item truncation
\item missing completely at random
\end{list}
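
To connect several of these terms, here is a short sketch of how the
likelihoods and ignorability relate. The notation is assumed for this
note and should be checked against the book: $y = (y_{\mathrm{obs}},
y_{\mathrm{mis}})$ is the complete data, $I$ is the inclusion indicator,
$x$ denotes fully observed covariates, $\theta$ are the data-model
parameters and $\phi$ the inclusion-model parameters.
\begin{align*}
  p(y, I \mid x, \theta, \phi)
    &= p(y \mid x, \theta)\, p(I \mid x, y, \phi)
    && \text{(complete-data likelihood)} \\
  p(y_{\mathrm{obs}}, I \mid x, \theta, \phi)
    &= \int p(y \mid x, \theta)\, p(I \mid x, y, \phi)\, dy_{\mathrm{mis}}
    && \text{(observed-data likelihood)}
\end{align*}
If the inclusion depends only on observed quantities,
$p(I \mid x, y, \phi) = p(I \mid x, y_{\mathrm{obs}}, \phi)$, the
inclusion model factors out of the integral:
\begin{equation*}
  p(y_{\mathrm{obs}}, I \mid x, \theta, \phi)
    = p(I \mid x, y_{\mathrm{obs}}, \phi)\, p(y_{\mathrm{obs}} \mid x, \theta).
\end{equation*}
If in addition $\phi$ and $\theta$ are independent a priori given $x$,
then $p(\theta \mid x, y_{\mathrm{obs}}, I) = p(\theta \mid x,
y_{\mathrm{obs}})$, that is, the data collection is ignorable for
inference about $\theta$.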

% Gelman: ``All contexts where the model is fit to data that are not
% necessarily representative of the population that is the target of
% study. The key idea is to include in the Bayesian model an inclusion
% variable with a probability distribution that represents the process
% by which data become observed.''

\end{document}


%%% Local Variables:
%%% TeX-PDF-mode: t
%%% TeX-master: t
%%% End:
