
Commit

Added material for week 11
wmutschl committed Jan 23, 2024
1 parent 114695a commit 2ddd65d
Showing 9 changed files with 457 additions and 37 deletions.
27 changes: 27 additions & 0 deletions exercises/bayesian_estimation_basics.tex
@@ -0,0 +1,27 @@
\section[Bayesian Estimation Basics]{Bayesian Estimation Basics\label{ex:BayesianEstimationBasics}}
Consider a simple univariate model:
\begin{align*}
y_t = \mu + u_t
\end{align*}
with \(t = 1, 2,\ldots , T\) and \(u_t \sim \mathcal{N}(0,\sigma^2)\).
Assume that \(\sigma^2\) is known.
The objective of an econometrician is to estimate \(\mu\).
\begin{enumerate}
\item How do classical and Bayesian analysis differ?
\item Name the key ingredients for Bayesian estimation.
\item What are \enquote{conjugate priors} and \enquote{natural conjugate priors}?
\item What is the idea of Monte Carlo integration in the context of Bayesian estimation?
\end{enumerate}

\paragraph{Readings}
\begin{itemize}
\item \textcite[Part I]{Greenberg_2008_IntroductionBayesianEconometrics}
\item \textcite[Ch.1-2]{Koop_2003_BayesianEconometrics}
\end{itemize}

\begin{solution}\textbf{Solution to \nameref{ex:BayesianEstimationBasics}}
\ifDisplaySolutions
\input{exercises/bayesian_estimation_basics_solution.tex}
\fi
\newpage
\end{solution}
105 changes: 105 additions & 0 deletions exercises/bayesian_estimation_basics_solution.tex
@@ -0,0 +1,105 @@
\begin{enumerate}
\item In Quantitative Macroeconomics and Econometrics we are concerned with using data to learn about a phenomenon,
e.g.\ the relationship between two macroeconomic variables.
That is: we want to learn about something \emph{unknown} (the parameter \(\mu \)) given something \emph{known} (the data \(y_t\)).
Let's use the sample mean as our estimating function:
\(\hat{\mu}=1/T \sum_{t=1}^T y_t\).
Due to the law of large numbers and the central limit theorem we can derive that
\(\hat{\mu}\sim N(\mu,\frac{\sigma^2}{T})\)
and conduct inference such as computing confidence intervals \([\hat{\mu}\pm 1.96 \frac{\sigma}{\sqrt{T}}]\).
\\
\textbf{Classical/Frequentist approach:} \(\mu \) is a fixed unknown quantity; that is, we think there exists a \emph{true value} that is not random.
On the other hand, the estimating function, \(\hat{\mu}\), is a random variable
and is evaluated via repeated sampling.
In a thought experiment, we would be able to generate a large number of datasets (given the true \(\mu \))
and our confidence interval will contain the true value in 95\% of cases.
The estimator \(\hat{\mu}\) is \emph{best} in the sense of having the highest probability of being close to the true \(\mu \).
\\
\textbf{Bayesian approach:} \(\mu \) is treated as a \emph{random variable};
that is, there is NO true unknown value.
Instead our knowledge about the model parameter \(\mu \) is summarized by a \emph{probability distribution}.
In more detail, this distribution summarizes two sources of information:
\begin{enumerate}
\item prior information: subjective beliefs about how likely different parameter values are (information BEFORE seeing the data)
\item sample information: AFTER seeing the data, we update/revise our prior beliefs
\end{enumerate}
In a sense we explicitly make use of (subjective) probabilities to quantify uncertainty about the parameter.

\item The key ingredients are based on the rules of probability, which imply for two events \(A\) and \(B\):
\(p(A,B)=p(A|B)p(B)\), where \(p(A,B)\) is the joint probability of both events happening simultaneously.
\(p(A|B)\) is the probability of \(A\) occurring conditional on \(B\) having occurred;
and \(p(B)\) is the marginal probability of \(B\).
Alternatively, we can reverse \(A\) and \(B\) to get: \(p(A,B)=p(B|A)p(A)\).
Equating the two expressions gives you \textbf{Bayes' rule}:
\begin{align*}
p(B|A) = \frac{p(A|B)p(B)}{p(A)}
\end{align*}
This rule also holds for continuous variables such as parameters \(\theta \) and data \(y\):
\begin{align*}
p(\theta|y) = \frac{p(y|\theta)p(\theta)}{p(y)}
\end{align*}
That is, the key object of interest is the \textbf{posterior} \(p(\theta|y)\) distribution,
which is the product of the \textbf{likelihood function} \(p(y|\theta)\) and the \textbf{prior density} \(p(\theta)\),
divided by the \textbf{marginal data density} \(p(y)\).
In other words, the prior contains our prior (non-data) information,
whereas the likelihood function is the density of the data conditional on the parameters.
Note that the marginal data density \(p(y)\) can be ignored
as it does not depend on the parameters
(it is just a normalization constant as a probability density integrates to one).
Therefore, we can use the proportional \(\propto \) sign, that is the posterior is proportional to the likelihood times the prior:
\begin{align*}
p(\theta|y) \propto p(y|\theta) p(\theta)
\end{align*}
The posterior summarizes all we know about \(\theta \) after seeing the data.
It combines both data and non-data information.
The equation can be viewed as an updating rule,
where data allows us to update our prior views about \(\theta \).

Note that Bayesians are upfront and rigorous about including non-data information!
The idea is that more information (even if subjective) tends to be better than less.

\item In principle any distribution can be combined with the likelihood to form the posterior.
Some priors are, however, more convenient than others to make use of analytical results.
\\
\textbf{Conjugate priors:} A prior is conjugate if the resulting posterior belongs to the same family of distributions as the prior.
This eases analytical derivations.
\\
\textbf{Natural conjugate priors:} A conjugate prior is called a natural conjugate prior,
if the posterior and the prior have the same functional form as the likelihood function.
That is, the prior can be interpreted as arising from a fictitious dataset from the same data-generating process.
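For instance, in the simple univariate model above with known \(\sigma^2\), a Gaussian prior \(\mu \sim \mathcal{N}(\mu_0,\tau_0^2)\) (with hyperparameters \(\mu_0\) and \(\tau_0^2\) chosen by the researcher) is a natural conjugate prior: combining it with the Gaussian likelihood yields a Gaussian posterior
\begin{align*}
\mu|y \sim \mathcal{N}(\mu_1,\tau_1^2), \qquad
\tau_1^2 = {\left(\frac{1}{\tau_0^2} + \frac{T}{\sigma^2}\right)}^{-1}, \qquad
\mu_1 = \tau_1^2 \left(\frac{\mu_0}{\tau_0^2} + \frac{T\bar{y}}{\sigma^2}\right)
\end{align*}
where \(\bar{y}\) denotes the sample mean; the posterior mean is a precision-weighted average of the prior mean and the sample mean.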

\item The posterior is typically not available analytically and needs to be approximated,
except in special cases (e.g.\ when natural conjugate priors are used).
But, typically we are not interested in the exact shape of the posterior,
but in certain statistics of the posterior distribution such as:
\begin{align*}
E[\theta|y] &= \int_{-\infty}^{\infty} \theta p(\theta|y) d\theta
\\
V[\theta|y] &= \int_{-\infty}^{\infty} \theta^2 p(\theta|y) d\theta - (E(\theta|y))^2
\end{align*}
So we only need to approximate the integrals using numerical methods such as Monte Carlo integration.
That is, IF we had iid draws from the posterior, we could make use of the law of large numbers
and approximate the posterior mean and variance as:
\begin{align*}
E[\theta|y] &\approx \frac{1}{S} \sum_{i=1}^S \theta_i
\\
V[\theta|y] &\approx \frac{1}{S} \sum_{i=1}^S \theta_i^2 - {\left(\frac{1}{S} \sum_{i=1}^S \theta_i\right)}^2
\end{align*}
Or in general for any function:
\begin{align*}
E[f(\theta)|y] = \int_{-\infty}^{\infty} f(\theta) p(\theta|y) d\theta \approx \frac{1}{S} \sum_{s=1}^S f(\theta_s)
\end{align*}
This is the key idea of Monte Carlo integration,
i.e.\ replace the integral by a sum over \(S\) draws from the posterior.
The Central Limit Theorem can then be used to assess the accuracy of this approximation.
But there are two challenges:
\begin{enumerate}
\item How to draw from the posterior?
\item How to make sure that the draws are iid?
\end{enumerate}
The first question can be answered by using suitable \emph{posterior sampling algorithms}
such as direct sampling, importance sampling, Metropolis-Hastings sampling, Gibbs sampling, or
Sequential Monte-Carlo sampling.
The second question is more difficult to answer and requires some knowledge about the sampling algorithm
and suitable diagnostics.
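For instance, if the posterior happened to be a known Gaussian from which we can sample directly, Monte Carlo integration boils down to averaging functions of the draws.
The following MATLAB sketch is purely illustrative (the posterior \(\mathcal{N}(1,0.5^2)\) and all numbers are made up):
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% illustrative Monte Carlo integration with direct sampling from a known posterior
S = 100000;                             % number of posterior draws
theta = 1 + 0.5*randn(S,1);             % S iid draws from the (here known) posterior N(1,0.5^2)
postMean = mean(theta);                 % approximates E[theta|y]
postVar  = mean(theta.^2) - postMean^2; % approximates V[theta|y]
postProb = mean(theta > 0);             % approximates E[f(theta)|y] for f the indicator of theta>0
\end{lstlisting}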
\end{enumerate}
39 changes: 39 additions & 0 deletions exercises/bayesian_estimation_multivariate_regression.tex
@@ -0,0 +1,39 @@
\section[Bayesian Estimation of Multivariate Linear Regression Model]{Bayesian Estimation of Multivariate Linear Regression Model\label{ex:BayesianEstimationMultivariateLinearRegressionModel}}
Consider a linear regression model with multiple regressors:
\begin{align*}
Y = X\beta + u
\end{align*}
with \(u \sim \mathcal{N}(0, \sigma^2 I)\).

\begin{enumerate}

\item Name the idea and general procedure for estimating this model with Bayesian methods.

\item Provide an expression for the likelihood function \(p(Y|\beta,\sigma^2)\).

\item Assume that \(\sigma^2\) is known and the prior distribution for \(\beta \)
is Gaussian with mean \(\beta_0\) and covariance matrix \(\Sigma_{0}\).
Derive an expression for the \textbf{conditional posterior distribution} \(p(\beta|\sigma^2,Y)\).

\item Assume that \(\beta \) is known and the prior distribution for the precision \(1/\sigma^2\) is Gamma
with shape parameter \(s_0\) and scale parameter \(v_0\).
Derive an expression for the \textbf{conditional posterior distribution} \(p(1/\sigma^2|\beta,Y)\).

\item Now assume that both \(\beta \) and \(\sigma^2\) are unknown.
Since we are able to draw directly from the \textbf{conditional posterior distributions} (direct sampling),
we can use the Gibbs sampling algorithm to get draws from the \textbf{joint posterior distribution} \(p(\beta,\sigma^2|Y)\).
Provide an overview of the basic steps and algorithm of the Gibbs sampling algorithm.
\end{enumerate}

\paragraph{Readings}
\begin{itemize}
\item \textcite[Ch. 7.1]{Greenberg_2008_IntroductionBayesianEconometrics}
\item \textcite[Ch. 3]{Koop_2003_BayesianEconometrics}
\end{itemize}

\begin{solution}\textbf{Solution to \nameref{ex:BayesianEstimationMultivariateLinearRegressionModel}}
\ifDisplaySolutions
\input{exercises/bayesian_estimation_multivariate_regression_solution.tex}
\fi
\newpage
\end{solution}
119 changes: 119 additions & 0 deletions exercises/bayesian_estimation_multivariate_regression_solution.tex
@@ -0,0 +1,119 @@
\begin{enumerate}

\item The parameter vector \([\beta,\sigma^2]'\) is a random variable with a probability distribution.
A Bayesian estimation of this distribution combines prior beliefs and information from the data:
\begin{enumerate}
\item Prior distribution \(p(\beta,\sigma^2)\)
\item Likelihood \(p(Y|\beta,\sigma^2)\)
\item Bayes' rule gives the joint posterior distribution
\end{enumerate}
Some useful relationships:
\begin{itemize}
\item joint posterior distribution of \(\beta \) and \(\sigma^2\):
\begin{align*}
p(\beta,\sigma^2|Y) = \frac{p(Y|\beta,\sigma^2) p(\beta,\sigma^2)}{p(Y)} \propto p(Y|\beta,\sigma^2) p(\beta,\sigma^2)
\end{align*}
\item marginal posterior distributions of \(\beta \) and \(\sigma^2\):
\begin{align*}
p(\beta|Y) &= \int_0^\infty p(\beta,\sigma^2|Y) d\sigma^2
\propto \int_0^\infty p(Y|\beta,\sigma^2) p(\beta,\sigma^2) d\sigma^2
\\
p(\sigma^2|Y) &= \int_{-\infty}^{\infty} p(\beta,\sigma^2|Y) d\beta
\propto \int_{-\infty}^{\infty} p(Y|\beta,\sigma^2) p(\beta,\sigma^2) d\beta
\end{align*}
\item conditional posterior distribution of \(\beta \) given \(\sigma^2\):
\begin{align*}
p(\beta|\sigma^2,Y) = \frac{p(\beta,\sigma^2|Y)}{p(\sigma^2|Y)}
\propto p(Y|\beta,\sigma^2) p(\beta,\sigma^2)
\end{align*}
Lastly, the following relationship is useful for simulations:
\begin{align*}
p(\beta,\sigma^2|Y) = p(\beta|\sigma^2,Y) p(\sigma^2|Y)
\end{align*}
\end{itemize}

\item Because \(u\) is normally distributed, \(Y\) is also Gaussian;
hence, we can derive the precise form of the likelihood function:
\begin{align*}
p(Y|\beta,\sigma^2) = \frac{1}{{(2\pi \sigma^2)}^{T/2}} e^{\left \{-\frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta)\right \}}
\end{align*}
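As an aside, the corresponding log-likelihood is straightforward to evaluate; the MATLAB sketch below uses illustrative variable names (\texttt{Y}, \texttt{X}, \texttt{beta}, \texttt{sig2}) that are not part of the exercise:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% Gaussian log-likelihood of the linear regression model (illustrative sketch)
% Y: T x 1, X: T x K, beta: K x 1, sig2: scalar error variance
logLik = @(Y,X,beta,sig2) -0.5*size(Y,1)*log(2*pi*sig2) ...
                          -0.5/sig2*((Y-X*beta)'*(Y-X*beta));
\end{lstlisting}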

\item First, assuming that \(\sigma^2\) is known and the prior for \(\beta \) is Gaussian,
we have \(p(\beta) \sim N(\beta_0,\Sigma_0)\);
that is, the prior density is given by:
\begin{align*}
p(\beta) &= {(2\pi)}^{-K/2} |\Sigma_0|^{-1/2} e^{\left \{ -\frac{1}{2} (\beta - \beta_0)' \Sigma_0^{-1} (\beta - \beta_0) \right \}}
\end{align*}
Note that the prior for \(\beta \) is independent of \(\sigma^2\),
so we can also write:
\begin{align*}
p(\beta) = p(\beta|\sigma^2)
\end{align*}
Second, conditional on \(\sigma^2\) the likelihood is proportional to:
\begin{align*}
p(Y|\beta,\sigma^2) \propto e^{\left \{-\frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta)\right \}}
\end{align*}
Third, combining prior and likelihood yields:
\begin{align*}
p(\beta|\sigma^2,Y) \propto e^{\left \{ -\frac{1}{2} (\beta-\beta_0)' \Sigma_0^{-1} (\beta-\beta_0) - \frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta) \right \}}
\end{align*}
One can show (see the readings) that this is a Gaussian distribution
\begin{align*}
p(\beta|\sigma^2,Y) \sim N(\beta_1,\Sigma_1)
\end{align*}
with
\begin{align*}
\beta_1 &= \left( \Sigma_0^{-1} + \sigma^{-2} (X'X) \right)^{-1} \left( \Sigma_0^{-1} \beta_0 + \sigma^{-2} (X'Y) \right)
\\
\Sigma_1 &= \left( \Sigma_0^{-1} + \sigma^{-2} (X'X) \right)^{-1}
\end{align*}
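These posterior moments map directly into code;
a possible MATLAB sketch (assuming \texttt{Y}, \texttt{X}, \texttt{sig2} and the prior hyperparameters \texttt{beta0}, \texttt{Sigma0} are already defined) is:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% draw from the conditional posterior p(beta|sigma^2,Y) = N(beta1,Sigma1)
Sigma1 = inv(inv(Sigma0) + (X'*X)/sig2);      % posterior covariance
beta1  = Sigma1*(Sigma0\beta0 + (X'*Y)/sig2); % posterior mean
betaDraw = mvnrnd(beta1',Sigma1)';            % mvnrnd expects a row-vector mean
\end{lstlisting}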

\item Assuming that \(\beta \) is known and the prior for \(1/\sigma^2\) is Gamma,
we have \(p(1/\sigma^2) = p(1/\sigma^2|\beta) \sim \Gamma(s_0,v_0)\);
that is, the prior density is given by:
\begin{align*}
p(1/\sigma^2|\beta) & \propto {\left(\frac{1}{\sigma^2} \right)}^{s_0-1} e^{\left \{ -\frac{1}{v_0\sigma^2} \right \}}
\end{align*}
Conditional on \(\beta \) the likelihood is proportional to:
\begin{align*}
p(Y|\sigma^2,\beta) \propto {(\sigma^2)}^{-T/2} e^{\left \{-\frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta)\right \}}
\end{align*}
Combining prior and likelihood yields (see the readings for the algebra):
\begin{align*}
p(1/\sigma^2 | \beta, Y) \sim \Gamma(s_1,v_1)
\end{align*}
where
\begin{align*}
s_1 &= s_0 + T
\\
v_1 &= v_0 + (Y-X\beta)'(Y-X\beta)
\end{align*}
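A corresponding sketch for the precision draw, using MATLAB's \texttt{gamrnd(shape,scale)} convention (so that \(v_1\) enters as \(1/v_1\), consistent with the hint in the quarterly-inflation exercise below); \texttt{s0}, \texttt{v0}, \texttt{T}, \texttt{Y}, \texttt{X} and \texttt{beta} are assumed to be defined:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% draw from the conditional posterior p(1/sigma^2|beta,Y) = Gamma(s1,v1)
resid = Y - X*beta;
s1 = s0 + T;                  % posterior shape parameter
v1 = v0 + resid'*resid;       % posterior inverse-scale parameter
precDraw = gamrnd(s1,1/v1);   % gamrnd takes (shape, scale), hence 1/v1
sig2Draw = 1/precDraw;        % implied variance draw
\end{lstlisting}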

\item In the previous exercises we have derived the conditional posteriors in closed-form.
When both \(\beta \) and \(\sigma^2\) are unknown,
we can specify the \textbf{joint prior} distribution for these parameters assuming a Gamma distribution for the marginal prior for \(1/\sigma^2\)
and a normal distribution for the conditional prior for \(\beta|1/\sigma^2\).
That is, the joint prior is then \(p(\beta, 1/\sigma^2) = p(\beta|1/\sigma^2) p(1/\sigma^2)\).
It can then be shown that the joint posterior density is:
\begin{align*}
p(\beta,1/\sigma^2|Y) = p(\beta|1/\sigma^2,Y) p(1/\sigma^2|Y)
\end{align*}
To make inference on \(\beta \), we need to know the marginal posterior
\begin{align*}
p(\beta|Y) = \int_0^\infty p(\beta,1/\sigma^2|Y) d(1/\sigma^2)
\end{align*}
This integration is very hard, but we can make use of a numerical Monte Carlo integration approach:
\textbf{Gibbs sampling}.

The idea of Gibbs sampling is to repeatedly sample from the conditional posterior distributions
to get an approximation of the marginal and joint posterior distributions of the parameters.

Basic steps of the Gibbs sampling algorithm (see the sketch after this list):
\begin{itemize}
\item Set the priors and an initial guess for \(\sigma^2\).
\item Sample \(\beta \) conditional on \(1/\sigma^2\).
\item Sample \(1/\sigma^2\) conditional on \(\beta \).
\item Repeat the two sampling steps a large number of times \(R\) and keep only the last \(L\) draws.
\item Use these \(L\) draws to make inference on \(\beta \) and \(\sigma^2\).
\end{itemize}
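Putting the two sampling steps together, a stripped-down version of the loop might look as follows.
This is only a sketch: the prior hyperparameters (\texttt{beta0}, \texttt{Sigma0}, \texttt{s0}, \texttt{v0}) and the data (\texttt{Y}, \texttt{X}) are assumed to be defined,
and the full program for the inflation application is given in \texttt{progs/matlab/BayesianQuarterlyInflation.m}.
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% minimal Gibbs sampler sketch for the linear regression model (illustrative)
[T,K] = size(X);
R = 50000; B = 40000;                        % total number of draws and burn-in (example values)
betaOLS = X\Y;                               % OLS estimates used for initialization
sig2 = (Y-X*betaOLS)'*(Y-X*betaOLS)/(T-K);   % initial guess for sigma^2
betaDraws = nan(R-B,K); sig2Draws = nan(R-B,1);
for j = 1:R
    % Gibbs step: draw beta conditional on 1/sigma^2
    Sigma1 = inv(inv(Sigma0) + (X'*X)/sig2);
    beta1  = Sigma1*(Sigma0\beta0 + (X'*Y)/sig2);
    beta   = mvnrnd(beta1',Sigma1)';
    % Gibbs step: draw 1/sigma^2 conditional on beta
    resid = Y - X*beta;
    sig2  = 1/gamrnd(s0+T,1/(v0+resid'*resid));
    if j > B                                 % keep only the draws after the burn-in phase
        betaDraws(j-B,:) = beta';
        sig2Draws(j-B)   = sig2;
    end
end
\end{lstlisting}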
\end{enumerate}
60 changes: 60 additions & 0 deletions exercises/bayesian_estimation_quarterly_inflation.tex
@@ -0,0 +1,60 @@
\section[Bayesian Estimation of Quarterly Inflation]{Bayesian Estimation of Quarterly Inflation\label{ex:BayesianEstimationQuarterlyInflation}}
Perform a Bayesian estimation using the Gibbs sampler of an autoregressive model with two lags of quarterly US inflation
\begin{align*}
y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + u_t = Y_{t-1} \theta + u_t
\end{align*}
where \(Y_{t-1}=(1,y_{t-1},y_{t-2})\), \(u_t\sim WN(0,\sigma_u^2)\)
and \(\theta = (c,\phi_1,\phi_2)'\).
To this end, assume a Gamma distribution for the marginal prior for the precision \(1/\sigma_u^2\)
and a normal distribution for the conditional prior for the coefficients \(\theta \) given \(1/\sigma_u^2\).
\begin{enumerate}
\item Load the dataset \texttt{QuarterlyInflation.csv}.
It contains a series for US quarterly inflation from 1947Q1 to 2012Q3.
Plot the data.
\item Create the matrix of regressors and the corresponding vector of endogenous variables for an AR(2) model with a constant.
\item Set the prior mean for the coefficients to a vector of zeros, \(\theta_0 = 0\),
and the prior covariance matrix to the identity matrix, \(\Sigma_{0}=I\).
\item Set the shape parameter for the variance parameter to \(s_0=1\)
and the scale parameter to \(v_0=0.1\).
\item Set the total number of Gibbs iterations to \(R=50000\) with a burn-in phase of \(B=40000\).
\item Initialize output matrices for the remaining \(R-B\) draws of the coefficient estimates and the variance estimate.
\item Initialize the first draw of \(1/\sigma_u^2\) to its OLS estimate.
\item For \(j=1,\ldots ,R\) do the following
\begin{enumerate}
\item Sample \(\theta(j)\) conditional on \(1/\sigma_u^2(j)\) from \(\mathcal{N}(\theta_1,\Sigma_{1})\) where
\begin{align*}
\Sigma_{1} &= {(\Sigma_{0}^{-1} +\sigma_u^{-2}(j)(X'X))}^{-1}
\\
\theta_1 &= \Sigma_{1} \cdot (\Sigma_{0}^{-1}\theta_0 + \sigma_u^{-2}(j) X'y)
\end{align*}
Optionally: check the stability of the draw to avoid an explosive AR process.
\item Sample \(1/\sigma_u^2(j)\) conditional on \(\theta(j)\) from the Gamma distribution \(G(s_1,v_1)\)
where
\begin{align*}
s_1 &= s_0 + T
\\
v_1 &= v_0 + \sum_{t=3}^T {(y_t-Y_{t-1}\theta(j))}^2
\end{align*}
\item If you passed the burn-in phase (\(j>B\)),
then save the draws of \(\theta(j)\) and \(\sigma_u^2(j)\) into the output matrices.
\end{enumerate}
\item Plot the histograms of the draws in your output matrices.
\end{enumerate}

\paragraph{Hints}
\begin{itemize}
\item Use \texttt{mvnrnd(theta1,Sigma1)} to draw from a multivariate normal distribution with mean \(\theta_1\) and covariance matrix \(\Sigma_1\).
\item Use \texttt{gamrnd(s1,1/v1,1,1)} to draw from a Gamma distribution with shape parameter \(s_1\) and scale parameter \(1/v_1\) (i.e.\ \(v_1\) enters as an inverse scale).
\end{itemize}
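A possible sketch of the data-handling steps 1 and 2 is given below; the import assumes that the last column of \texttt{QuarterlyInflation.csv} holds the inflation series and should be adjusted to the actual layout of the file:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% load quarterly US inflation and build the AR(2) regression (illustrative sketch)
tbl  = readtable('QuarterlyInflation.csv'); % adjust the path/layout as needed
infl = tbl{:,end};                          % assume the last column holds the inflation series
y = infl(3:end);                            % endogenous variable y_t, t = 3,...,T
X = [ones(size(y)) infl(2:end-1) infl(1:end-2)]; % regressors: constant, y_{t-1}, y_{t-2}
\end{lstlisting}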
\paragraph{Readings}
\begin{itemize}
\item \textcite{Chib.Greenberg_1994_BayesInferenceRegression}
\item \textcite[Ch.~10.1]{Greenberg_2008_IntroductionBayesianEconometrics}
\end{itemize}

\begin{solution}\textbf{Solution to \nameref{ex:BayesianEstimationQuarterlyInflation}}
\ifDisplaySolutions
\input{exercises/bayesian_estimation_quarterly_inflation_solution.tex}
\fi
\newpage
\end{solution}
@@ -0,0 +1 @@
\lstinputlisting[style=Matlab-editor,basicstyle=\mlttfamily,title=\lstname]{progs/matlab/BayesianQuarterlyInflation.m}