
Commit

Added material for week 11
wmutschl committed Jan 23, 2024
1 parent 114695a commit 2ddd65d
Showing 9 changed files with 457 additions and 37 deletions.
27 changes: 27 additions & 0 deletions exercises/bayesian_estimation_basics.tex
@@ -0,0 +1,27 @@
\section[Bayesian Estimation Basics]{Bayesian Estimation Basics\label{ex:BayesianEstimationBasics}}
Consider a simple univariate model:
\begin{align*}
y_t = \mu + u_t
\end{align*}
with \(t = 1, 2,\ldots , T\) and \(u_t \sim \mathcal{N}(0,\sigma^2)\).
Assume that \(\sigma^2\) is known.
The objective of an econometrician is to estimate \(\mu\).
\begin{enumerate}
\item How do classical and Bayesian analysis differ?
\item Name the key ingredients for Bayesian estimation.
\item What are \enquote{conjugate priors} and \enquote{natural conjugate priors}?
\item What is the idea of Monte Carlo integration in the context of Bayesian estimation?
\end{enumerate}

\paragraph{Readings}
\begin{itemize}
\item \textcite[Part I]{Greenberg_2008_IntroductionBayesianEconometrics}
\item \textcite[Ch.1-2]{Koop_2003_BayesianEconometrics}
\end{itemize}

\begin{solution}\textbf{Solution to \nameref{ex:BayesianEstimationBasics}}
\ifDisplaySolutions
\input{exercises/bayesian_estimation_basics_solution.tex}
\fi
\newpage
\end{solution}
105 changes: 105 additions & 0 deletions exercises/bayesian_estimation_basics_solution.tex
@@ -0,0 +1,105 @@
\begin{enumerate}
\item In Quantitative Macroeconomics and Econometrics we are concerned with using data to learn about a phenomenon,
e.g.\ the relationship between two macroeconomic variables.
That is: we want to learn about something \emph{unknown} (the parameter \(\mu \)) given something \emph{known} (the data \(y_t\)).
Let's use the sample mean as our estimating function:
\(\hat{\mu}=1/T \sum_{t=1}^T y_t\).
Due to the law of large numbers and the central limit theorem we can derive that
\(\hat{\mu}\sim N(\mu,\frac{\sigma^2}{T})\)
and conduct inference such as computing confidence intervals \([\hat{\mu}\pm 1.96 \frac{\sigma}{\sqrt{T}}]\).
\\
\textbf{Classical/Frequentist approach:} \(\mu \) is a fixed unknown quantity; that is, we think there exists a \emph{true value} that is not random.
On the other hand, the estimating function, \(\hat{\mu}\), is a random variable
and is evaluated via repeated sampling.
In a thought experiment, we would be able to generate a large number of datasets (given the true \(\mu \))
and our confidence interval will contain the true value in 95\% of cases.
The estimator \(\hat{\mu}\) is \emph{best} in the sense of having the highest probability of being close to the true \(\mu \).
\\
\textbf{Bayesian approach:} \(\mu \) is treated as a \emph{random variable};
that is, there is NO true unknown value.
Instead our knowledge about the model parameter \(\mu \) is summarized by a \emph{probability distribution}.
In more detail, this distribution summarizes two sources of information:
\begin{enumerate}
\item prior information: subjective beliefs about how likely different parameter values are (information BEFORE seeing the data)
\item sample information: AFTER seeing the data, we update/revise our prior beliefs
\end{enumerate}
In a sense we explicitly make use of (subjective) probabilities to quantify uncertainty about the parameter.

\item The key ingredients are based on the rules of probability, which imply for two events \(A\) and \(B\):
\(p(A,B)=p(A|B)p(B)\), where \(p(A,B)\) is the joint probability of both events happening simultaneously.
\(p(A|B)\) is the probability of \(A\) occurring conditional on \(B\) having occurred;
and \(p(B)\) is the marginal probability of \(B\).
Alternatively, we can reverse \(A\) and \(B\) to get: \(p(A,B)=p(B|A)p(A)\).
Equating the two expressions gives you \textbf{Bayes' rule}:
\begin{align*}
p(B|A) = \frac{p(A|B)p(B)}{p(A)}
\end{align*}
This rule also holds for continuous variables such as parameters \(\theta \) and data \(y\):
\begin{align*}
p(\theta|y) = \frac{p(y|\theta)p(\theta)}{p(y)}
\end{align*}
That is, the key object of interest is the \textbf{posterior} \(p(\theta|y)\) distribution,
which is the product of the \textbf{likelihood function} \(p(y|\theta)\) and the \textbf{prior density} \(p(\theta)\),
divided by the \textbf{marginal data density} \(p(y)\).
In other words, the prior contains our prior (non-data) information,
whereas the likelihood function is the density of the data conditional on the parameters.
Note that the marginal data density \(p(y)\) can be ignored
as it does not depend on the parameters
(it is just a normalization constant as a probability density integrates to one).
Therefore, we can use the proportional \(\propto \) sign, that is the posterior is proportional to the likelihood times the prior:
\begin{align*}
p(\theta|y) \propto p(y|\theta) p(\theta)
\end{align*}
The posterior summarizes all we know about \(\theta \) after seeing the data.
It combines both data and non-data information.
The equation can be viewed as an updating rule,
where data allows us to update our prior views about \(\theta \).

Note that Bayesians are upfront and rigorous about including non-data information!
The idea is that more information (even if subjective) tends to be better than less.

\item In principle any distribution can be combined with the likelihood to form the posterior.
Some priors are, however, more convenient than others to make use of analytical results.
\\
\textbf{Conjugate priors:} A prior is conjugate if the resulting posterior belongs to the same family of distributions as the prior.
This eases analytical derivations.
\\
\textbf{Natural conjugate priors:} A conjugate prior is called a natural conjugate prior,
if the posterior and the prior have the same functional form as the likelihood function.
That is, the prior can be interpreted as arising from a fictitious dataset from the same data-generating process.
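For instance, in the simple univariate model above with known \(\sigma^2\), a Gaussian prior \(\mu \sim \mathcal{N}(\mu_0,\tau_0^2)\) (with hyperparameters \(\mu_0\) and \(\tau_0^2\) chosen by the researcher) is a natural conjugate prior: combining it with the Gaussian likelihood yields a Gaussian posterior
\begin{align*}
\mu|y \sim \mathcal{N}(\mu_1,\tau_1^2), \qquad
\tau_1^2 = {\left(\frac{1}{\tau_0^2} + \frac{T}{\sigma^2}\right)}^{-1}, \qquad
\mu_1 = \tau_1^2 \left(\frac{\mu_0}{\tau_0^2} + \frac{T\bar{y}}{\sigma^2}\right)
\end{align*}
where \(\bar{y}\) denotes the sample mean; the posterior mean is a precision-weighted average of the prior mean and the sample mean.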

\item The posterior is typically not available analytically and needs to be approximated,
except in special cases (e.g.\ when natural conjugate priors are used).
But, typically we are not interested in the exact shape of the posterior,
but in certain statistics of the posterior distribution such as:
\begin{align*}
E[\theta|y] &= \int_{-\infty}^{\infty} \theta p(\theta|y) d\theta
\\
V[\theta|y] &= \int_{-\infty}^{\infty} \theta^2 p(\theta|y) d\theta - (E(\theta|y))^2
\end{align*}
So we only need to approximate the integrals using numerical methods such as Monte Carlo integration.
That is, IF we had iid draws from the posterior, we could make use of the law of large numbers
and approximate the posterior mean and variance as:
\begin{align*}
E[\theta|y] &\approx \frac{1}{S} \sum_{i=1}^S \theta_i
\\
V[\theta|y] &\approx \frac{1}{S} \sum_{i=1}^S \theta_i^2 - {\left(\frac{1}{S} \sum_{i=1}^S \theta_i\right)}^2
\end{align*}
Or in general for any function:
\begin{align*}
E[f(\theta)|y] = \int_{-\infty}^{\infty} f(\theta) p(\theta|y) d\theta \approx \frac{1}{S} \sum_{s=1}^S f(\theta_s)
\end{align*}
This is the key idea of Monte Carlo integration,
i.e.\ replace the integral by a sum over \(S\) draws from the posterior.
The Central Limit Theorem can then be used to assess the accuracy of this approximation.
But there are two challenges:
\begin{enumerate}
\item How to draw from the posterior?
\item How to make sure that the draws are iid?
\end{enumerate}
The first question can be answered by using suitable \emph{posterior sampling algorithms}
such as direct sampling, importance sampling, Metropolis-Hastings sampling, Gibbs sampling, or
Sequential Monte-Carlo sampling.
The second question is more difficult to answer and requires some knowledge about the sampling algorithm
and suitable diagnostics.
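For instance, if the posterior happened to be a known Gaussian from which we can sample directly, Monte Carlo integration boils down to averaging functions of the draws.
The following MATLAB sketch is purely illustrative (the posterior \(\mathcal{N}(1,0.5^2)\) and all numbers are made up):
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% illustrative Monte Carlo integration with direct sampling from a known posterior
S = 100000;                             % number of posterior draws
theta = 1 + 0.5*randn(S,1);             % S iid draws from the (here known) posterior N(1,0.5^2)
postMean = mean(theta);                 % approximates E[theta|y]
postVar  = mean(theta.^2) - postMean^2; % approximates V[theta|y]
postProb = mean(theta > 0);             % approximates E[f(theta)|y] for f the indicator of theta>0
\end{lstlisting}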
\end{enumerate}
39 changes: 39 additions & 0 deletions exercises/bayesian_estimation_multivariate_regression.tex
@@ -0,0 +1,39 @@
\section[Bayesian Estimation of Multivariate Linear Regression Model]{Bayesian Estimation of Multivariate Linear Regression Model\label{ex:BayesianEstimationMultivariateLinearRegressionModel}}
Consider a linear regression model with multiple regressors:
\begin{align*}
Y = X\beta + u
\end{align*}
with \(u \sim \mathcal{N}(0, \sigma^2 I)\).

\begin{enumerate}

\item Name the idea and general procedure for estimating this model with Bayesian methods.

\item Provide an expression for the likelihood function \(p(Y|\beta,\sigma^2)\).

\item Assume that \(\sigma^2\) is known and the prior distribution for \(\beta \)
is Gaussian with mean \(\beta_0\) and covariance matrix \(\Sigma_{0}\).
Derive an expression for the \textbf{conditional posterior distribution} \(p(\beta|\sigma^2,Y)\).

\item Assume that \(\beta \) is known and the prior distribution for the precision \(1/\sigma^2\) is Gamma
with shape parameter \(s_0\) and scale parameter \(v_0\).
Derive an expression for the \textbf{conditional posterior distribution} \(p(1/\sigma^2|\beta,Y)\).

\item Now assume that both \(\beta \) and \(\sigma^2\) are unknown.
Since we are able to draw directly from the \textbf{conditional posterior distributions} (direct sampling),
we can use the Gibbs sampling algorithm to get draws from the \textbf{joint posterior distribution} \(p(\beta,\sigma^2|Y)\).
Provide an overview of the basic steps and algorithm of the Gibbs sampling algorithm.
\end{enumerate}

\paragraph{Readings}
\begin{itemize}
\item \textcite[Ch. 7.1]{Greenberg_2008_IntroductionBayesianEconometrics}
\item \textcite[Ch. 3]{Koop_2003_BayesianEconometrics}
\end{itemize}

\begin{solution}\textbf{Solution to \nameref{ex:BayesianEstimationMultivariateLinearRegressionModel}}
\ifDisplaySolutions
\input{exercises/bayesian_estimation_multivariate_regression_solution.tex}
\fi
\newpage
\end{solution}
119 changes: 119 additions & 0 deletions exercises/bayesian_estimation_multivariate_regression_solution.tex
@@ -0,0 +1,119 @@
\begin{enumerate}

\item The parameter vector \([\beta,\sigma^2]'\) is a random variable with a probability distribution.
A Bayesian estimation of this distribution combines prior beliefs and information from the data:
\begin{enumerate}
\item Prior distribution \(p(\beta,\sigma^2)\)
\item Likelihood \(p(Y|\beta,\sigma^2)\)
\item Bayes' rule gives the joint posterior distribution
\end{enumerate}
Some useful relationships:
\begin{itemize}
\item joint posterior distribution of \(\beta \) and \(\sigma^2\):
\begin{align*}
p(\beta,\sigma^2|Y) = \frac{p(Y|\beta,\sigma^2) p(\beta,\sigma^2)}{p(Y)} \propto p(Y|\beta,\sigma^2) p(\beta,\sigma^2)
\end{align*}
\item marginal posterior distributions of \(\beta \) and \(\sigma^2\):
\begin{align*}
p(\beta|Y) &= \int_0^\infty p(\beta,\sigma^2|Y) d\sigma^2
\propto \int_0^\infty p(Y|\beta,\sigma^2) p(\beta,\sigma^2) d\sigma^2
\\
p(\sigma^2|Y) &= \int_{-\infty}^{\infty} p(\beta,\sigma^2|Y) d\beta
\propto \int_{-\infty}^{\infty} p(Y|\beta,\sigma^2) p(\beta,\sigma^2) d\beta
\end{align*}
\item conditional posterior distribution of \(\beta \) given \(\sigma^2\):
\begin{align*}
p(\beta|\sigma^2,Y) = \frac{p(\beta,\sigma^2|Y)}{p(\sigma^2|Y)}
\propto p(Y|\beta,\sigma^2) p(\beta,\sigma^2)
\end{align*}
Lastly, the following relationship is useful for simulations:
\begin{align*}
p(\beta,\sigma^2|Y) = p(\beta|\sigma^2,Y) p(\sigma^2|Y)
\end{align*}
\end{itemize}

\item Because \(u\) is normally distributed, \(Y\) is also Gaussian;
hence, we can derive the precise form of the likelihood function:
\begin{align*}
p(Y|\beta,\sigma^2) = \frac{1}{{(2\pi \sigma^2)}^{T/2}} e^{\left \{-\frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta)\right \}}
\end{align*}
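As an aside, the corresponding log-likelihood is straightforward to evaluate; the MATLAB sketch below uses illustrative variable names (\texttt{Y}, \texttt{X}, \texttt{beta}, \texttt{sig2}) that are not part of the exercise:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% Gaussian log-likelihood of the linear regression model (illustrative sketch)
% Y: T x 1, X: T x K, beta: K x 1, sig2: scalar error variance
logLik = @(Y,X,beta,sig2) -0.5*size(Y,1)*log(2*pi*sig2) ...
                          -0.5/sig2*((Y-X*beta)'*(Y-X*beta));
\end{lstlisting}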

\item First, assuming that \(\sigma^2\) is known and the prior for \(\beta \) is Gaussian,
we have \(p(\beta) \sim N(\beta_0,\Sigma_0)\);
that is, the prior density is given by:
\begin{align*}
p(\beta) &= {(2\pi)}^{-K/2} |\Sigma_0|^{-1/2} e^{\left \{ -\frac{1}{2} (\beta - \beta_0)' \Sigma_0^{-1} (\beta - \beta_0) \right \}}
\end{align*}
Note that the prior for \(\beta \) is independent of \(\sigma^2\),
so we can also write:
\begin{align*}
p(\beta) = p(\beta|\sigma^2)
\end{align*}
Second, conditional on \(\sigma^2\) the likelihood is proportional to:
\begin{align*}
p(Y|\beta,\sigma^2) \propto e^{\left \{-\frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta)\right \}}
\end{align*}
Third, combining prior and likelihood yields:
\begin{align*}
p(\beta|\sigma^2,Y) \propto e^{\left \{ -\frac{1}{2} (\beta-\beta_0)' \Sigma_0^{-1} (\beta-\beta_0) - \frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta) \right \}}
\end{align*}
One can show (see the readings) that this is a Gaussian distribution
\begin{align*}
p(\beta|\sigma^2,Y) \sim N(\beta_1,\Sigma_1)
\end{align*}
with
\begin{align*}
\beta_1 &= \left( \Sigma_0^{-1} + \sigma^{-2} (X'X) \right)^{-1} \left( \Sigma_0^{-1} \beta_0 + \sigma^{-2} (X'Y) \right)
\\
\Sigma_1 &= \left( \Sigma_0^{-1} + \sigma^{-2} (X'X) \right)^{-1}
\end{align*}
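These posterior moments map directly into code;
a possible MATLAB sketch (assuming \texttt{Y}, \texttt{X}, \texttt{sig2} and the prior hyperparameters \texttt{beta0}, \texttt{Sigma0} are already defined) is:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% draw from the conditional posterior p(beta|sigma^2,Y) = N(beta1,Sigma1)
Sigma1 = inv(inv(Sigma0) + (X'*X)/sig2);      % posterior covariance
beta1  = Sigma1*(Sigma0\beta0 + (X'*Y)/sig2); % posterior mean
betaDraw = mvnrnd(beta1',Sigma1)';            % mvnrnd expects a row-vector mean
\end{lstlisting}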

\item Assuming that \(\beta \) is known and the prior for \(1/\sigma^2\) is Gamma,
we have \(p(1/\sigma^2) = p(1/\sigma^2|\beta) \sim \Gamma(s_0,v_0)\);
that is, the prior density is given by:
\begin{align*}
p(1/\sigma^2|\beta) & \propto {\left(\frac{1}{\sigma^2} \right)}^{s_0-1} e^{\left \{ -\frac{1}{v_0\sigma^2} \right \}}
\end{align*}
Conditional on \(\beta \) the likelihood is proportional to:
\begin{align*}
p(Y|\sigma^2,\beta) \propto {(\sigma^2)}^{-T/2} e^{\left \{-\frac{1}{2\sigma^2} (Y-X\beta)'(Y-X\beta)\right \}}
\end{align*}
Combining prior and likelihood yields (see the readings for the algebra):
\begin{align*}
p(1/\sigma^2 | \beta, Y) \sim \Gamma(s_1,v_1)
\end{align*}
where
\begin{align*}
s_1 &= s_0 + T
\\
v_1 &= v_0 + (Y-X\beta)'(Y-X\beta)
\end{align*}
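A corresponding sketch for the precision draw, using MATLAB's \texttt{gamrnd(shape,scale)} convention (so that \(v_1\) enters as \(1/v_1\), consistent with the hint in the quarterly-inflation exercise below); \texttt{s0}, \texttt{v0}, \texttt{T}, \texttt{Y}, \texttt{X} and \texttt{beta} are assumed to be defined:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% draw from the conditional posterior p(1/sigma^2|beta,Y) = Gamma(s1,v1)
resid = Y - X*beta;
s1 = s0 + T;                  % posterior shape parameter
v1 = v0 + resid'*resid;       % posterior inverse-scale parameter
precDraw = gamrnd(s1,1/v1);   % gamrnd takes (shape, scale), hence 1/v1
sig2Draw = 1/precDraw;        % implied variance draw
\end{lstlisting}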

\item In the previous exercises we have derived the conditional posteriors in closed-form.
When both \(\beta \) and \(\sigma^2\) are unknown,
we can specify the \textbf{joint prior} distribution for these parameters assuming a Gamma distribution for the marginal prior for \(1/\sigma^2\)
and a normal distribution for the conditional prior for \(\beta|1/\sigma^2\).
That is, the joint prior is then \(p(\beta, 1/\sigma^2) = p(\beta|1/\sigma^2) p(1/\sigma^2)\).
It can then be shown that the joint posterior density is:
\begin{align*}
p(\beta,1/\sigma^2|Y) = p(\beta|1/\sigma^2,Y) p(1/\sigma^2|Y)
\end{align*}
To make inference on \(\beta \), we need to know the marginal posterior
\begin{align*}
p(\beta|Y) = \int_0^\infty p(\beta,1/\sigma^2|Y) d(1/\sigma^2)
\end{align*}
This integration is very hard, but we can make use of a numerical Monte Carlo integration approach:
\textbf{Gibbs sampling}.

The idea of Gibbs sampling is to repeatedly sample from the conditional posterior distributions
to get an approximation of the marginal and joint posterior distributions of the parameters.

Basic steps of the Gibbs sampling algorithm (see the sketch after this list):
\begin{itemize}
\item Set the priors and an initial guess for \(\sigma^2\).
\item Sample \(\beta \) conditional on \(1/\sigma^2\).
\item Sample \(1/\sigma^2\) conditional on \(\beta \).
\item Repeat the two sampling steps a large number of times \(R\) and keep only the last \(L\) draws.
\item Use these \(L\) draws to make inference on \(\beta \) and \(\sigma^2\).
\end{itemize}
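Putting the two sampling steps together, a stripped-down version of the loop might look as follows.
This is only a sketch: the prior hyperparameters (\texttt{beta0}, \texttt{Sigma0}, \texttt{s0}, \texttt{v0}) and the data (\texttt{Y}, \texttt{X}) are assumed to be defined,
and the full program for the inflation application is given in \texttt{progs/matlab/BayesianQuarterlyInflation.m}.
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% minimal Gibbs sampler sketch for the linear regression model (illustrative)
[T,K] = size(X);
R = 50000; B = 40000;                        % total number of draws and burn-in (example values)
betaOLS = X\Y;                               % OLS estimates used for initialization
sig2 = (Y-X*betaOLS)'*(Y-X*betaOLS)/(T-K);   % initial guess for sigma^2
betaDraws = nan(R-B,K); sig2Draws = nan(R-B,1);
for j = 1:R
    % Gibbs step: draw beta conditional on 1/sigma^2
    Sigma1 = inv(inv(Sigma0) + (X'*X)/sig2);
    beta1  = Sigma1*(Sigma0\beta0 + (X'*Y)/sig2);
    beta   = mvnrnd(beta1',Sigma1)';
    % Gibbs step: draw 1/sigma^2 conditional on beta
    resid = Y - X*beta;
    sig2  = 1/gamrnd(s0+T,1/(v0+resid'*resid));
    if j > B                                 % keep only the draws after the burn-in phase
        betaDraws(j-B,:) = beta';
        sig2Draws(j-B)   = sig2;
    end
end
\end{lstlisting}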
\end{enumerate}
60 changes: 60 additions & 0 deletions exercises/bayesian_estimation_quarterly_inflation.tex
@@ -0,0 +1,60 @@
\section[Bayesian Estimation of Quarterly Inflation]{Bayesian Estimation of Quarterly Inflation\label{ex:BayesianEstimationQuarterlyInflation}}
Perform a Bayesian estimation using the Gibbs sampler of an autoregressive model with two lags of quarterly US inflation
\begin{align*}
y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + u_t = Y_{t-1} \theta + u_t
\end{align*}
where \(Y_{t-1}=(1,y_{t-1},y_{t-2})\), \(u_t\sim WN(0,\sigma_u^2)\)
and \(\theta = (c,\phi_1,\phi_2)'\).
To this end, assume a Gamma distribution for the marginal prior for the precision \(1/\sigma_u^2\)
and a normal distribution for the conditional prior for the coefficients \(\theta \) given \(1/\sigma_u^2\).
\begin{enumerate}
\item Load the dataset \texttt{QuarterlyInflation.csv}.
It contains a series for US quarterly inflation from 1947Q1 to 2012Q3.
Plot the data.
\item Create the matrix of regressors and the corresponding vector of endogenous variables for an AR(2) model with a constant.
\item Set the prior mean for the coefficients to a vector of zeros, \(\theta_0 = 0\),
and the prior covariance matrix to the identity matrix, \(\Sigma_{0}=I\).
\item Set the shape parameter for the variance parameter to \(s_0=1\)
and the scale parameter to \(v_0=0.1\).
\item Set the total number of Gibbs iterations to \(R=50000\) with a burn-in phase of \(B=40000\).
\item Initialize output matrices for the remaining \(R-B\) draws of the coefficient estimates and the variance estimate.
\item Initialize the first draw of \(1/\sigma_u^2\) to its OLS estimate.
\item For \(j=1,\ldots ,R\) do the following
\begin{enumerate}
\item Sample \(\theta(j)\) conditional on \(1/\sigma_u^2(j)\) from \(\mathcal{N}(\theta_1,\Sigma_{1})\) where
\begin{align*}
\Sigma_{1} &= {(\Sigma_{0}^{-1} +\sigma_u^{-2}(j)(X'X))}^{-1}
\\
\theta_1 &= \Sigma_{1} \cdot (\Sigma_{0}^{-1}\theta_0 + \sigma_u^{-2}(j) X'y)
\end{align*}
Optionally: check the stability of the draw to avoid an explosive AR process.
\item Sample \(1/\sigma_u^2(j)\) conditional on \(\theta(j)\) from the Gamma distribution \(G(s_1,v_1)\)
where
\begin{align*}
s_1 &= s_0 + T
\\
v_1 &= v_0 + \sum_{t=3}^T {(y_t-Y_{t-1}\theta(j))}^2
\end{align*}
\item If you passed the burn-in phase (\(j>B\)),
then save the draws of \(\theta(j)\) and \(\sigma_u^2(j)\) into the output matrices.
\end{enumerate}
\item Plot the histograms of the draws in your output matrices.
\end{enumerate}

\paragraph{Hints}
\begin{itemize}
\item Use \texttt{mvnrnd(theta1,Sigma1)} to draw from a multivariate normal distribution with mean \(\theta_1\) and covariance matrix \(\Sigma_1\).
\item Use \texttt{gamrnd(s1,1/v1,1,1)} to draw from a Gamma distribution with shape parameter \(s_1\) and scale parameter \(1/v_1\) (i.e.\ \(v_1\) enters as an inverse scale).
\end{itemize}
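A possible sketch of the data-handling steps 1 and 2 is given below; the import assumes that the last column of \texttt{QuarterlyInflation.csv} holds the inflation series and should be adjusted to the actual layout of the file:
\begin{lstlisting}[style=Matlab-editor,basicstyle=\mlttfamily]
% load quarterly US inflation and build the AR(2) regression (illustrative sketch)
tbl  = readtable('QuarterlyInflation.csv'); % adjust the path/layout as needed
infl = tbl{:,end};                          % assume the last column holds the inflation series
y = infl(3:end);                            % endogenous variable y_t, t = 3,...,T
X = [ones(size(y)) infl(2:end-1) infl(1:end-2)]; % regressors: constant, y_{t-1}, y_{t-2}
\end{lstlisting}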
\paragraph{Readings}
\begin{itemize}
\item \textcite{Chib.Greenberg_1994_BayesInferenceRegression}
\item \textcite[Ch.~10.1]{Greenberg_2008_IntroductionBayesianEconometrics}
\end{itemize}

\begin{solution}\textbf{Solution to \nameref{ex:BayesianEstimationQuarterlyInflation}}
\ifDisplaySolutions
\input{exercises/bayesian_estimation_quarterly_inflation_solution.tex}
\fi
\newpage
\end{solution}
@@ -0,0 +1 @@
\lstinputlisting[style=Matlab-editor,basicstyle=\mlttfamily,title=\lstname]{progs/matlab/BayesianQuarterlyInflation.m}