Merge pull request #22 from smotherh/development
renamed evidence estimation code, added explanation of why the code w…
  • Loading branch information
ivezic authored Mar 16, 2018
2 parents 88c1b88 + e8be67b commit 895cd71
Showing 16 changed files with 2,347 additions and 10 deletions.
49 changes: 39 additions & 10 deletions homeworks/group2/HW_3/HW3_group2.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -99,10 +99,15 @@
" logp[i] = -M.func(traces[:, i])\n",
" return logp\n",
"\n",
"# This function estimates the bayes factor given an MCMC chain.\n",
"# This function estimates the log of evidence given an MCMC chain.\n",
"# This is used to calculate the odds ratio between two models\n",
"def estimate_bayes_factor(traces, logp, r=0.02, return_list=False):\n",
" \"\"\"Estimate the bayes factor using the local density of points\"\"\"\n",
"def estimate_log_evidence(traces, logp, r=0.02, return_list=False):\n",
" \"\"\"Estimate the log of the evidence, \n",
" using the local density of points. \n",
" The code is borrowed from the AstroML source\n",
" code of Fig.5.24, which in turn is based on \n",
" eq. 5.127 in Ivezic+2014\n",
" \"\"\"\n",
" D, N = traces.shape\n",
"\n",
" # compute volume of a D-dimensional sphere of radius r\n",
Expand All @@ -112,15 +117,36 @@
" bt = BallTree(traces.T)\n",
" count = bt.query_radius(traces.T, r=r, count_only=True)\n",
"\n",
" BF = logp + np.log(N) + np.log(Vr) - np.log(count)\n",
" logE= logp + np.log(N) + np.log(Vr) - np.log(count)\n",
"\n",
" if return_list:\n",
" return BF\n",
" return logE\n",
" else:\n",
" p25, p50, p75 = np.percentile(BF, [25, 50, 75])\n",
" p25, p50, p75 = np.percentile(logE, [25, 50, 75])\n",
" return p50, 0.7413 * (p75 - p25)\n"
]
},
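The comment in the cell above notes that the log-evidence is used to compute the odds ratio between two models. A minimal sketch of that last step, assuming equal prior model probabilities (so the odds ratio equals the Bayes factor); `logE1` and `logE2` are hypothetical log-evidence values as returned by the function above:

```python
import numpy as np

def odds_ratio(logE1, logE2):
    """Bayes factor O_12 = E1 / E2 computed from log-evidences.
    Equal prior model probabilities are assumed, so the odds
    ratio between the models equals the Bayes factor."""
    return np.exp(logE1 - logE2)
```

Working in log space avoids overflow for strongly peaked posteriors; a log-evidence difference of 5 already corresponds to odds of roughly 150 to 1.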
{
"cell_type": "markdown",
"metadata": {},
"source": [
    "**Explanation: why the above estimate of the evidence works.**\n",
" \n",
    "We choose between two models, M1 and M2, by sampling the posterior with the MCMC method. The traces returned by MCMC provide an optimal sampling of the posterior for the parameters $\\theta$ (in our case, each model is parametrized by four parameters: {$b_{0}$, $A$, $T$, $\\sigma$} for the Gaussian model M1, and {$b_{0}$, $A$, $T$, $\\alpha$} for the exponential decay model M2). As eqs. 5.124-5.127 in Ivezic+2014 show, the Bayesian evidence can be estimated as\n",
"\n",
    "$ \\mathrm{evidence} \\equiv L(M) = \\frac{N p(\\theta)}{\\rho(\\theta)},$\n",
    "\n",
    "where $p(\\theta)$ is the (unnormalized) posterior density at a sample point, $\\rho(\\theta)$ is the local density of MCMC samples there, and $N$ is the number of samples.\n",
"\n",
    "Taking the log:\n",
"\n",
    "$\\log(L) = \\log( N p(\\theta) / \\rho(\\theta) ) = \\log(N p) - \\log(\\rho)$\n",
"\n",
    "Now $\\rho = \\mathrm{counts} / \\mathrm{volume}$, where the counts come from the BallTree algorithm, which finds the number of MCMC samples inside a hypersphere of radius $r$ around each sample, and the volume is that of the corresponding $D$-dimensional sphere. The radius is problem-dependent, but in this case $r=0.02$ proved sufficient. Thus:\n",
"\n",
"$\\log(L) = \\log(N) + \\log(p) - \\log(\\mathrm{counts}) + \\log(\\mathrm{volume})$.\n",
"\n",
    "Other ways to estimate the posterior density in the multidimensional parameter space include kernel density estimation methods (see Ivezic+2014, Sec. 6.1.1). For further literature on estimating the density of a posterior sampled with MCMC, see [Morey+2011](https://www.sciencedirect.com/science/article/pii/S0022249611000666), [Sharma+2017](https://arxiv.org/pdf/1706.01629.pdf), and [Weinberg, M.D. 2010](https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1107&context=astro_faculty_pubs). "
]
},
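The estimator described in the cell above can be collected into a standalone sketch. It assumes `traces` has shape `(D, N)` and `logp` holds the log-posterior at each sample; a brute-force pairwise neighbor count stands in for the notebook's sklearn BallTree to keep the sketch self-contained (the result is the same, only slower):

```python
import numpy as np
from scipy.special import gamma

def estimate_log_evidence(traces, logp, r=0.02, return_list=False):
    """Estimate log(evidence) from MCMC samples via the local
    density of points (eq. 5.127 in Ivezic+2014)."""
    D, N = traces.shape

    # volume of a D-dimensional sphere of radius r
    Vr = np.pi ** (0.5 * D) / gamma(0.5 * D + 1) * r ** D

    # count samples within radius r of each sample (self included),
    # i.e. rho = count / Vr; the notebook uses a BallTree for this query
    pts = traces.T
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    count = (d2 <= r ** 2).sum(axis=1)

    # log E = log p + log N + log Vr - log count
    logE = logp + np.log(N) + np.log(Vr) - np.log(count)

    if return_list:
        return logE
    # median of the per-sample estimates, with an IQR-based scatter
    # (0.7413 * IQR approximates one sigma for a Gaussian)
    p25, p50, p75 = np.percentile(logE, [25, 50, 75])
    return p50, 0.7413 * (p75 - p25)
```

The per-sample estimates should all approximate the same evidence, so their median is reported along with a robust width as a sanity check; the brute-force count costs O(N^2) memory, which is why the notebook's BallTree version is preferable for long chains.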
{
"cell_type": "markdown",
"metadata": {},
Expand All @@ -131,7 +157,9 @@
{
"cell_type": "code",
"execution_count": 62,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def Run_Gaussian_Exponential_Comparison(N,b0_true=10,A_true=3.0,sigma_true=3.0,T_true=40):\n",
Expand Down Expand Up @@ -202,7 +230,7 @@
" # Save the traces and best-fit values for the Gaussian profile\n",
" Gaussian_traces, Gaussian_fit_vals, Gaussian_logp = compute_Gaussian_MCMC_results()\n",
" \n",
" Gaussian_Bayes_Factor, dGBF = estimate_bayes_factor(np.array(Gaussian_traces), Gaussian_logp, r=0.05)\n",
" Gaussian_Bayes_Factor, dGBF = estimate_log_evidence(np.array(Gaussian_traces), Gaussian_logp, r=0.05)\n",
"\n",
" \n",
" #========================================================================================\n",
Expand Down Expand Up @@ -247,7 +275,8 @@
" # Save the traces and best-fit values for the exponential profile\n",
" Exponential_traces, Exponential_fit_vals, Exponential_logp = compute_Exponential_MCMC_results()\n",
" \n",
" Exponential_Bayes_Factor, dEBF = estimate_bayes_factor(np.array(Exponential_traces), Exponential_logp,r=0.05)\n",
" Exponential_Bayes_Factor, dEBF = estimate_log_evidence(np.array(Exponential_traces), \n",
" Exponential_logp,r=0.05)\n",
"\n",
" # Now we return all values\n",
" return(t_obs,y_obs,err_y,Gaussian_traces,\n",
Expand Down Expand Up @@ -695,7 +724,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
Expand Down