Contour plot levels corresponding to probability or std #3
Comments
Is this issue still open?

It was implemented in 79f5d02. I think we should add a mechanism to determine a reasonable number of samples dependent on the requested quantiles.

I agree for general distributions. For normal distributions, which are the most common case, one could discuss avoiding sampling and using the cdf directly.
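For a bivariate normal, the isovalue for a given coverage probability is indeed available in closed form, so no sampling is needed: the squared Mahalanobis radius is chi-square distributed with 2 degrees of freedom. A sketch of what that could look like (the function name is mine, not from the codebase):

```python
import math

def normal2d_isovalue(p, det_cov):
    """Density isovalue whose superlevel set of a 2D normal with
    covariance determinant det_cov encloses probability p.

    Derivation: with r2 the squared Mahalanobis radius, r2 ~ chi2(2),
    so P(r2 <= t) = 1 - exp(-t/2), hence t = -2*ln(1 - p).  Plugging t
    into the density gives (1 - p) / (2*pi*sqrt(det_cov)).
    """
    return (1.0 - p) / (2.0 * math.pi * math.sqrt(det_cov))

# Example: standard bivariate normal (det_cov = 1), 75% region
level = normal2d_isovalue(0.75, 1.0)  # = 0.25 / (2*pi), about 0.0398
```

The same chi-square argument generalizes to d dimensions with `chi2(d)`, but then the level no longer has such a compact closed form.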
Currently the levels of the contour plots that show pdfs are chosen automatically, but it is unclear what those levels correspond to in terms of probability.
I suggest that the isovalues (densities) of the isolines be selected so that they show highest density regions (see https://doi.org/10.2307/2684423). Then statements like "75% of samples fall into this region" can be made.
Here's a 1D illustration of highest density regions (HDR) that I stole from some stackoverflow thread.
To find the corresponding density values, a Monte Carlo approach was proposed here, which does the following:

- Let `X` be a random variable with pdf `f(x)`.
- Draw `n` samples `S` from `X`.
- Compute the densities `D = f(S)` of the samples.
- Sort `D` in descending order, so that `D_i` is the i-th largest density.
- Use `D_i` as an approximation for the isovalue corresponding to the `i/n`-quantile of `Y = f(X)`.

So for a region of "75% of samples are in here" we use `D_j` with `j = int(0.75 * n)`.
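The steps above can be sketched in plain Python; the helper names and the bivariate-normal example are mine, not from the codebase:

```python
import math
import random

def hdr_isovalue(pdf, sampler, p, n=10_000, seed=0):
    # Monte Carlo estimate of the density isovalue whose superlevel
    # set ("highest density region") contains fraction p of the mass.
    rng = random.Random(seed)
    # D = f(S): densities of n samples, sorted in descending order.
    densities = sorted((pdf(sampler(rng)) for _ in range(n)), reverse=True)
    j = min(int(p * n), n - 1)  # index of the p-quantile of Y = f(X)
    return densities[j]

# Example: standard bivariate normal, f(x, y) = exp(-(x^2 + y^2)/2) / (2*pi).
def pdf(xy):
    x, y = xy
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

def sampler(rng):
    return rng.gauss(0, 1), rng.gauss(0, 1)

level = hdr_isovalue(pdf, sampler, p=0.75)
# For this distribution the exact 75% isovalue is (1 - 0.75) / (2*pi),
# about 0.0398, so `level` should land close to that.
```

Drawing contours of the pdf at `level` then outlines the region containing roughly 75% of the samples.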
I'm proposing this approach because it works with any distribution that allows sampling and has a pdf (we need a pdf anyway to draw contours). However, it requires drawing many samples to get good estimates: for 99.7% (the 3-standard-deviation radius of a normal distribution), at least 1,000 samples are needed for even a bad estimate, and more like 10,000 for an okay-ish one. I'm not sure if this is the best we can do, but it's a simple algorithm.