Add uncertainty to distribution parameters #58

Open
joshwlambert opened this issue Nov 8, 2022 · 2 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@joshwlambert
Member

The parameters currently stored in the epiparameter library are point estimates without any uncertainty. I propose we add the boot R package as a dependency to calculate confidence intervals (CIs) for delay distribution parameters and summary statistics. This can be done with the boot.ci() function in the boot package. Bootstrapping is flexible in the type of CI it can produce, and it supports both parametric and nonparametric bootstrap CIs. That should accommodate both cases: when we have the reported distribution and its parameters (parametric), and when we have the raw data from which to estimate the parameters (nonparametric).
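As a rough illustration of the parametric case, the sketch below simulates datasets of the reported sample size from a reported distribution, recomputes the summary statistic on each, and takes percentiles of the results as the CI. It is written in Python with only the standard library rather than with boot, so it shows the idea and not the boot.ci() interface. The function name `parametric_bootstrap_ci`, the gamma parameters and the sample size are all hypothetical.

```python
import random
import statistics


def parametric_bootstrap_ci(sampler, statistic, n, n_boot=2000, conf=0.95, seed=1):
    """Percentile CI for `statistic` from a parametric bootstrap.

    sampler:   function taking a random.Random and returning one draw
               from the reported (assumed) distribution
    statistic: function mapping a list of draws to a single number
    n:         sample size reported by the study
    """
    rng = random.Random(seed)
    # Simulate n_boot datasets of size n and recompute the statistic on each.
    estimates = sorted(
        statistic([sampler(rng) for _ in range(n)]) for _ in range(n_boot)
    )
    alpha = 1 - conf
    # Simple percentile interval: cut off alpha/2 of the estimates in each tail.
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical reported values: gamma(shape=2, scale=3) fitted to 50 cases.
shape, scale, n = 2.0, 3.0, 50
ci_lower, ci_upper = parametric_bootstrap_ci(
    lambda rng: rng.gammavariate(shape, scale),  # gammavariate(shape, scale)
    statistics.fmean,
    n,
)
print(f"95% CI for the mean: ({ci_lower:.2f}, {ci_upper:.2f})")
```

Swapping `statistics.fmean` for `statistics.median` or another function gives a CI for a different summary statistic without changing the rest of the sketch.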

Another package that could be used is bootstrap.

Alternatively, the CIs could be calculated with a different method, such as a normal approximation.
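For comparison, a normal-approximation CI for a mean needs only the reported mean, standard deviation and sample size. The sketch below is in Python with the standard library. The function name `normal_approx_ci` and the example values are hypothetical, and the approximation can be poor for small samples from skewed delay distributions.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_ci(mean, sd, n, conf=0.95):
    """CI for a mean as mean +/- z * sd / sqrt(n)."""
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical reported values: gamma(shape=2, scale=3) fitted to 50 cases,
# so mean = shape * scale and sd = sqrt(shape) * scale.
shape, scale, n = 2.0, 3.0, 50
norm_lower, norm_upper = normal_approx_ci(shape * scale, sqrt(shape) * scale, n)
print(f"95% CI for the mean: ({norm_lower:.2f}, {norm_upper:.2f})")
```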

@joshwlambert joshwlambert added enhancement New feature or request help wanted Extra attention is needed labels Nov 8, 2022
@sbfnk
Contributor

sbfnk commented Nov 10, 2022

I agree that measures of uncertainty would be important to add. Would your idea be to 1) store the raw data and construct summary statistics and CIs from them, or 2) take them directly from the cited sources? For option (1) I think at the least we'd need to store the sample size alongside the estimates.

@joshwlambert
Member Author

For now we only have the parameters or summary statistics reported in the papers, and we do not have any raw data from which to infer the parameters or CIs ourselves. Papers usually report the sample size, which is indeed needed for bootstrapping the CIs.

I've added a basic function that calculates the confidence intervals of a summary statistic (https://github.com/epiverse-trace/epiparameter/blob/feature/ci/R/calc_ci.R). However, this raises a new problem (I can open a separate issue if that helps organise things).

The function assumes the distribution is either gamma, lognormal or Weibull. However, in many studies that report summary statistics or distribution parameters, the distribution is right-truncated. This paper gives a clear example of the differences between the standard distribution and its truncated form. CIs calculated for these values with the bootstrapping function will be biased because it does not account for the right truncation. Any ideas on how to solve this issue?
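To make the size of the problem concrete, the sketch below compares draws from a gamma distribution with draws from the same distribution restricted to values at or below a fixed upper bound, using rejection sampling. It is Python with the standard library only. The helper `sample_right_truncated`, the parameters and the bound are hypothetical, and a single fixed bound is a simplification: in outbreak data the truncation point usually differs per case. One possible direction, offered as an assumption rather than a tested fix, is to have the bootstrap simulate from the truncated form when a study reports truncation.

```python
import random
import statistics


def sample_right_truncated(sampler, upper, n, rng):
    """Draw n values from `sampler`, keeping only those <= upper."""
    draws = []
    while len(draws) < n:
        x = sampler(rng)
        if x <= upper:  # reject anything beyond the truncation point
            draws.append(x)
    return draws


rng = random.Random(1)
shape, scale, upper, n = 2.0, 3.0, 10.0, 5000  # hypothetical values


def gamma_draw(r):
    return r.gammavariate(shape, scale)  # gammavariate(shape, scale)


untruncated = [gamma_draw(rng) for _ in range(n)]
truncated = sample_right_truncated(gamma_draw, upper, n, rng)

untruncated_mean = statistics.fmean(untruncated)
truncated_mean = statistics.fmean(truncated)
print(f"mean without truncation: {untruncated_mean:.2f}")
print(f"mean with right truncation at {upper}: {truncated_mean:.2f}")
```

With these values the truncated mean comes out well below the untruncated one, so a CI bootstrapped from the untruncated form would be centred in the wrong place for a truncated estimate.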
