diff --git a/config.yaml b/config.yaml index cab81e68..615e9bfb 100644 --- a/config.yaml +++ b/config.yaml @@ -64,6 +64,7 @@ episodes: #- describe-cases.Rmd #- simple-analysis.Rmd - delays-reuse.Rmd +- quantify-transmissibility.Rmd - delays-functions.Rmd # Information for Learners diff --git a/delays-functions.md b/delays-functions.md index 2e4e5143..f021a8b7 100644 --- a/delays-functions.md +++ b/delays-functions.md @@ -1,5 +1,5 @@ --- -title: 'Use delay distributions in analysis' +title: 'Input delay data' teaching: 10 exercises: 2 editor_options: @@ -63,19 +63,44 @@ covid_serialint <- ) ``` -Now, we have an epidemiological parameter we can use in our analysis! In the chunk below we replaced one of the **summary statistics** inputs into `EpiNow2::dist_spec()` +```{.output} +Using Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel +coronavirus (COVID-19) infections." _International Journal of +Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 +.. +To retrieve the short citation use the 'get_citation' function +``` ```r -generation_time <- - EpiNow2::dist_spec( - mean = covid_serialint$summary_stats$mean, # we changed this line :) - sd = 2, - max = 20, - distribution = "gamma" - ) +covid_serialint ``` -In this episode, we will use the **distribution functions** that `{epiparameter}` provides to get a maximum value (`max`) for this and any other package downstream in your analysis pipeline! +```{.output} +Disease: COVID-19 +Pathogen: SARS-CoV-2 +Epi Distribution: serial interval +Study: Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel +coronavirus (COVID-19) infections." _International Journal of +Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 +. +Distribution: lnorm +Parameters: + meanlog: 1.386 + sdlog: 0.568 +``` + +Now, we have an epidemiological parameter we can reuse! 
We can now pass two of the three **summary statistics** to `EpiNow2::dist_spec()`: + +```r +generation_time <- dist_spec( + mean = covid_serialint$summary_stats$mean, + sd = covid_serialint$summary_stats$sd, + max = 20, + distribution = "gamma" +) +``` + +In this episode, we will use the **distribution functions** that `{epiparameter}` provides to get a `max` value for this and any other package downstream in the pipeline! Let's load the `{epiparameter}` and `{EpiNow2}` package. For `{EpiNow2}`, we'll set 4 cores to be used in parallel computations. We'll use the pipe `%>%`, some `{dplyr}` verbs and `{ggplot2}`, so let's also call to the `{tidyverse}` package: @@ -150,8 +175,8 @@ generate(covid_serialint, times = 10) ``` ```{.output} - [1] 4.436411 4.079876 7.023633 16.692691 4.443053 2.929674 5.768319 - [8] 4.648662 6.260220 6.995987 + [1] 4.946173 1.848653 4.558656 4.051608 8.892126 3.296900 4.339645 5.280796 + [9] 6.350389 5.440831 ``` ::::::::: instructor @@ -268,7 +293,7 @@ Parameters: sdlog: 0.568 ``` -We identify this change in the `Distribution:` output line of the `` object. Double check this line: +We identify this change in the `Distribution:` output line of the `` object. Double-check this line: ``` Distribution: discrete lnorm @@ -410,24 +435,40 @@ quantile(covid_serialint_discrete, p = 0.999) %>% ::::::::::::::::::::::::::::::::::::::::::: +:::::::::::::::::::::::::::::: callout -## Plug-in `{epiparameter}` to `{EpiNow2}` - -Now we can plug everything into the `EpiNow2::dist_spec()` function! - -- the **summary statistics** `mean` and `sd` of the distribution, -- a maximum value `max`, -- the `distribution` name. 
+### Log normal distributions -But, before, in `EpiNow2::dist_spec()` for a **Lognormal** distribution we need the *distribution parameters* instead of the summary statistics: +If you need the log normal **distribution parameters** instead of the summary statistics, we can use `epiparameter::get_parameters()`: ```r covid_serialint_parameters <- epiparameter::get_parameters(covid_serialint) + +covid_serialint_parameters +``` + +```{.output} + meanlog sdlog +1.3862617 0.5679803 ``` -Then, we have: +This gets a vector of class `` ready to use as input for any other package! + +**BONUS TIP:** If we write the `[]` next to the last object created, as in `covid_serialint_parameters[]`, within `[]` we can use the +Tab key +to use the [code completion feature](https://support.posit.co/hc/en-us/articles/205273297-Code-Completion-in-the-RStudio-IDE) and have quick access to `covid_serialint_parameters["meanlog"]` and `covid_serialint_parameters["sdlog"]`. We invite you to try this out in code chunks and the R console! + +:::::::::::::::::::::::::::::: + +## Plug-in `{epiparameter}` to `{EpiNow2}` + +Now we can plug everything into the `EpiNow2::dist_spec()` function! + +- the **summary statistics** `mean` and `sd` of the distribution, +- a maximum value `max`, +- the `distribution` name. ```r @@ -447,53 +488,119 @@ serial_interval_covid Fixed distribution with PMF [0.0073 0.1 0.2 0.19 0.15 0.11 0.075 0.051 0.035 0.023 0.016 0.011 0.0076 0.0053 0.0037 0.0027 0.0019 0.0014 0.001 0.00074 0.00055 0.00041 0.00031] ``` -:::::::::::::::::::::::::::::: callout - -### A code completion tip - -If we write the `[]` next to the object `covid_serialint_parameters[]`, within `[]` we can use the -Tab key -for [code completion feature](https://support.posit.co/hc/en-us/articles/205273297-Code-Completion-in-the-RStudio-IDE) -This gives quick access to `covid_serialint_parameters["meanlog"]` and `covid_serialint_parameters["sdlog"]`. 
+### Warning -We invite you to try this out in code chunks and the R console! +Using the serial interval instead of the generation time is an alternative that can propagate bias in your estimates, even more so in diseases with reported pre-symptomatic transmission. ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)) -:::::::::::::::::::::::::::::: +:::::::::::::::::: Let's replace the `generation_time` input we used for `EpiNow2::epinow()`. ```r -epinow_estimates_cg <- epinow( +epinow_estimates <- epinow( # cases reported_cases = example_confirmed[1:60], # delays generation_time = generation_time_opts(serial_interval_covid) ) + +base::plot(epinow_estimates) +``` + +::::::::::::::::::::::::::::::::: challenge + +### Ebola's effective reproduction number + +Download and read the [Ebola dataset](data/ebola_cases.csv): + +- Reuse one epidemiological parameter to estimate the effective reproduction number for the Ebola dataset. +- Why did you choose that parameter? 
+ +::::::::::::::::: hint + +To calculate the $R_t$, we need: + +- data set with confirmed cases per day and +- one key delay distribution + +Key functions we applied in this episode are: + +- `epidist_db()` +- `list_distributions()` +- `discretise()` +- probability functions for continuous and discrete distributions + +:::::::::::::::::::::: + +::::::::::::::::: solution + + + + +```r +# read data +# e.g.: if path to file is data/raw-data/ebola_cases.csv then: +ebola_confirmed <- + read_csv(here::here("data", "raw-data", "ebola_cases.csv")) + +# list distributions +epidist_db(disease = "ebola") %>% + list_distributions() +``` + + +```r +# subset one distribution +ebola_serial <- epidist_db( + disease = "ebola", + epi_dist = "serial", + single_epidist = TRUE +) + +# adapt epiparameter to epinow2 +ebola_serial_discrete <- discretise(ebola_serial) + +ebola_serial_discrete_max <- quantile(ebola_serial_discrete, p = 0.999) + +serial_interval_ebola <- + dist_spec( + mean = ebola_serial$summary_stats$mean, + sd = ebola_serial$summary_stats$sd, + max = ebola_serial_discrete_max, + distribution = "gamma" # don't forget! it's a must! + ) + +# run epinow +epinow_estimates <- epinow( + # cases + reported_cases = ebola_confirmed, + # delays + generation_time = generation_time_opts(serial_interval_ebola) +) ``` ```{.output} -WARN [2024-04-02 21:04:37] epinow: There were 3 divergent transitions after warmup. See +WARN [2024-03-28 20:46:51] epinow: There were 8 divergent transitions after warmup. See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. 
- -WARN [2024-04-02 21:04:37] epinow: Examine the pairs() plot to diagnose sampling problems +WARN [2024-03-28 20:46:51] epinow: Examine the pairs() plot to diagnose sampling problems - ``` ```r -base::plot(epinow_estimates_cg) +plot(epinow_estimates) ``` - + -:::::::::: callout - -### Warning +`{EpiNow2}` can also include the uncertainty around each summary statistic. We invite you to read this discussion on: [How to adapt `{epiparameter}` uncertainty entries to `{EpiNow2}`](https://github.com/epiverse-trace/epiparameter/discussions/218)? -Using the serial interval instead of the generation time is an alternative that can propagate bias in your estimates, even more so in diseases with reported pre-symptomatic transmission. ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)) +:::::::::::::::::::::::::: -:::::::::::::::::: +::::::::::::::::::::::::::::::::::::::::::: ## Adjusting for reporting delays @@ -536,8 +643,6 @@ epinow_estimates <- epinow( ```r -# generation time --------------------------------------------------------- - # get covid serial interval covid_serialint <- epiparameter::epidist_db( @@ -546,7 +651,17 @@ covid_serialint <- author = "Nishiura", single_epidist = TRUE ) +``` + +```{.output} +Using Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel +coronavirus (COVID-19) infections." _International Journal of +Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 +.. 
+To retrieve the short citation use the 'get_citation' function +``` +```r # adapt epidist to epinow2 covid_serialint_discrete_max <- covid_serialint %>% @@ -564,8 +679,6 @@ covid_serial_interval <- distribution = "lognormal" ) -# incubation time --------------------------------------------------------- - # get covid incubation period covid_incubation <- epiparameter::epidist_db( disease = "covid", @@ -573,7 +686,19 @@ covid_incubation <- epiparameter::epidist_db( author = "Natalie", single_epidist = TRUE ) +``` +```{.output} +Using Linton N, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov A, Jung S, Yuan +B, Kinoshita R, Nishiura H (2020). "Incubation Period and Other +Epidemiological Characteristics of 2019 Novel Coronavirus Infections +with Right Truncation: A Statistical Analysis of Publicly Available +Case Data." _Journal of Clinical Medicine_. doi:10.3390/jcm9020538 +.. +To retrieve the short citation use the 'get_citation' function +``` + +```r # adapt epiparameter to epinow2 covid_incubation_discrete_max <- covid_incubation %>% @@ -591,10 +716,8 @@ covid_incubation_time <- distribution = "lognormal" # do not forget this! ) -# epinow ------------------------------------------------------------------ - # run epinow -epinow_estimates_cgi <- epinow( +epinow_estimates <- epinow( # cases reported_cases = example_confirmed[1:60], # delays @@ -604,66 +727,68 @@ epinow_estimates_cgi <- epinow( ``` ```{.output} -WARN [2024-04-02 21:06:28] epinow: There were 9 divergent transitions after warmup. See +Logging threshold set at INFO for the EpiNow2 logger +``` + +```{.output} +Writing EpiNow2 logs to the console and: /tmp/Rtmp0E3eaa/regional-epinow/2020-04-21.log +``` + +```{.output} +Logging threshold set at INFO for the EpiNow2.epinow logger +``` + +```{.output} +Writing EpiNow2.epinow logs to the console and: /tmp/Rtmp0E3eaa/epinow/2020-04-21.log +``` + +```{.output} +WARN [2024-03-28 20:48:37] epinow: There were 8 divergent transitions after warmup. 
See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. - -WARN [2024-04-02 21:06:28] epinow: Examine the pairs() plot to diagnose sampling problems +WARN [2024-03-28 20:48:37] epinow: Examine the pairs() plot to diagnose sampling problems - -WARN [2024-04-02 21:06:29] epinow: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable. -Running the chains for more iterations may help. See -https://mc-stan.org/misc/warnings.html#bulk-ess - ``` ```r -base::plot(epinow_estimates_cgi) +base::plot(epinow_estimates) ``` - + :::::::::::::::::::::::::: -::::::::::::::::::::::::::::::::::::::::::: - -:::::::::::::::::::::::::::::::::::::::::::::::::::::::: discussion +:::::::::::::: solution ### How much has it changed? After adding the incubation period, discuss: -- Does the trend of the model fit in the "Estimate" section change? +- Does the retrospective trend of forecast change? - Has the uncertainty changed? - How would you explain or interpret any of these changes? -Compare the `{EpiNow2}` figures generated previously. +:::::::::::::::::::::::::::: -:::::::::::::::::::::::::::::::::::::::::::::::::::::::: +::::::::::::::::::::::::::::::::::::::::::: -## Challenges ::::::::::::::::::::::::::::::::: challenge -### Ebola's effective reproduction number adjusted by reporting delays +### Ebola's effective reproduction number was adjusted by reporting delays -Download and read the [Ebola dataset](data/ebola_cases.csv): +Using the same [Ebola dataset](data/ebola_cases.csv): -- Estimate the effective reproduction number using `{EpiNow2}` -- Adjust the estimate by the available reporting delays in `{epiparameter}` +- Reuse one additional epidemiological parameter for the `delays` argument in `EpiNow2::epinow()`. +- Estimate the effective reproduction number using `EpiNow2::epinow()`. - Why did you choose that parameter? 
::::::::::::::::: hint -To calculate the $R_t$ using `{EpiNow2}`, we need: - -- Aggregated incidence `data`, with confirmed cases per day, and -- The `generation` time distribution. -- Optionally, reporting `delays` distributions when available (e.g., incubation period). +We can use two complementary delay distributions to estimate the $R_t$ at time $t$. -To get delay distribution using `{epiparameter}` we can use functions like: - -- `epidist_db()` -- `list_distributions()` -- `discretise()` -- `quantile()` +- generation time. +- incubation period and reporting delays. :::::::::::::::::::::: @@ -685,8 +810,6 @@ epidist_db(disease = "ebola") %>% ```r -# generation time --------------------------------------------------------- - # subset one distribution for the generation time ebola_serial <- epidist_db( disease = "ebola", @@ -705,8 +828,6 @@ serial_interval_ebola <- distribution = "gamma" ) -# incubation time --------------------------------------------------------- - # subset one distribution for delay of the incubation period ebola_incubation <- epidist_db( disease = "ebola", @@ -725,10 +846,8 @@ incubation_period_ebola <- distribution = "gamma" ) -# epinow ------------------------------------------------------------------ - # run epinow -epinow_estimates_egi <- epinow( +epinow_estimates <- epinow( # cases reported_cases = ebola_confirmed, # delays @@ -738,173 +857,18 @@ epinow_estimates_egi <- epinow( ``` ```{.output} -WARN [2024-04-02 21:09:53] epinow: There were 2 divergent transitions after warmup. See +WARN [2024-03-28 20:52:04] epinow: There were 10 divergent transitions after warmup. See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. 
- -WARN [2024-04-02 21:09:53] epinow: Examine the pairs() plot to diagnose sampling problems +WARN [2024-03-28 20:52:04] epinow: Examine the pairs() plot to diagnose sampling problems - ``` ```r -plot(epinow_estimates_egi) -``` - - - -:::::::::::::::::::::::::: - -::::::::::::::::::::::::::::::::::::::::::: - - -::::::::::::::::::::::::::::::::: challenge - -### What to do with Weibull distributions? - -Use the `influenza_england_1978_school` dataset from the `{outbreaks}` package to calculate the effective reproduction number using `{EpiNow2}` adjusting by the available reporting delays in `{epiparameter}`. - -::::::::::::::::: hint - -`EpiNow2::dist_spec()` also accepts Probability Mass Functions (PMF) from any distribution family. Read the reference guide on [Specify a distribution](https://epiforecasts.io/EpiNow2/reference/dist_spec.html). - -:::::::::::::::::::::: - -::::::::::::::::: solution - - -```r -# What parameters are available for Influenza? -epidist_db(disease = "influenza") %>% - list_distributions() %>% - as_tibble() %>% - count(epi_distribution) -``` - -```{.output} -# A tibble: 3 × 2 - epi_distribution n - -1 generation time 1 -2 incubation period 15 -3 serial interval 1 -``` - -```r -# generation time --------------------------------------------------------- - -# Read the generation time -influenza_generation <- - epidist_db( - disease = "influenza", - epi_dist = "generation" - ) - -influenza_generation -``` - -```{.output} -Disease: Influenza -Pathogen: Influenza-A-H1N1 -Epi Distribution: generation time -Study: Lessler J, Reich N, Cummings D, New York City Department of Health and -Mental Hygiene Swine Influenza Investigation Team (2009). "Outbreak of -2009 Pandemic Influenza A (H1N1) at a New York City School." _The New -England Journal of Medicine_. doi:10.1056/NEJMoa0906089 -. 
-Distribution: weibull -Parameters: - shape: 2.360 - scale: 3.180 -``` - -```r -# EpiNow2 currently accepts Gamma or LogNormal -# other can pass the PMF function - -influenza_generation_discrete <- - epiparameter::discretise(influenza_generation) - -influenza_generation_max <- - quantile(influenza_generation_discrete, p = 0.999) - -influenza_generation_pmf <- - density( - influenza_generation_discrete, - at = 1:influenza_generation_max - ) - -influenza_generation_pmf -``` - -```{.output} -[1] 0.063123364 0.221349877 0.297212205 0.238968280 0.124851641 0.043094538 -[7] 0.009799363 -``` - -```r -# EpiNow2::dist_spec() can also accept the PMF values -generation_time_influenza <- - dist_spec( - pmf = influenza_generation_pmf - ) - -# incubation period ------------------------------------------------------- - -# Read the incubation period -influenza_incubation <- - epidist_db( - disease = "influenza", - epi_dist = "incubation", - single_epidist = TRUE - ) - -# Discretize incubation period -influenza_incubation_discrete <- - epiparameter::discretise(influenza_incubation) - -influenza_incubation_max <- - quantile(influenza_incubation_discrete, p = 0.999) - -influenza_incubation_pmf <- - density( - influenza_incubation_discrete, - at = 1:influenza_incubation_max - ) - -influenza_incubation_pmf +plot(epinow_estimates) ``` -```{.output} -[1] 0.057491512 0.166877052 0.224430917 0.215076318 0.161045462 0.097466092 -[7] 0.048419279 0.019900259 0.006795222 -``` - -```r -# EpiNow2::dist_spec() can also accept the PMF values -incubation_time_influenza <- - dist_spec( - pmf = influenza_incubation_pmf - ) - -# epinow ------------------------------------------------------------------ - -# Read data -influenza_cleaned <- - outbreaks::influenza_england_1978_school %>% - select(date, confirm = in_bed) - -# Run epinow() -epinow_estimates_igi <- epinow( - # cases - reported_cases = influenza_cleaned, - # delays - generation_time = generation_time_opts(generation_time_influenza), - delays 
= delay_opts(incubation_time_influenza) -) - -plot(epinow_estimates_igi) -``` - - + :::::::::::::::::::::::::: @@ -918,7 +882,7 @@ plot(epinow_estimates_igi) How to get the mean and standard deviation from a generation time with *only* distribution parameters but no summary statistics like `mean` or `sd` for `EpiNow2::dist_spec()`? -Look at the `{epiparameter}` vignette on [parameter extraction and conversion](https://epiverse-trace.github.io/epiparameter/articles/extract_convert.html) and its [use cases](https://epiverse-trace.github.io/epiparameter/articles/extract_convert.html#use-cases)! +Look at the `{epiparameter}` vignette on [parameter extraction and conversion](https://epiverse-trace.github.io/epiparameter/articles/extract_convert.html)! ::::::::::::::::::::::::::::: @@ -935,7 +899,6 @@ Refer to this excellent tutorial on estimating the serial interval and incubatio ::::::::::::::::::::::::::::: - @@ -100,21 +99,6 @@ epinow_estimates <- epinow( ``` --> -## Find a Generation time - -The generation time, jointly with the $R$, can inform about the speed of spread and its feasibility of control. Given a $R>1$, with a shorter generation time, cases can appear more quickly. - -![Video from the MRC Centre for Global Infectious Disease Analysis, Ep 76. Science In Context - Epi Parameter Review Group with Dr Anne Cori (27-07-2023) at ](fig/reproduction-generation-time.png) - -In calculating the effective reproduction number ($R_{t}$), the *generation time* distribution is often approximated by the [serial interval](../learners/reference.md#serialinterval) distribution. -This frequent approximation is because it is easier to observe and measure the onset of symptoms than the onset of infectiousness. - -![A schematic of the relationship of different time periods of transmission between an infector and an infectee in a transmission pair. 
Exposure window is defined as the time interval having viral exposure, and transmission window is defined as the time interval for onward transmission with respect to the infection time ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)).](fig/serial-interval-observed.jpeg) - -However, using the *serial interval* as an approximation of the *generation time* is primarily valid for diseases in which infectiousness starts after symptom onset ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)). In cases where infectiousness starts before symptom onset, the serial intervals can have negative values, which is the case of a pre-symptomatic transmission ([Nishiura et al., 2020](https://www.ijidonline.com/article/S1201-9712(20)30119-3/fulltext#gr2)). - -Additionally, even if the *generation time* and *serial interval* have the same mean, their variance usually differs, propagating bias to the $R_{t}$ estimation. $R_{t}$ estimates are sensitive not only to the mean generation time but also to the variance and form of the generation interval distribution [(Gostic et al., 2020)](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008409). - ::::::::::::::::: callout ### From time periods to probability distributions. @@ -150,6 +134,21 @@ Table: Serial interval estimates using Gamma, Weibull, and Log normal distributi ::::::::::::::::::::::::: +## Find a Generation time + +The generation time, jointly with the $R$, can inform about the speed of spread and its feasibility of control. Given a $R>1$, with a shorter generation time, cases can appear more quickly. + +![Video from the MRC Centre for Global Infectious Disease Analysis, Ep 76. 
Science In Context - Epi Parameter Review Group with Dr Anne Cori (27-07-2023) at ](fig/reproduction-generation-time.png) + +In calculating the effective reproduction number ($R_{t}$), the *generation time* distribution is often approximated by the [serial interval](../learners/reference.md#serialinterval) distribution. +This frequent approximation is because it is easier to observe and measure the onset of symptoms than the onset of infectiousness. + +![A schematic of the relationship of different time periods of transmission between an infector and an infectee in a transmission pair. Exposure window is defined as the time interval having viral exposure, and transmission window is defined as the time interval for onward transmission with respect to the infection time ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)).](fig/serial-interval-observed.jpeg) + +However, using the *serial interval* as an approximation of the *generation time* is primarily valid for diseases in which infectiousness starts after symptom onset ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)). In cases where infectiousness starts before symptom onset, the serial intervals can have negative values, which is the case of a pre-symptomatic transmission ([Nishiura et al., 2020](https://www.ijidonline.com/article/S1201-9712(20)30119-3/fulltext#gr2)). + +Additionally, even if the *generation time* and *serial interval* have the same mean, their variance usually differs, propagating bias to the $R_{t}$ estimation. $R_{t}$ estimates are sensitive not only to the mean generation time but also to the variance and form of the generation interval distribution [(Gostic et al., 2020)](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008409). 
+ ::::::::::::::::::::::::::::::::: challenge ### Serial interval @@ -300,7 +299,7 @@ In the `epiparameter::list_distributions()` output, we can also find different t ::::::::::::::::: spoiler -### Why do we have a 'NA' entry? +### Why do we have a `` entry? Entries with a missing value (``) in the `prob_distribution` column are *non-parameterised* entries. They have summary statistics but no probability distribution. Compare these two outputs: @@ -342,48 +341,34 @@ distribution[[4]]$metadata$inference_method ::::::::::::::::::::::::::::::::: challenge -### Find your delay distributions! - -Take 2 minutes to explore the `{epiparameter}` library. - -**Choose** a disease of interest (e.g., Influenza, Measles, etc.) and a delay distribution (e.g., the incubation period, onset to death, etc.). +### Ebola's incubation periods -Find: +Take 5 minutes to explore the `{epiparameter}` library. -- How many delay distributions are for that disease? +First, search for Ebola disease delay distributions. Find: -- How many types of probability distribution (e.g., gamma, lognormal) are for a given delay in that disease? +- How many delay distributions are for the Ebola disease? -Ask: - -- Do you recognise the papers? - -- Should `{epiparameter}` literature review consider any other paper? +- How many types of delay distributions are for the incubation period of Ebola? ::::::::::::::::: hint -The `epidist_db()` function with `disease` alone counts the number of entries like: +`epidist_db()` and `list_distributions()` give us different and complementary summary outputs. + +The `epidist_db()` function alone counts for us the number of entries like: - studies, and - delay distributions. 
-The `epidist_db()` function with `disease` and `epi_dist` gets a list of all entries with: +On the other hand, the `{epiparameter}` combo of `epidist_db()` plus `list_distributions()` lists all the entries in a data frame with columns like: -- the complete citation, -- the **type** of a probability distribution, and -- distribution parameter values. - -The combo of `epidist_db()` plus `list_distributions()` gets a data frame of all entries with columns like: - -- the **type** of the probability distribution per delay, and +- the type of the probability distribution per delay, and - author and year of the study. :::::::::::::::::::::: ::::::::::::::::: solution -We choose to explore Ebola's delay distributions: - ```r # we expect 16 delays distributions for ebola @@ -409,18 +394,6 @@ List of objects Now, from the output of `epiparameter::epidist_db()`, What is an [offspring distribution](../learners/reference.md#offspringdist)? -We choose to find Ebola's incubation periods. This output list all the papers and parameters found. Run this locally if needed: - - -```r -epiparameter::epidist_db( - disease = "ebola", - epi_dist = "incubation" -) -``` - -We use `list_distributions()` to get a summary display of all: - ```r # we expect 2 different types of delay distributions @@ -449,6 +422,12 @@ To retrieve the short citation for each use the 'get_citation' function We find two types of probability distributions for this query: _lognormal_ and _gamma_. +Now, search for delay distributions of your disease of interest! Ask: + +- Do you recognise the papers? + +- Should it consider any other paper? + How does `{epiparameter}` do the collection and review of peer-reviewed literature? We invite you to read the vignette on ["Data Collation and Synthesis Protocol"](https://epiverse-trace.github.io/epiparameter/articles/data_protocol.html)! :::::::::::::::::::::::::: @@ -518,7 +497,7 @@ Parameters: ::::::::::::::::: callout -### How does 'single_epidist' works? 
+### How does `single_epidist` work? Looking at the help documentation for `?epiparameter::epidist_db()`: @@ -545,6 +524,32 @@ covid_serialint <- ) ``` +```{.output} +Using Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel +coronavirus (COVID-19) infections." _International Journal of +Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 +.. +To retrieve the short citation use the 'get_citation' function +``` + +```r +covid_serialint +``` + +```{.output} +Disease: COVID-19 +Pathogen: SARS-CoV-2 +Epi Distribution: serial interval +Study: Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel +coronavirus (COVID-19) infections." _International Journal of +Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 +. +Distribution: lnorm +Parameters: + meanlog: 1.386 + sdlog: 0.568 +``` +