From 93b96d0c8630ab4e6bd484360c9fda9a9b5ed0ba Mon Sep 17 00:00:00 2001 From: GitHub Actions Date: Tue, 2 Apr 2024 20:55:35 +0000 Subject: [PATCH] differences for PR #43 --- delays-functions.md | 452 ++++++++++-------- delays-reuse.md | 229 +++++---- ...-functions-rendered-unnamed-chunk-16-1.png | Bin 0 -> 50976 bytes ...-functions-rendered-unnamed-chunk-17-1.png | Bin 0 -> 52139 bytes ...-functions-rendered-unnamed-chunk-20-1.png | Bin 52059 -> 51341 bytes ...-functions-rendered-unnamed-chunk-21-1.png | Bin 0 -> 32834 bytes ...lays-reuse-rendered-unnamed-chunk-15-1.png | Bin 0 -> 11595 bytes md5sum.txt | 4 +- 8 files changed, 370 insertions(+), 315 deletions(-) create mode 100644 fig/delays-functions-rendered-unnamed-chunk-16-1.png create mode 100644 fig/delays-functions-rendered-unnamed-chunk-17-1.png create mode 100644 fig/delays-functions-rendered-unnamed-chunk-21-1.png create mode 100644 fig/delays-reuse-rendered-unnamed-chunk-15-1.png diff --git a/delays-functions.md b/delays-functions.md index f021a8b7..6afca31e 100644 --- a/delays-functions.md +++ b/delays-functions.md @@ -1,5 +1,5 @@ --- -title: 'Input delay data' +title: 'Use delay distributions in analysis' teaching: 10 exercises: 2 editor_options: @@ -63,44 +63,19 @@ covid_serialint <- ) ``` -```{.output} -Using Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel -coronavirus (COVID-19) infections." _International Journal of -Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 -.. -To retrieve the short citation use the 'get_citation' function -``` +Now, we have an epidemiological parameter we can use in our analysis! In the chunk below we replaced one of the **summary statistics** inputs into `EpiNow2::dist_spec()` ```r -covid_serialint -``` - -```{.output} -Disease: COVID-19 -Pathogen: SARS-CoV-2 -Epi Distribution: serial interval -Study: Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel -coronavirus (COVID-19) infections." 
_International Journal of -Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 -. -Distribution: lnorm -Parameters: - meanlog: 1.386 - sdlog: 0.568 -``` - -Now, we have an epidemiological parameter we can reuse! We can replace the two out of three **summary statistics** into `EpiNow2::dist_spec()` - -```r -generation_time <- dist_spec( - mean = covid_serialint$summary_stats$mean, - sd = covid_serialint$summary_stats$sd, - max = 20, - distribution = "gamma" -) +generation_time <- + EpiNow2::dist_spec( + mean = covid_serialint$summary_stats$mean, # we changed this line :) + sd = 2, + max = 20, + distribution = "gamma" + ) ``` -In this episode, we will use the **distribution functions** that `{epiparameter}` provides to get a `max` value for this and any other package downstream in the pipeline! +In this episode, we will use the **distribution functions** that `{epiparameter}` provides to get a maximum value (`max`) for this and any other package downstream in your analysis pipeline! Let's load the `{epiparameter}` and `{EpiNow2}` package. For `{EpiNow2}`, we'll set 4 cores to be used in parallel computations. We'll use the pipe `%>%`, some `{dplyr}` verbs and `{ggplot2}`, so let's also call to the `{tidyverse}` package: @@ -175,8 +150,8 @@ generate(covid_serialint, times = 10) ``` ```{.output} - [1] 4.946173 1.848653 4.558656 4.051608 8.892126 3.296900 4.339645 5.280796 - [9] 6.350389 5.440831 + [1] 5.191450 5.366683 3.084088 2.118721 2.883400 1.647230 1.680940 4.564112 + [9] 5.776962 2.496665 ``` ::::::::: instructor @@ -293,7 +268,7 @@ Parameters: sdlog: 0.568 ``` -We identify this change in the `Distribution:` output line of the `` object. Take a double check to this line: +We identify this change in the `Distribution:` output line of the `` object. 
Double check this line: ``` Distribution: discrete lnorm @@ -435,40 +410,24 @@ quantile(covid_serialint_discrete, p = 0.999) %>% ::::::::::::::::::::::::::::::::::::::::::: -:::::::::::::::::::::::::::::: callout -### Log normal distributions +## Plug-in `{epiparameter}` to `{EpiNow2}` + +Now we can plug everything into the `EpiNow2::dist_spec()` function! + +- the **summary statistics** `mean` and `sd` of the distribution, +- a maximum value `max`, +- the `distribution` name. -If you need the log normal **distribution parameters** instead of the summary statistics, we can use `epiparameter::get_parameters()`: +But, before, in `EpiNow2::dist_spec()` for a **Lognormal** distribution we need the *distribution parameters* instead of the summary statistics: ```r covid_serialint_parameters <- epiparameter::get_parameters(covid_serialint) - -covid_serialint_parameters -``` - -```{.output} - meanlog sdlog -1.3862617 0.5679803 ``` -This gets a vector of class `` ready to use as input for any other package! - -**BONUS TIP:** If we write the `[]` next to the last object create like in `covid_serialint_parameters[]`, within `[]` we can use the -Tab key -to use the [code completion feature](https://support.posit.co/hc/en-us/articles/205273297-Code-Completion-in-the-RStudio-IDE) and have a quick access to `covid_serialint_parameters["meanlog"]` and `covid_serialint_parameters["sdlog"]`. We invite you to try this out in code chunks and the R console! - -:::::::::::::::::::::::::::::: - -## Plug-in `{epiparameter}` to `{EpiNow2}` - -Now we can plug everything into the `EpiNow2::dist_spec()` function! - -- the **summary statistics** `mean` and `sd` of the distribution, -- a maximum value `max`, -- the `distribution` name. 
+Then, we have: ```r @@ -488,119 +447,53 @@ serial_interval_covid Fixed distribution with PMF [0.0073 0.1 0.2 0.19 0.15 0.11 0.075 0.051 0.035 0.023 0.016 0.011 0.0076 0.0053 0.0037 0.0027 0.0019 0.0014 0.001 0.00074 0.00055 0.00041 0.00031] ``` -:::::::::: callout +:::::::::::::::::::::::::::::: callout -### Warning +### A code completion tip -Using the serial interval instead of the generation time is an alternative that can propagate bias in your estimates, even more so in diseases with reported pre-symptomatic transmission. ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)) +If we write the `[]` next to the object `covid_serialint_parameters[]`, within `[]` we can use the +Tab key +for [code completion feature](https://support.posit.co/hc/en-us/articles/205273297-Code-Completion-in-the-RStudio-IDE) -:::::::::::::::::: +This gives quick access to `covid_serialint_parameters["meanlog"]` and `covid_serialint_parameters["sdlog"]`. + +We invite you to try this out in code chunks and the R console! + +:::::::::::::::::::::::::::::: Let's replace the `generation_time` input we used for `EpiNow2::epinow()`. ```r -epinow_estimates <- epinow( +epinow_estimates_cg <- epinow( # cases reported_cases = example_confirmed[1:60], # delays generation_time = generation_time_opts(serial_interval_covid) ) - -base::plot(epinow_estimates) -``` - -::::::::::::::::::::::::::::::::: challenge - -### Ebola's effective reproduction number - -Download and read the [Ebola dataset](data/ebola_cases.csv): - -- Reuse one epidemiological parameter to estimate the effective reproduction number for the Ebola dataset. -- Why did you choose that parameter? 
- -::::::::::::::::: hint - -To calculate the $R_t$, we need: - -- data set with confirmed cases per day and -- one key delay distribution - -Key functions we applied in this episode are: - -- `epidist_db()` -- `list_distributions()` -- `discretise()` -- probability functions for continuous and discrete distributions - -:::::::::::::::::::::: - -::::::::::::::::: solution - - - - -```r -# read data -# e.g.: if path to file is data/raw-data/ebola_cases.csv then: -ebola_confirmed <- - read_csv(here::here("data", "raw-data", "ebola_cases.csv")) - -# list distributions -epidist_db(disease = "ebola") %>% - list_distributions() -``` - - -```r -# subset one distribution -ebola_serial <- epidist_db( - disease = "ebola", - epi_dist = "serial", - single_epidist = TRUE -) - -# adapt epiparameter to epinow2 -ebola_serial_discrete <- discretise(ebola_serial) - -ebola_serial_discrete_max <- quantile(ebola_serial_discrete, p = 0.999) - -serial_interval_ebola <- - dist_spec( - mean = ebola_serial$summary_stats$mean, - sd = ebola_serial$summary_stats$sd, - max = ebola_serial_discrete_max, - distribution = "gamma" # don't forget! it's a must! - ) - -# run epinow -epinow_estimates <- epinow( - # cases - reported_cases = ebola_confirmed, - # delays - generation_time = generation_time_opts(serial_interval_ebola) -) ``` ```{.output} -WARN [2024-03-28 20:46:51] epinow: There were 8 divergent transitions after warmup. See +WARN [2024-04-02 20:48:14] epinow: There were 14 divergent transitions after warmup. See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. - -WARN [2024-03-28 20:46:51] epinow: Examine the pairs() plot to diagnose sampling problems +WARN [2024-04-02 20:48:14] epinow: Examine the pairs() plot to diagnose sampling problems - ``` ```r -plot(epinow_estimates) +base::plot(epinow_estimates_cg) ``` - + -`{EpiNow2}` can also include the uncertainty around each summary statistic. 
We invite you to read this discussion on: [How to adapt `{epiparameter}` uncertainty entries to `{EpiNow2}`](https://github.com/epiverse-trace/epiparameter/discussions/218)? +:::::::::: callout -:::::::::::::::::::::::::: +### Warning -::::::::::::::::::::::::::::::::::::::::::: +Using the serial interval instead of the generation time is an alternative that can propagate bias in your estimates, even more so in diseases with reported pre-symptomatic transmission. ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)) + +:::::::::::::::::: ## Adjusting for reporting delays @@ -643,6 +536,8 @@ epinow_estimates <- epinow( ```r +# generation time --------------------------------------------------------- + # get covid serial interval covid_serialint <- epiparameter::epidist_db( @@ -651,17 +546,7 @@ covid_serialint <- author = "Nishiura", single_epidist = TRUE ) -``` - -```{.output} -Using Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel -coronavirus (COVID-19) infections." _International Journal of -Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 -.. -To retrieve the short citation use the 'get_citation' function -``` -```r # adapt epidist to epinow2 covid_serialint_discrete_max <- covid_serialint %>% @@ -679,6 +564,8 @@ covid_serial_interval <- distribution = "lognormal" ) +# incubation time --------------------------------------------------------- + # get covid incubation period covid_incubation <- epiparameter::epidist_db( disease = "covid", @@ -686,19 +573,7 @@ covid_incubation <- epiparameter::epidist_db( author = "Natalie", single_epidist = TRUE ) -``` -```{.output} -Using Linton N, Kobayashi T, Yang Y, Hayashi K, Akhmetzhanov A, Jung S, Yuan -B, Kinoshita R, Nishiura H (2020). "Incubation Period and Other -Epidemiological Characteristics of 2019 Novel Coronavirus Infections -with Right Truncation: A Statistical Analysis of Publicly Available -Case Data." _Journal of Clinical Medicine_. 
doi:10.3390/jcm9020538 -.. -To retrieve the short citation use the 'get_citation' function -``` - -```r # adapt epiparameter to epinow2 covid_incubation_discrete_max <- covid_incubation %>% @@ -716,8 +591,10 @@ covid_incubation_time <- distribution = "lognormal" # do not forget this! ) +# epinow ------------------------------------------------------------------ + # run epinow -epinow_estimates <- epinow( +epinow_estimates_cgi <- epinow( # cases reported_cases = example_confirmed[1:60], # delays @@ -727,68 +604,63 @@ epinow_estimates <- epinow( ``` ```{.output} -Logging threshold set at INFO for the EpiNow2 logger -``` - -```{.output} -Writing EpiNow2 logs to the console and: /tmp/Rtmp0E3eaa/regional-epinow/2020-04-21.log -``` - -```{.output} -Logging threshold set at INFO for the EpiNow2.epinow logger -``` - -```{.output} -Writing EpiNow2.epinow logs to the console and: /tmp/Rtmp0E3eaa/epinow/2020-04-21.log -``` - -```{.output} -WARN [2024-03-28 20:48:37] epinow: There were 8 divergent transitions after warmup. See +WARN [2024-04-02 20:50:09] epinow: There were 4 divergent transitions after warmup. See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. - -WARN [2024-03-28 20:48:37] epinow: Examine the pairs() plot to diagnose sampling problems +WARN [2024-04-02 20:50:09] epinow: Examine the pairs() plot to diagnose sampling problems - ``` ```r -base::plot(epinow_estimates) +base::plot(epinow_estimates_cgi) ``` - + :::::::::::::::::::::::::: -:::::::::::::: solution +::::::::::::::::::::::::::::::::::::::::::: + +:::::::::::::::::::::::::::::::::::::::::::::::::::::::: discussion ### How much has it changed? After adding the incubation period, discuss: -- Does the retrospective trend of forecast change? +- Does the trend of the model fit in the "Estimate" section change? - Has the uncertainty changed? - How would you explain or interpret any of these changes? 
-:::::::::::::::::::::::::::: +Compare the `{EpiNow2}` figures generated previously. -::::::::::::::::::::::::::::::::::::::::::: +:::::::::::::::::::::::::::::::::::::::::::::::::::::::: +## Challenges ::::::::::::::::::::::::::::::::: challenge -### Ebola's effective reproduction number was adjusted by reporting delays +### Ebola's effective reproduction number adjusted by reporting delays -Using the same [Ebola dataset](data/ebola_cases.csv): +Download and read the [Ebola dataset](data/ebola_cases.csv): -- Reuse one additional epidemiological parameter for the `delays` argument in `EpiNow2::epinow()`. -- Estimate the effective reproduction number using `EpiNow2::epinow()`. +- Estimate the effective reproduction number using `{EpiNow2}` +- Adjust the estimate by the available reporting delays in `{epiparameter}` - Why did you choose that parameter? ::::::::::::::::: hint -We can use two complementary delay distributions to estimate the $R_t$ at time $t$. +To calculate the $R_t$ using `{EpiNow2}`, we need: + +- Aggregated incidence `data`, with confirmed cases per day, and +- The `generation` time distribution. +- Optionally, reporting `delays` distributions when available (e.g., incubation period). -- generation time. -- incubation period and reporting delays. 
+To get delay distribution using `{epiparameter}` we can use functions like: + +- `epidist_db()` +- `list_distributions()` +- `discretise()` +- `quantile()` :::::::::::::::::::::: @@ -810,6 +682,8 @@ epidist_db(disease = "ebola") %>% ```r +# generation time --------------------------------------------------------- + # subset one distribution for the generation time ebola_serial <- epidist_db( disease = "ebola", @@ -828,6 +702,8 @@ serial_interval_ebola <- distribution = "gamma" ) +# incubation time --------------------------------------------------------- + # subset one distribution for delay of the incubation period ebola_incubation <- epidist_db( disease = "ebola", @@ -846,8 +722,10 @@ incubation_period_ebola <- distribution = "gamma" ) +# epinow ------------------------------------------------------------------ + # run epinow -epinow_estimates <- epinow( +epinow_estimates_egi <- epinow( # cases reported_cases = ebola_confirmed, # delays @@ -857,18 +735,173 @@ epinow_estimates <- epinow( ``` ```{.output} -WARN [2024-03-28 20:52:04] epinow: There were 10 divergent transitions after warmup. See +WARN [2024-04-02 20:53:45] epinow: There were 7 divergent transitions after warmup. See https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup to find out why this is a problem and how to eliminate them. - -WARN [2024-03-28 20:52:04] epinow: Examine the pairs() plot to diagnose sampling problems +WARN [2024-04-02 20:53:45] epinow: Examine the pairs() plot to diagnose sampling problems - ``` ```r -plot(epinow_estimates) +plot(epinow_estimates_egi) +``` + + + +:::::::::::::::::::::::::: + +::::::::::::::::::::::::::::::::::::::::::: + + +::::::::::::::::::::::::::::::::: challenge + +### What to do with Weibull distributions? + +Use the `influenza_england_1978_school` dataset from the `{outbreaks}` package to calculate the effective reproduction number using `{EpiNow2}` adjusting by the available reporting delays in `{epiparameter}`. 
+ +::::::::::::::::: hint + +`EpiNow2::dist_spec()` also accepts Probability Mass Functions (PMF) from any distribution family. Read the reference guide on [Specify a distribution](https://epiforecasts.io/EpiNow2/reference/dist_spec.html). + +:::::::::::::::::::::: + +::::::::::::::::: solution + + +```r +# What parameters are available for Influenza? +epidist_db(disease = "influenza") %>% + list_distributions() %>% + as_tibble() %>% + count(epi_distribution) +``` + +```{.output} +# A tibble: 3 × 2 + epi_distribution n + +1 generation time 1 +2 incubation period 15 +3 serial interval 1 +``` + +```r +# generation time --------------------------------------------------------- + +# Read the generation time +influenza_generation <- + epidist_db( + disease = "influenza", + epi_dist = "generation" + ) + +influenza_generation +``` + +```{.output} +Disease: Influenza +Pathogen: Influenza-A-H1N1 +Epi Distribution: generation time +Study: Lessler J, Reich N, Cummings D, New York City Department of Health and +Mental Hygiene Swine Influenza Investigation Team (2009). "Outbreak of +2009 Pandemic Influenza A (H1N1) at a New York City School." _The New +England Journal of Medicine_. doi:10.1056/NEJMoa0906089 +. 
+Distribution: weibull +Parameters: + shape: 2.360 + scale: 3.180 +``` + +```r +# EpiNow2 currently accepts Gamma or LogNormal +# other can pass the PMF function + +influenza_generation_discrete <- + epiparameter::discretise(influenza_generation) + +influenza_generation_max <- + quantile(influenza_generation_discrete, p = 0.999) + +influenza_generation_pmf <- + density( + influenza_generation_discrete, + at = 1:influenza_generation_max + ) + +influenza_generation_pmf +``` + +```{.output} +[1] 0.063123364 0.221349877 0.297212205 0.238968280 0.124851641 0.043094538 +[7] 0.009799363 +``` + +```r +# EpiNow2::dist_spec() can also accept the PMF values +generation_time_influenza <- + dist_spec( + pmf = influenza_generation_pmf + ) + +# incubation period ------------------------------------------------------- + +# Read the incubation period +influenza_incubation <- + epidist_db( + disease = "influenza", + epi_dist = "incubation", + single_epidist = TRUE + ) + +# Discretize incubation period +influenza_incubation_discrete <- + epiparameter::discretise(influenza_incubation) + +influenza_incubation_max <- + quantile(influenza_incubation_discrete, p = 0.999) + +influenza_incubation_pmf <- + density( + influenza_incubation_discrete, + at = 1:influenza_incubation_max + ) + +influenza_incubation_pmf ``` - +```{.output} +[1] 0.057491512 0.166877052 0.224430917 0.215076318 0.161045462 0.097466092 +[7] 0.048419279 0.019900259 0.006795222 +``` + +```r +# EpiNow2::dist_spec() can also accept the PMF values +incubation_time_influenza <- + dist_spec( + pmf = influenza_incubation_pmf + ) + +# epinow ------------------------------------------------------------------ + +# Read data +influenza_cleaned <- + outbreaks::influenza_england_1978_school %>% + select(date, confirm = in_bed) + +# Run epinow() +epinow_estimates_igi <- epinow( + # cases + reported_cases = influenza_cleaned, + # delays + generation_time = generation_time_opts(generation_time_influenza), + delays = 
delay_opts(incubation_time_influenza) +) + +plot(epinow_estimates_igi) +``` + + :::::::::::::::::::::::::: @@ -882,7 +915,7 @@ plot(epinow_estimates) How to get the mean and standard deviation from a generation time with *only* distribution parameters but no summary statistics like `mean` or `sd` for `EpiNow2::dist_spec()`? -Look at the `{epiparameter}` vignette on [parameter extraction and conversion](https://epiverse-trace.github.io/epiparameter/articles/extract_convert.html)! +Look at the `{epiparameter}` vignette on [parameter extraction and conversion](https://epiverse-trace.github.io/epiparameter/articles/extract_convert.html) and its [use cases](https://epiverse-trace.github.io/epiparameter/articles/extract_convert.html#use-cases)! ::::::::::::::::::::::::::::: @@ -899,6 +932,7 @@ Refer to this excellent tutorial on estimating the serial interval and incubatio ::::::::::::::::::::::::::::: + @@ -99,6 +100,21 @@ epinow_estimates <- epinow( ``` --> +## Find a Generation time + +The generation time, jointly with the $R$, can inform about the speed of spread and its feasibility of control. Given a $R>1$, with a shorter generation time, cases can appear more quickly. + +![Video from the MRC Centre for Global Infectious Disease Analysis, Ep 76. Science In Context - Epi Parameter Review Group with Dr Anne Cori (27-07-2023) at ](fig/reproduction-generation-time.png) + +In calculating the effective reproduction number ($R_{t}$), the *generation time* distribution is often approximated by the [serial interval](../learners/reference.md#serialinterval) distribution. +This frequent approximation is because it is easier to observe and measure the onset of symptoms than the onset of infectiousness. + +![A schematic of the relationship of different time periods of transmission between an infector and an infectee in a transmission pair. 
Exposure window is defined as the time interval having viral exposure, and transmission window is defined as the time interval for onward transmission with respect to the infection time ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)).](fig/serial-interval-observed.jpeg) + +However, using the *serial interval* as an approximation of the *generation time* is primarily valid for diseases in which infectiousness starts after symptom onset ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)). In cases where infectiousness starts before symptom onset, the serial intervals can have negative values, which is the case of a pre-symptomatic transmission ([Nishiura et al., 2020](https://www.ijidonline.com/article/S1201-9712(20)30119-3/fulltext#gr2)). + +Additionally, even if the *generation time* and *serial interval* have the same mean, their variance usually differs, propagating bias to the $R_{t}$ estimation. $R_{t}$ estimates are sensitive not only to the mean generation time but also to the variance and form of the generation interval distribution [(Gostic et al., 2020)](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008409). + ::::::::::::::::: callout ### From time periods to probability distributions. @@ -134,21 +150,6 @@ Table: Serial interval estimates using Gamma, Weibull, and Log normal distributi ::::::::::::::::::::::::: -## Find a Generation time - -The generation time, jointly with the $R$, can inform about the speed of spread and its feasibility of control. Given a $R>1$, with a shorter generation time, cases can appear more quickly. - -![Video from the MRC Centre for Global Infectious Disease Analysis, Ep 76. 
Science In Context - Epi Parameter Review Group with Dr Anne Cori (27-07-2023) at ](fig/reproduction-generation-time.png) - -In calculating the effective reproduction number ($R_{t}$), the *generation time* distribution is often approximated by the [serial interval](../learners/reference.md#serialinterval) distribution. -This frequent approximation is because it is easier to observe and measure the onset of symptoms than the onset of infectiousness. - -![A schematic of the relationship of different time periods of transmission between an infector and an infectee in a transmission pair. Exposure window is defined as the time interval having viral exposure, and transmission window is defined as the time interval for onward transmission with respect to the infection time ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)).](fig/serial-interval-observed.jpeg) - -However, using the *serial interval* as an approximation of the *generation time* is primarily valid for diseases in which infectiousness starts after symptom onset ([Chung Lau et al., 2021](https://academic.oup.com/jid/article/224/10/1664/6356465)). In cases where infectiousness starts before symptom onset, the serial intervals can have negative values, which is the case of a pre-symptomatic transmission ([Nishiura et al., 2020](https://www.ijidonline.com/article/S1201-9712(20)30119-3/fulltext#gr2)). - -Additionally, even if the *generation time* and *serial interval* have the same mean, their variance usually differs, propagating bias to the $R_{t}$ estimation. $R_{t}$ estimates are sensitive not only to the mean generation time but also to the variance and form of the generation interval distribution [(Gostic et al., 2020)](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008409). 
- ::::::::::::::::::::::::::::::::: challenge ### Serial interval @@ -299,7 +300,7 @@ In the `epiparameter::list_distributions()` output, we can also find different t ::::::::::::::::: spoiler -### Why do we have a `` entry? +### Why do we have a 'NA' entry? Entries with a missing value (``) in the `prob_distribution` column are *non-parameterised* entries. They have summary statistics but no probability distribution. Compare these two outputs: @@ -341,34 +342,48 @@ distribution[[4]]$metadata$inference_method ::::::::::::::::::::::::::::::::: challenge -### Ebola's incubation periods +### Find your delay distributions! -Take 5 minutes to explore the `{epiparameter}` library. +Take 2 minutes to explore the `{epiparameter}` library. -First, search for Ebola disease delay distributions. Find: +**Choose** a disease of interest (e.g., Influenza, Measles, etc.) and a delay distribution (e.g., the incubation period, onset to death, etc.). -- How many delay distributions are for the Ebola disease? +Find: -- How many types of delay distributions are for the incubation period of Ebola? +- How many delay distributions are for that disease? -::::::::::::::::: hint +- How many types of probability distribution (e.g., gamma, lognormal) are for a given delay in that disease? + +Ask: -`epidist_db()` and `list_distributions()` give us different and complementary summary outputs. +- Do you recognise the papers? + +- Should `{epiparameter}` literature review consider any other paper? + +::::::::::::::::: hint -The `epidist_db()` function alone counts for us the number of entries like: +The `epidist_db()` function with `disease` alone counts the number of entries like: - studies, and - delay distributions. 
-On the other hand, the `{epiparameter}` combo of `epidist_db()` plus `list_distributions()` lists all the entries in a data frame with columns like: +The `epidist_db()` function with `disease` and `epi_dist` gets a list of all entries with: -- the type of the probability distribution per delay, and +- the complete citation, +- the **type** of a probability distribution, and +- distribution parameter values. + +The combo of `epidist_db()` plus `list_distributions()` gets a data frame of all entries with columns like: + +- the **type** of the probability distribution per delay, and - author and year of the study. :::::::::::::::::::::: ::::::::::::::::: solution +We choose to explore Ebola's delay distributions: + ```r # we expect 16 delays distributions for ebola @@ -394,6 +409,18 @@ List of objects Now, from the output of `epiparameter::epidist_db()`, What is an [offspring distribution](../learners/reference.md#offspringdist)? +We choose to find Ebola's incubation periods. This output list all the papers and parameters found. Run this locally if needed: + + +```r +epiparameter::epidist_db( + disease = "ebola", + epi_dist = "incubation" +) +``` + +We use `list_distributions()` to get a summary display of all: + ```r # we expect 2 different types of delay distributions @@ -422,12 +449,6 @@ To retrieve the short citation for each use the 'get_citation' function We find two types of probability distributions for this query: _lognormal_ and _gamma_. -Now, search for delay distributions of your disease of interest! Ask: - -- Do you recognise the papers? - -- Should it consider any other paper? - How does `{epiparameter}` do the collection and review of peer-reviewed literature? We invite you to read the vignette on ["Data Collation and Synthesis Protocol"](https://epiverse-trace.github.io/epiparameter/articles/data_protocol.html)! :::::::::::::::::::::::::: @@ -497,7 +518,7 @@ Parameters: ::::::::::::::::: callout -### How does `single_epidist` works? 
+### How does 'single_epidist' works? Looking at the help documentation for `?epiparameter::epidist_db()`: @@ -524,32 +545,6 @@ covid_serialint <- ) ``` -```{.output} -Using Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel -coronavirus (COVID-19) infections." _International Journal of -Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 -.. -To retrieve the short citation use the 'get_citation' function -``` - -```r -covid_serialint -``` - -```{.output} -Disease: COVID-19 -Pathogen: SARS-CoV-2 -Epi Distribution: serial interval -Study: Nishiura H, Linton N, Akhmetzhanov A (2020). "Serial interval of novel -coronavirus (COVID-19) infections." _International Journal of -Infectious Diseases_. doi:10.1016/j.ijid.2020.02.060 -. -Distribution: lnorm -Parameters: - meanlog: 1.386 - sdlog: 0.568 -``` -