full review post-trial 01 #53

Merged
merged 10 commits into from
May 4, 2024
10 changes: 10 additions & 0 deletions episodes/delays-functions.Rmd
@@ -86,6 +86,16 @@ library(tidyverse)
withr::local_options(list(mc.cores = 4))
```

::::::::::::::::::: checklist

### The double-colon

The double-colon `::` in R lets you access functions or objects from a specific package without attaching the whole package to the current session with `library()`. This allows for a more targeted approach to using package components and helps avoid namespace conflicts.

`::` lets you call a specific function from a package by explicitly naming the package. For example, `dplyr::filter(data, condition)` uses `filter()` from the `{dplyr}` package without attaching the entire package.

:::::::::::::::::::
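
As a minimal sketch of the double-colon in practice, the chunk below calls functions from two packages that ship with base R. The choice of `stats::sd()` and `tools::file_ext()` is an illustrative assumption and does not come from the episode.

```r
# Illustrative data (hypothetical values)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Call sd() from {stats} by naming the package explicitly
sd_value <- stats::sd(x)

# {tools} is installed with R but is usually not attached with library();
# the double-colon still reaches its functions
ext <- tools::file_ext("report.Rmd")

# Check whether {tools} is on the search path of the current session
tools_attached <- "package:tools" %in% search()
```

The explicit `package::function()` form also documents where each function comes from, which helps when two attached packages export the same name.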

## Distribution functions

In R, all the statistical distributions have functions to access the following:
10 changes: 10 additions & 0 deletions episodes/delays-reuse.Rmd
@@ -196,6 +196,16 @@ epiparameter::epidist_db(
)
```

::::::::::::::::::: checklist

### The double-colon

The double-colon `::` in R lets you access functions or objects from a specific package without attaching the whole package to the current session with `library()`. This allows for a more targeted approach to using package components and helps avoid namespace conflicts.

`::` lets you call a specific function from a package by explicitly naming the package. For example, `dplyr::filter(data, condition)` uses `filter()` from the `{dplyr}` package without attaching the entire package.

:::::::::::::::::::

From the `{epiparameter}` package, we can use the `epidist_db()` function to ask for any `disease` and also for a specific epidemiological distribution (`epi_dist`).

Let's ask now how many parameters we have in the epidemiological distributions database (`epidist_db`) with the generation time using the string `generation`:
125 changes: 54 additions & 71 deletions episodes/quantify-transmissibility.Rmd
@@ -4,12 +4,6 @@ teaching: 30
exercises: 0
---

```{r setup, echo = FALSE, warning = FALSE, message = FALSE}
library(EpiNow2)
library(ggplot2)
withr::local_options(list(mc.cores = 4))
```

:::::::::::::::::::::::::::::::::::::: questions

- How can I estimate the time-varying reproduction number ($R_t$) and growth rate from a time series of case data?
@@ -56,6 +50,29 @@ To estimate these key metrics using case data we must account for delays between

In the next tutorials we will focus on how to use the functions in `{EpiNow2}` to estimate transmission metrics from case data. We will not cover the theoretical background of the models or inference framework; for details on these concepts see the [vignette](https://epiforecasts.io/EpiNow2/dev/articles/estimate_infections.html).

In this tutorial we are going to learn how to use the `{EpiNow2}` package to estimate the time-varying reproduction number. We’ll use the `{dplyr}` package to arrange some of its inputs, `{ggplot2}` to visualize the distribution of cases, and the pipe `%>%` to connect some of their functions, so let’s also load the `{tidyverse}` package:

```{r,message=FALSE,warning=FALSE}
library(EpiNow2)
library(tidyverse)
```

::::::::::::::::::: checklist

### The double-colon

The double-colon `::` in R lets you access functions or objects from a specific package without attaching the whole package to the current session with `library()`. This allows for a more targeted approach to using package components and helps avoid namespace conflicts.

`::` lets you call a specific function from a package by explicitly naming the package. For example, `dplyr::filter(data, condition)` uses `filter()` from the `{dplyr}` package without attaching the entire package.

:::::::::::::::::::

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor

This tutorial illustrates the usage of `epinow()` to estimate the time-varying reproduction number and infection times. Learners should understand the necessary inputs to the model and the limitations of the model output.

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


::::::::::::::::::::::::::::::::::::: callout
### Bayesian inference
@@ -78,19 +95,6 @@ In the ["`Expected change in daily cases`" callout](#expected-change-in-daily-ca
::::::::::::::::::::::::::::::::::::::::::::::::


The first step is to load the `{EpiNow2}` package:

```{r, eval = FALSE}
library(EpiNow2)
```

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: instructor

This tutorial illustrates the usage of `epinow()` to estimate the time-varying reproduction number and infection times. Learners should understand the necessary inputs to the model and the limitations of the model output.

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::


## Delay distributions and case data
### Case data

@@ -303,7 +307,7 @@ The function `epinow()` is a wrapper for the function `estimate_infections()` us
There are numerous other inputs that can be passed to `epinow()`; see `?EpiNow2::epinow` for more detail.
One optional input is to specify a log normal prior for the effective reproduction number $R_t$ at the start of the outbreak. We specify a mean and standard deviation as arguments of `prior` within `rt_opts()`:

```{r, eval = FALSE}
```{r, eval = TRUE}
rt_log_mean <- convert_to_logmean(2, 1)
rt_log_sd <- convert_to_logsd(2, 1)
rt <- rt_opts(prior = list(mean = rt_log_mean, sd = rt_log_sd))
@@ -324,72 +328,52 @@ To find the maximum number of available cores on your machine, use `parallel::de

::::::::::::::::::::::::::::::::::::::::::::::::
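
The log-scale parameters passed to `prior` in `rt_opts()` can be checked by hand. The sketch below assumes that `convert_to_logmean()` and `convert_to_logsd()` apply the standard moment-matching formulas for a log-normal distribution. The helper names `logmean_from_moments()` and `logsd_from_moments()` are hypothetical and are not part of `{EpiNow2}`.

```r
# Hypothetical helpers: convert a natural-scale mean and sd into the
# meanlog and sdlog of a log-normal distribution (moment matching)
logmean_from_moments <- function(mean, sd) {
  log(mean^2 / sqrt(sd^2 + mean^2))
}

logsd_from_moments <- function(mean, sd) {
  sqrt(log(1 + sd^2 / mean^2))
}

# Same inputs as the prior in the episode: mean = 2, sd = 1
mu <- logmean_from_moments(2, 1)
sigma <- logsd_from_moments(2, 1)

# Round trip: recover the natural-scale mean and sd from the log-scale values
back_mean <- exp(mu + sigma^2 / 2)
back_sd <- sqrt((exp(sigma^2) - 1) * exp(2 * mu + sigma^2))
```

If the assumption holds, `mu` and `sigma` should match `rt_log_mean` and `rt_log_sd`, and the round trip returns the natural-scale mean and standard deviation that were supplied.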

```{r, echo = FALSE}
rt_log_mean <- convert_to_logmean(2, 1)
rt_log_sd <- convert_to_logsd(2, 1)
::::::::::::::::::::::::: checklist

incubation_period_fixed <- dist_spec(
mean = 4, sd = 2,
max = 20, distribution = "gamma"
**Note:** In the code below `_fixed` distributions are used instead of `_variable` (delay distributions with uncertainty). This is to reduce computation time. It is generally recommended to use variable distributions that account for additional uncertainty.

```{r, echo = TRUE}
# fixed alternatives
generation_time_fixed <- dist_spec(
mean = 3.6, sd = 3.1,
max = 20, distribution = "lognormal"
)

log_mean <- convert_to_logmean(2, 1)
log_sd <- convert_to_logsd(2, 1)
reporting_delay_fixed <- dist_spec(
mean = log_mean, sd = log_sd,
max = 10, distribution = "lognormal"
)

generation_time_fixed <- dist_spec(
mean = 3.6, sd = 3.1,
max = 20, distribution = "lognormal"
)
```

*Note: in the code below fixed distributions are used instead of variable. This is to speed up computation time. It is generally recommended to use variable distributions that account for additional uncertainty.*

::::::::::::::::::::::::::::::::: spoiler

### On reducing computation time

Using an appropriate number of samples and chains is crucial for ensuring convergence and obtaining reliable estimates in Bayesian computations using Stan. Inadequate sampling or insufficient chains may lead to issues such as divergent transitions, impacting the accuracy and stability of the inference process.
:::::::::::::::::::::::::

For the purpose of this tutorial, we can add more configuration details to get a useful output in less time. You can specify a fixed number of `samples` and `chains` to the `stan` argument using the `stan_opts()` function:
Now you are ready to run `EpiNow2::epinow()` to estimate the time-varying reproduction number:

The code in the proposed code chunk can take around 10 minutes. We expect this alternative code chunk below using `stan_opts()` to take approximately 3 minutes:
```{r, message = FALSE, eval = TRUE}
reported_cases <- cases[1:90, ]

```{r,eval=FALSE}
estimates <- epinow(
# same code as previous chunk
# cases
reported_cases = reported_cases,
# delays
generation_time = generation_time_opts(generation_time_fixed),
delays = delay_opts(
incubation_period_fixed + reporting_delay_fixed
),
rt = rt_opts(
prior = list(mean = rt_log_mean, sd = rt_log_sd)
),
# [new] set a fixed number of samples and chains
delays = delay_opts(incubation_period_fixed + reporting_delay_fixed),
# prior
rt = rt_opts(prior = list(mean = rt_log_mean, sd = rt_log_sd)),
# computation (optional)
stan = stan_opts(samples = 1000, chains = 3)
)
```

:::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::: callout

```{r, message = FALSE, eval = TRUE}
reported_cases <- cases[1:90, ]
### Do not wait for this to continue

estimates <- epinow(
reported_cases = reported_cases,
generation_time = generation_time_opts(generation_time_fixed),
delays = delay_opts(
incubation_period_fixed + reporting_delay_fixed
),
rt = rt_opts(
prior = list(mean = rt_log_mean, sd = rt_log_sd)
)
)
```
Using `stan = stan_opts()` is optional. To reduce computation time in this tutorial, we passed a fixed number of `samples = 1000` and `chains = 3` to the `stan` argument using the `stan_opts()` function. We expect this to take approximately 3 minutes.

**Remember:** Using an appropriate number of *samples* and *chains* is crucial for ensuring convergence and obtaining reliable estimates in Bayesian computations using Stan. Inadequate sampling or insufficient chains may lead to issues such as divergent transitions, impacting the accuracy and stability of the inference process.

:::::::::::::::::::::::::::::::::
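
Computation time also depends on how many cores are available to run chains in parallel. The setup chunks in this pull request fix `mc.cores = 4`; the sketch below shows one way to pick a value that fits the machine instead. The policy of capping at 4 cores and leaving one free is an illustrative assumption, not a recommendation from the episode.

```r
# detectCores() can return NA on some platforms, so handle that case
available_cores <- parallel::detectCores()

# Hypothetical policy: use at most 4 cores and keep one core free
cores_to_use <- if (is.na(available_cores)) {
  1L
} else {
  max(1L, min(4L, available_cores - 1L))
}

# Set the option for the rest of the R session
options(mc.cores = cores_to_use)
```

Unlike `withr::local_options()`, which restores the previous value when the calling scope ends, `options()` keeps the setting until the session closes or the option is changed again.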

### Results

@@ -466,14 +450,13 @@ To find regional estimates, we use the same inputs as `epinow()` to the function

```{r, message = FALSE, eval = TRUE}
estimates_regional <- regional_epinow(
# cases
reported_cases = regional_cases,
# delays
generation_time = generation_time_opts(generation_time_fixed),
delays = delay_opts(
incubation_period_fixed + reporting_delay_fixed
),
rt = rt_opts(
prior = list(mean = rt_log_mean, sd = rt_log_sd)
)
delays = delay_opts(incubation_period_fixed + reporting_delay_fixed),
# prior
rt = rt_opts(prior = list(mean = rt_log_mean, sd = rt_log_sd))
)

estimates_regional$summary$summarised_results$table
86 changes: 85 additions & 1 deletion learners/setup.md
@@ -112,11 +112,95 @@ new_packages <- c(
"tidyverse"
)

pak::pak(new_packages)
pak::pkg_install(new_packages)
```

These installation steps may ask `? Do you want to continue (Y/n)`. If so, type `Y` and press <kbd>Enter</kbd>.

::::::::::::::::::::::::::::: spoiler

### Do you get an error with EpiNow2?

Windows users will need a working installation of `Rtools` in order to build the package from source. `Rtools` is not an R package, but separate software that you need to download and install. We suggest following these steps:

<!-- reference [these steps](http://jtleek.com/modules/01_DataScientistToolbox/02_10_rtools/#1) -->

1. **Verify `Rtools` installation**. You can do so by using Windows search across your system. Alternatively, you can use `{devtools}` by running:

```r
if(!require("devtools")) install.packages("devtools")
devtools::find_rtools()
```

If the result is `FALSE`, continue to step 2.

2. **Install `Rtools`**. Download the `Rtools` installer from <https://cran.r-project.org/bin/windows/Rtools/>. Install with default selections.

3. **Verify `Rtools` installation**. Again, we can use `{devtools}`:

```r
if(!require("devtools")) install.packages("devtools")
devtools::find_rtools()
```

:::::::::::::::::::::::::::::
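
If `{devtools}` itself cannot be installed, a rough base R check is also possible. The sketch below assumes that a working `Rtools` installation makes the `make` build tool visible on the `PATH` of the R session. This is a hedged heuristic and not an official `Rtools` test, and the object names are illustrative.

```r
# Look for the `make` executable on the PATH seen by this R session;
# Sys.which() returns an empty string when the tool is not found
make_path <- unname(Sys.which("make"))

# TRUE suggests that build tools are visible to R; FALSE suggests they are not
has_make <- nzchar(make_path)
```

A `FALSE` result here is a prompt to follow the `Rtools` installation steps in the spoiler above rather than a definitive diagnosis.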


::::::::::::::::::::::::::::: spoiler

### Do you get an error with epiverse-trace packages?

If you get an error message when installing `{epiparameter}`, try this alternative code:

```r
# for epiparameter
install.packages("epiparameter", repos = c("https://epiverse-trace.r-universe.dev"))
```

:::::::::::::::::::::::::::::

::::::::::::::::::::::::::: spoiler

### What to do if an error persists?

If the error message includes a string like `Personal access token (PAT)`, you may need to [set up your GitHub token](https://epiverse-trace.github.io/git-rstudio-basics/02-setup.html#set-up-your-github-token).

First, install these R packages:

```r
if(!require("pak")) install.packages("pak")

new <- c("gh",
"gitcreds",
"usethis")

pak::pak(new)
```

Then, follow these three steps to [set up your GitHub token (read this step-by-step guide)](https://epiverse-trace.github.io/git-rstudio-basics/02-setup.html#set-up-your-github-token):

```r
# Generate a token
usethis::create_github_token()

# Configure your token
gitcreds::gitcreds_set()

# Get a situational report
usethis::git_sitrep()
```

Try installing `{epiparameter}` again:

```r
if(!require("remotes")) install.packages("remotes")
remotes::install_github("epiverse-trace/epiparameter")
```

If the error persists, [contact us](#your-questions)!

:::::::::::::::::::::::::::

You should update **all of the packages** required for the tutorial, even if you installed them relatively recently. New versions bring improvements and important bug fixes.

When the installation has finished, you can try to load the packages by pasting the following code into the console: