README.Rmd

---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# idbrms: Infectious Disease Modelling using brms

[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)
![R-CMD-check](https://github.com/epiforecasts/idbrms/workflows/R-CMD-check/badge.svg)
[![Codecov test coverage](https://codecov.io/gh/epiforecasts/brms.ide/branch/master/graph/badge.svg)](https://codecov.io/gh/epiforecasts/brms.ide?branch=master)

Provides population-level infectious disease models as an extension of `brms`.

## Installation

You can install the unstable development version from [GitHub](https://github.com/) with:

```{r, eval = FALSE}
# install.packages("devtools")
devtools::install_github("epiforecasts/idbrms")
```

## Example

* Load `idbrms`, `brms`, `data.table` (used for manipulating data), and `ggplot2` for visualisation.

```{r packages, message = FALSE}
library(idbrms)
library(brms)
library(data.table)
library(ggplot2)
```

* Define simulated data containing cases of some infectious disease over time in a single region (with an initial growth of 20% a day followed by a decline of 10% a day). We then assume that deaths are a convolution of cases using a log normal distribution with a log mean of 1.6 (standard deviation 0.2) and a log standard deviation of 0.8 (standard deviation 0.2), and are then scaled by a fraction (here 0.4 (standard deviation 0.025)).

```{r toy-data}
# apply a convolution of a log normal to a vector of observations
weight_cmf <- function(x, ...) {
  set.seed(x[1])
   meanlog <- rnorm(1, 1.6, 0.1)
   sdlog <- rnorm(1, 0.6, 0.025)
   cmf <- (cumsum(dlnorm(1:length(x), meanlog, sdlog)) -
     cumsum(dlnorm(0:(length(x) - 1), meanlog, sdlog)))
   conv <- sum(x * rev(cmf), na.rm = TRUE)
   conv <- rpois(1, round(conv, 0))
  return(conv)
}

obs <- data.table(
  region = "Glastonbury", 
  cases = as.integer(c(10 * exp(0.15 * 1:50), 
                       10 * exp(0.15 * 50) * exp(-0.1 * 1:50))),
  date = seq(as.Date("2020-10-01"), by = "days", length.out = 100))
# roll over observed cases to produce a convolution
obs <- obs[, deaths := frollapply(cases, 15, weight_cmf, align = "right")]
obs <- obs[!is.na(deaths)]
obs <- obs[, deaths := round(deaths * rnorm(.N, 0.25, 0.025), 0)]
obs <- obs[deaths < 0, deaths := 0]
```

* Visual simulated data (columns are cases and points are deaths).

```{r}
ggplot(obs) +
  aes(x = date, y = cases) +
  geom_col(fill = "lightgrey") +
  geom_point(aes(y = deaths)) +
  theme_minimal()
```

* Prepare the data to be fit using the convolution model. 

```{r prep-data}
prep_obs <- prepare(obs, model = "convolution", location = "region",
                    primary = "cases", secondary = "deaths", max_convolution = 15)
head(prep_obs, 10)
```

* Fit the model assuming a Poisson observation model (*It is important to use an identity link here as `idbrm` provides its own link by default*).

```{r fit-model-eval, include = FALSE}
fit <- idbrm(data = prep_obs, family = poisson(link = "identity"))
```

```{r fit-model, eval = FALSE}
fit <- idbrm(data = prep_obs, family = poisson(link = "identity"))
```

* Summarise fit.

```{r summarise-fit}
summary(fit)
```

* Explore estimated effect sizes (we approximately should recover those used for simulation).

```{r}
exp(posterior_summary(fit, "scale_Intercept")) / 
  (1 + exp(posterior_summary(fit, "scale_Intercept")))
posterior_summary(fit, "cmean_Intercept")
exp(posterior_summary(fit, "lcsd_Intercept"))
```

* Expose model stan functions.

```{r expose-fns-eval, eval = FALSE}
expose_functions(fit)
```

```{r expose-fns, include = FALSE}
expose_functions(fit)
```

* Test central estimates returns something sensible compared to observed data.

```{r}
n_obs <- length(prep_obs$primary)
fixed <- summary(fit)$fixed
pt_ests <- fixed[, 1]
names(pt_ests) <- rownames(fixed)
p_primary <- with(prep_obs, idbrms_convolve(primary, rep(pt_ests["scale_Intercept"], n_obs), 
                                            rep(pt_ests["cmean_Intercept"], n_obs),
                                            rep(pt_ests["lcsd_Intercept"], n_obs), 
                                            cmax, index, cstart, init_obs))
ggplot(prep_obs) + 
  aes(x = date, y = secondary) +
  geom_col(fill = "lightgrey") +
  geom_point(aes(y = p_primary)) +
  theme_minimal()
```

* Posterior predictive check.

```{r pred-check}
pp_check(fit)
```

* Plot conditional effects (fails with due to mismatched vector sizes in the stan dot product + there are no variables in this model so should always fail).

```{r effect-plot, eval = FALSE}
plot(conditional_effects(fit), ask = FALSE)
```