Early task describe-partialy work #10

Merged
merged 13 commits into from
Mar 25, 2024
121 changes: 96 additions & 25 deletions episodes/describe-cases.Rmd
@@ -4,18 +4,24 @@
exercises: 2
---

```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = FALSE)
library("outbreaks")
library("incidence2")
library("ggplot2")
library("i2extras")
```

:::::::::::::::::::::::::::::::::::::: questions
- How to convert case data to incidence?
- How to visualize incidence cases?
- What is the person, place, time distribution of cases?
- What is the growth rate of the epidemic?
<<<<<<< HEAD
- How do you visualize incidence cases?
- What is the growth rate of the epidemic?
- What is the person, place, time distribution of cases?
- What are the epidemiological characteristics of the infection?

::::::::::::::::::::::::::::::::::::::::::::::::


::::::::::::::::::::::::::::::::::::: objectives
- Convert case data to incidence
@@ -23,53 +29,87 @@
- Create incidence curves
- Estimate the growth rate from incidence curves
- Create quick descriptive and comparison tables

::::::::::::::::::::::::::::::::::::::::::::::::


## Introduction
A comprehensive description of data is pivotal for conducting insightful explanatory and exploratory analyses. This episode focuses describing and visualizing epidemic data, with a particular focus on a **Covid-19 case data from England**, which comes with the [outbreaks](http://www.reconverse.org/outbreaks/) package. The initial step involves reading this dataset, and we recommend utilizing the [readr](../links.md#readr) package for this purpose (or employing alternative methods as outlined in the [Read case data](../episodes/read-cases.Rmd)).

A comprehensive description of data is pivotal for conducting insightful explanatory and exploratory analyses. This episode focuses on describing and visualizing epidemic data. The examples are built around the **Covid-19 case data from England** dataset that is part of the [{outbreaks}](http://www.reconverse.org/outbreaks/) package. The initial step consists in reading this dataset, and we recommend utilizing the [{readr}](../links.md#readr) package for this purpose (or employing alternative methods as outlined in the [Read case data](../episodes/read-cases.Rmd)).

```{r, warning=FALSE, message=FALSE}
requireNamespace("outbreaks")
covid19_eng_case_data <- outbreaks::covid19_england_nhscalls_2020
utils::head(covid19_eng_case_data, 5)
head(covid19_eng_case_data)
```

## Incidence cases

Downstream analysis involves working with aggregated data rather than individual cases. The [incidence2]((https://www.reconverse.org/incidence2/articles/incidence2.html){.external target="_blank"}) package offers essential functions for grouping case data, usually centered around dated occurrences and/or other factors. The code chunk provided below demonstrates the creation of an `incidence2` object from the `covid19_eng_case_data` based on the date of sample.
Downstream analysis involves working with aggregated data rather than individual cases. The [{incidence2}](https://www.reconverse.org/incidence2/articles/incidence2.html){.external target="_blank"} package offers essential functionality for grouping case data, usually centered around dated occurrences and/or other factors. The code chunk below demonstrates the creation of an `incidence2` object from `covid19_eng_case_data` based on the date of sample.

```{r, message=FALSE, warning=FALSE}
requireNamespace("incidence2")
requireNamespace("ggplot2")
covid19_eng_incidence_data <- incidence2::incidence(covid19_eng_case_data,
date_index = "date")
utils::head(covid19_eng_incidence_data, 5)
covid19_eng_incidence_data <- incidence2::incidence(
  covid19_eng_case_data,
  date_index = "date"
)
head(covid19_eng_incidence_data)
```
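To make the idea of aggregation concrete, the following is a minimal base R sketch of what converting a line list into daily counts means. The `toy_linelist` data frame and its values are hypothetical, and this is only a conceptual illustration; it is not a description of how {incidence2} is implemented.

```r
# Toy line list: one row per case (hypothetical data, for illustration only)
toy_linelist <- data.frame(
  date = as.Date(c(
    "2020-03-01", "2020-03-01", "2020-03-02",
    "2020-03-04", "2020-03-04", "2020-03-04"
  ))
)

# Count the number of rows (cases) recorded on each date
daily_counts <- as.data.frame(
  table(date = toy_linelist$date),
  responseName = "count"
)

# table() stores the dates as a factor, so convert them back to Date
daily_counts$date <- as.Date(as.character(daily_counts$date))

# Dates without any case (here 2020-03-03) do not appear in the result
daily_counts
```

A dedicated incidence class adds conveniences on top of such a count table, for example coarser intervals and grouping variables, as used in the rest of this episode.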

The `incidence2` object can be visualized using the plot function of base R package.
The `incidence2` object can be visualized using the `plot()` function, a base R generic for which {incidence2} supplies a method.

```{r, message=FALSE, warning=FALSE}
base::plot(covid19_eng_incidence_data) + labs(x = "Date", y = " Cases") +
theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust = 1))
plot(covid19_eng_incidence_data)
```

Moreover, `Incidence2` can also aggregate case data based on a dated event and other factors such as what is the person and place. For example the code chunk groups weekly counts of Covid-19 cases in England based on `sex` type.
Moreover, {incidence2} can aggregate case data by a dated event together with other factors, such as an individual's sex or the sampling location. In the example below, we calculate weekly counts of Covid-19 cases in England, grouped by `sex`.

```{r, message=FALSE, warning=FALSE}
weekly_covid19_eng_incidence <- incidence2::incidence(covid19_eng_case_data,
date_index = "date",
interval = "week",
groups = "sex")
base::plot(weekly_covid19_eng_incidence) + labs(x = "Date", y = "Cases") +
theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
weekly_covid19_eng_incidence <- incidence2::incidence(
  covid19_eng_case_data,
  date_index = "date",
  interval = "week",
  groups = "sex"
)
plot(weekly_covid19_eng_incidence, angle = 45)
```


::::::::::::::::::::::::::::::::::::: challenge

- Using the above `covid19_eng_case_data` dataset, produce monthly epi-curves for Covid-19 cases in England, grouped by region.

::::::::::::::::::::::::::::::::::::::::::::::::



#### Analyzing the trend in case data

Aggregated case data over a specific time unit, or incidence data, typically represent the number of cases occurring within that time frame. These data can often be assumed to follow either a **Poisson** or a **negative binomial** distribution, depending on the specific characteristics of the data and the underlying processes generating them.

When analyzing such data, one common approach is to examine the trend over time by computing the **rate of change**, which can indicate whether there is exponential growth or decay in the number of cases. Exponential growth implies that the number of cases is increasing at an accelerating rate over time, while exponential decay suggests that the number of cases is decreasing at a decelerating rate.

Understanding the trend in case data is crucial for various purposes, such as forecasting future case counts, implementing public health interventions, and assessing the effectiveness of control measures. By analyzing the trend, policymakers and public health experts can make informed decisions to mitigate the spread of diseases and protect public health.
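As a rough illustration of the rate of change described above, the sketch below simulates exponentially growing counts and recovers the growth rate with a log-linear Poisson regression in base R. All values (`true_r`, the starting level of 5 cases, the 30-day window) are assumptions chosen for the example, and the approach is a generic one, not necessarily what any particular package does internally.

```r
# Simulate 30 days of case counts growing exponentially (hypothetical values)
set.seed(1)
true_r <- 0.1                              # assumed daily growth rate
day <- 0:29
expected_cases <- 5 * exp(true_r * day)    # expected counts on each day
cases <- rpois(length(day), lambda = expected_cases)

# Log-linear Poisson regression: log(E[cases]) = intercept + r * day
fit <- glm(cases ~ day, family = poisson(link = "log"))

# The slope on `day` estimates the growth rate r
r_hat <- unname(coef(fit)["day"])

# Wald 95% confidence interval for r
r_ci <- confint.default(fit)["day", ]

r_hat
r_ci
```

A positive slope indicates exponential growth and a negative slope indicates exponential decay. When the counts are more variable than a Poisson model allows, a negative binomial model is a common alternative.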

The {i2extras} package provides methods for modelling the trend in case data, calculating moving averages, and estimating the exponential growth or decay rate. The code chunk below fits the Covid-19 trend in England over the first 3 months of data using a negative binomial model.


```{r, warning=FALSE, message=FALSE}
# subset the covid19_eng_case_data to include only the first 3 months of data.
df <- subset(
  covid19_eng_case_data,
  covid19_eng_case_data$date <= min(covid19_eng_case_data$date) + 90
)

# compute the incidence data, grouping it by sex.
df_incid <- incidence2::incidence(df, date_index = "date", groups = "sex")

# use the fit_curve function from i2extras to fit a curve to the incidence data
fitted_curve <- i2extras::fit_curve(df_incid, model = "negbin", alpha = 0.05)

# plot the fitted curve
plot(fitted_curve, angle = 45)
```
#### Analyzing the trend in case data

Aggregated case data over a specific time unit, or incidence data, typically represent the number of cases occurring within that time frame. These data can often be assumed to follow either a Poisson distribution or a negative binomial distribution, depending on the specific characteristics of the data and the underlying processes generating them.
@@ -120,7 +160,7 @@
- Using the `covid19_eng_case_data` dataset, model and visualize the trend of Covid-19 cases in England over the first six months using a Poisson distribution.
- Determine the exponential growth or decay rate?

::::::::::::::::::::::::::::::::::::::::::::::::


A moving average, which shows the trend of cases in specified time period, also can be calculate using the `add_rolling_average` function in `i2extras` package, as illustrated in the below code chunk.

@@ -135,7 +175,7 @@

- Calculate and visualize monthly moving average of Covid-19 cases in England?

::::::::::::::::::::::::::::::::::::::::::::::::


Aggregated case data over a specific time unit, or incidence data, typically represent the number of cases occurring within that time frame. These data can often be assumed to follow either a Poisson distribution or a negative binomial distribution, depending on the specific characteristics of the data and the underlying processes generating them.

@@ -168,6 +208,22 @@

### Exponential growth or decay rate

The rate of exponential growth or decay can be extracted from the fitted curve via the `growth_rate()` function.

```{r, message=FALSE, warning=FALSE}
rates <- i2extras::growth_rate(fitted_curve)
rates <- base::as.data.frame(rates) |>
subset(select = c(sex, r, r_lower, r_upper))
```
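A growth rate is often easier to interpret as a doubling time (or a halving time when cases decline). The helper below is a hypothetical illustration in base R, not a function from {i2extras}; it assumes `r` is expressed per day and uses the standard relation between an exponential rate and the time needed to change by a factor of two.

```r
# Time (in days) for cases to double (r > 0) or halve (r < 0),
# assuming r is a daily exponential growth rate
time_to_double_or_halve <- function(r) {
  stopifnot(is.numeric(r), all(r != 0))
  log(2) / abs(r)
}

time_to_double_or_halve(0.1)    # growth: days for cases to double
time_to_double_or_halve(-0.05)  # decay: days for cases to halve
```

The same conversion can be applied to the lower and upper bounds of a growth rate interval to obtain a range of doubling or halving times.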

### Peak time

The **peak time** is the date on which the highest number of cases is observed. It can be estimated using the `i2extras::estimate_peak()` function as shown below:

```{r, message=FALSE, warning=FALSE}
peaks <- i2extras::estimate_peak(df_incid, progress = FALSE) |>
subset(select = -c(count_variable, bootstrap_peaks))
```
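For intuition, the observed peak can be read directly from a table of counts. The base R sketch below uses a hypothetical `toy_incidence` data frame and only finds the date with the largest count; it makes no attempt to quantify the uncertainty around that date.

```r
# Toy daily incidence (hypothetical values, for illustration only)
toy_incidence <- data.frame(
  date = seq(as.Date("2020-03-01"), by = "day", length.out = 7),
  count = c(3, 8, 15, 22, 19, 11, 6)
)

# Row with the highest count; which.max() returns the first one in case of ties
observed_peak <- toy_incidence[which.max(toy_incidence$count), ]

observed_peak$date
observed_peak$count
```

Because a single noisy day can shift the observed peak, an estimate that accounts for sampling variability is usually preferable to this raw maximum.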
The rate of exponential growth or decay can be extracted from the fitted curve via `growth_rate` function.

```{r, message=FALSE, warning=FALSE}
@@ -186,7 +242,7 @@

### Moving average

A moving average, which shows the trend of cases in specified time period, also can be calculate using the `add_rolling_average` function in `i2extras` package, as illustrated in the below code chunk.
A moving average, which shows the trend of cases over a specified time period, can be calculated using the `add_rolling_average()` function in the {i2extras} package, as illustrated below:

```{r, warning=FALSE, message=FALSE}
moving_Avg <- i2extras::add_rolling_average(df_incid, n = 7L)
@@ -198,6 +254,19 @@
::::::::::::::::::::::::::::::::::::: challenge

- What is the trend of cases in the above example, is it increasing or decreasing?
- Use the `covid19_eng_case_data` dataset for the first six months to perform the following:
  - Model and visualize the epi curve using a Poisson distribution
  - Determine the exponential growth or decay rate
  - Estimate the peak time
  - Calculate and visualize the monthly moving average

::::::::::::::::::::::::::::::::::::::::::::::::
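The moving average discussed in this section can also be sketched in base R. The example below assumes a trailing 7-day window (each value averages the current day and the six days before it) and uses hypothetical counts; the window alignment used by a given package may differ.

```r
# Hypothetical daily case counts
daily_cases <- c(2, 4, 3, 7, 9, 8, 12, 15, 14, 20)
window <- 7L

# Trailing moving average: equal weights over the current and 6 previous days.
# The first (window - 1) values are NA because the window is incomplete.
rolling_mean <- as.numeric(
  stats::filter(daily_cases, rep(1 / window, window), sides = 1)
)

rolling_mean
```

A longer window gives a smoother curve at the cost of reacting more slowly to recent changes in the number of cases.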



## Exponential growth

Within the renewal equation, the generation interval mechanistically links the reproductive number R to observables such as the epidemic growth rate r or the number of new infections per day [(Wallinga et al. 2006)](https://royalsocietypublishing.org/doi/10.1098/rspb.2006.3754).
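As one concrete instance of this link, if the generation interval is assumed to follow a gamma distribution, the relation R = 1 / M(-r), where M is the moment generating function of the generation interval, has a closed form. The function below is a hypothetical helper written for illustration; the gamma assumption and the example values for the mean and standard deviation are not taken from this episode.

```r
# Reproduction number implied by a growth rate r, assuming a
# gamma-distributed generation interval with given mean and sd (in days).
# Valid for r > -rate, where rate = gi_mean / gi_sd^2.
r_to_R_gamma <- function(r, gi_mean, gi_sd) {
  shape <- (gi_mean / gi_sd)^2
  rate <- gi_mean / gi_sd^2
  stopifnot(all(r > -rate))
  (1 + r / rate)^shape
}

# Example with assumed values: r = 0.1 per day, mean = 5 days, sd = 5 days.
# With sd equal to the mean, the gamma reduces to an exponential
# distribution and the formula becomes R = 1 + r * gi_mean.
r_to_R_gamma(r = 0.1, gi_mean = 5, gi_sd = 5)
```

A growth rate of zero always maps to R = 1, and for a fixed positive r a less variable generation interval with the same mean yields a larger R.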

- Using the `covid19_eng_case_data` dataset for the first six months of cases, perform the following:
- model and visualize the epi curve via Poisson distribution?
- Determine the exponential growth or decay rate?
@@ -207,11 +276,13 @@
- Use `{i2extras}` to fit epi curve, calculate exponential growth or decline of cases, estimate pick size and peak time, and computing moving average of cases in specified time window.
- Use `{compareGroups}`

::::::::::::::::::::::::::::::::::::::::::::::::


::::::::::::::::::::::::::::::::::::: keypoints

- Use `{incidence2}` to aggregate case data based on a dated event.
- Use `{i2extras}` to fit epi curves, calculate the exponential growth or decline of cases, find the peak time, and compute moving averages of cases over a specified time window.
::::::::::::::::::::::::::::::::::::::::::::::::
- Use `{compareGroups}`

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::::::::::

6 changes: 1 addition & 5 deletions links.md
@@ -7,11 +7,7 @@ any links that you are not going to use.
[r-markdown]: https://rmarkdown.rstudio.com/
[rstudio]: https://www.rstudio.com/
[carpentries-workbench]: https://carpentries.github.io/sandpaper-docs/
<<<<<<< HEAD
[readr]: https://readr.tidyverse.org/
[DHIS2]: https://dhis2.org/
[REDCap]: https://www.project-redcap.org/
[readepi]: https://epiverse-trace.github.io/readepi/

=======
[readr]: https://readr.tidyverse.org/
>>>>>>> dbdaefa (added some links)