add edit suggestion to simple-analysis
avallecam authored Jun 17, 2024
1 parent 6b72101 commit 0fab05e
Showing 1 changed file with 64 additions and 13 deletions.
77 changes: 64 additions & 13 deletions episodes/simple-analysis.Rmd
@@ -23,6 +23,19 @@ exercises: 10

Understanding the trend in case data is crucial for various purposes, such as forecasting future case counts, implementing public health interventions, and assessing the effectiveness of control measures. By analyzing the trend, policymakers and public health experts can make informed decisions to mitigate the spread of diseases and protect public health. This episode focuses on how to perform a simple early analysis on incidence data. It uses the same dataset of **Covid-19 case data from England** that was used in the [Aggregate and visualize](../episodes/describe-cases.Rmd) episode.

::::::::::::::::::: checklist

### The double-colon

The double-colon `::` in R lets you call a specific function from a package without loading the entire package into the current environment.

For example, `dplyr::filter(data, condition)` uses `filter()` from the `{dplyr}` package.

This helps us remember which package each function comes from and avoids namespace conflicts.

:::::::::::::::::::
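
As a small illustration of the namespace point, the sketch below uses only packages that ship with R. The local `filter()` function and the `values` vector are hypothetical stand-ins for a name clash between two packages; they are not part of the episode.

```r
# A user-defined function that happens to share its name with stats::filter().
# (Hypothetical stand-in for a conflict between two loaded packages.)
filter <- function(x, threshold) x[x > threshold]

values <- c(2, 4, 6, 8, 10)

# The unqualified call finds the local definition first.
local_result <- filter(values, 5)

# The qualified call always reaches the {stats} version,
# here a two-point trailing moving average.
stats_result <- as.numeric(stats::filter(values, rep(1 / 2, 2), sides = 1))

print(local_result)
print(stats_result)
```

Writing the package name explicitly makes it clear which of the two functions is meant, regardless of what else is defined or loaded in the session.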


## Simple model

Aggregated case data over a specific time unit, or incidence data, typically represent the number of cases occurring within that time frame. These data can often be assumed to follow either a `Poisson distribution` or a `negative binomial (NB) distribution`, depending on the specific characteristics of the data and the underlying processes generating them. When analyzing such data, one common approach is to examine the trend over time by computing the rate of change, which can indicate whether there is exponential growth or decay in the number of cases. Exponential growth implies that the number of cases is increasing at an accelerating rate over time, while exponential decay suggests that the number of cases is decreasing at a decelerating rate.
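
The idea of estimating a rate of change from count data can be sketched with base R alone. The example below is a minimal illustration on simulated counts, not the episode's method: the values of `true_rate`, the baseline of 10 cases, and the 30-day window are assumptions chosen for the demonstration. It fits a log-linear Poisson model with `stats::glm()` and reads the daily growth rate from the slope.

```r
# Simulated daily counts with a known growth rate (assumed values, not real data).
set.seed(123)
day <- 0:29
true_rate <- 0.1
cases <- stats::rpois(length(day), lambda = 10 * exp(true_rate * day))

# Log-linear Poisson model: log(E[cases]) = intercept + r * day.
fit <- stats::glm(cases ~ day, family = stats::poisson(link = "log"))

# The slope is the estimated daily growth rate r
# (positive for growth, negative for decay).
growth_rate <- unname(stats::coef(fit)["day"])

# Doubling time follows from exp(r * t) = 2.
doubling_time <- log(2) / growth_rate

# Rough overdispersion check: values well above 1 suggest that a
# negative binomial may describe the counts better than a Poisson.
dispersion <- sum(stats::residuals(fit, type = "pearson")^2) / fit$df.residual

print(c(growth_rate = growth_rate, doubling_time = doubling_time,
        dispersion = dispersion))
```

A dispersion statistic close to 1 is what a Poisson process would produce; real surveillance data are often more variable, which is one reason the negative binomial is a common alternative.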
@@ -31,24 +44,39 @@ The `i2extras` package provides methods for modelling the trend in case data, ca


```{r, warning=FALSE, message=FALSE}
-# loads the i2extras package, which provides methods for modeling
+# load packages which provide methods for modeling
 library("i2extras")
-# This line loads the i2extras package, which provides methods for modeling
 library("incidence2")
-# subset the covid19_eng_case_data to include only the first 3 months of data
+# read data from {outbreaks} package
+covid19_eng_case_data <- outbreaks::covid19_england_nhscalls_2020
+# subset the covid19_eng_case_data to include only the first 3 months of data
 df <- base::subset(
   covid19_eng_case_data,
   covid19_eng_case_data$date <= min(covid19_eng_case_data$date) + 90
 )
 # uses the incidence function from the incidence2 package to compute the
 # incidence data
-df_incid <- incidence2::incidence(df, date_index = "date", groups = "sex")
+df_incid <- incidence2::incidence(
+  df,
+  date_index = "date",
+  groups = "sex"
+)
 # fit a curve to the incidence data. The model chosen is the negative binomial
 # distribution with a significance level (alpha) of 0.05.
-fitted_curve_nb <- i2extras::fit_curve(df_incid, model = "negbin", alpha = 0.05)
-base::plot(fitted_curve_nb, angle = 45) + ggplot2::labs(x = "Date", y = "Cases")
+fitted_curve_nb <-
+  i2extras::fit_curve(
+    df_incid,
+    model = "negbin",
+    alpha = 0.05
+  )
+# plot fitted curve
+base::plot(fitted_curve_nb, angle = 45) +
+  ggplot2::labs(x = "Date", y = "Cases")
```


@@ -61,8 +89,13 @@ Repeat the above analysis using Poisson distribution?
:::::::::::::::::::::::: solution

```{r, warning=FALSE, message=FALSE}
-fitted_curve_poisson <- i2extras::fit_curve(df_incid, model = "poisson",
-                                            alpha = 0.05)
+fitted_curve_poisson <-
+  i2extras::fit_curve(
+    x = df_incid,
+    model = "poisson",
+    alpha = 0.05
+  )
 base::plot(fitted_curve_poisson, angle = 45) +
   ggplot2::labs(x = "Date", y = "Cases")
```
@@ -99,6 +132,7 @@ The **Peak time ** is the time at which the highest number of cases is observed
```{r, message=FALSE, warning=FALSE}
peaks_nb <- i2extras::estimate_peak(df_incid, progress = FALSE) |>
subset(select = -c(count_variable, bootstrap_peaks))
base::print(peaks_nb)
```
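
The `bootstrap_peaks` column dropped above hints that the peak estimate rests on resampling. The following base R sketch shows one way such a bootstrap could work; it is an assumption-laden illustration, not the `{i2extras}` implementation. The toy `counts`, the start date, and `n_boot` are all invented for the example.

```r
# Toy daily counts (assumed values, not the England data).
counts <- c(2, 5, 9, 14, 20, 17, 11, 6, 3, 1)
dates <- as.Date("2020-03-01") + seq_along(counts) - 1

# Observed peak: the date with the highest count.
observed_peak <- dates[which.max(counts)]

# Bootstrap: resampling individual cases with replacement is equivalent to
# drawing new daily counts from a multinomial with the observed proportions.
set.seed(42)
n_boot <- 500
resampled <- stats::rmultinom(n_boot, size = sum(counts), prob = counts)

# Position of the peak day in each resampled epidemic curve.
boot_peak_idx <- apply(resampled, 2, which.max)

# Percentile interval on the day index (type = 1 returns observed values),
# mapped back to calendar dates.
interval_idx <- stats::quantile(
  boot_peak_idx,
  probs = c(0.025, 0.975),
  type = 1,
  names = FALSE
)
peak_interval <- dates[interval_idx]

print(observed_peak)
print(peak_interval)
```

The width of the interval reflects how sharply the curve peaks: a flat top spreads the resampled peaks over several days, while a sharp one concentrates them.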

@@ -109,10 +143,17 @@ A moving or rolling average calculates the average number of cases within a spec

```{r, warning=FALSE, message=FALSE}
 library("ggplot2")
 moving_Avg_week <- i2extras::add_rolling_average(df_incid, n = 7L)
 base::plot(moving_Avg_week, border_colour = "white", angle = 45) +
-  ggplot2::geom_line(ggplot2::aes(x = date_index, y = rolling_average,
-                                  color = "red")) +
+  ggplot2::geom_line(
+    ggplot2::aes(
+      x = date_index,
+      y = rolling_average,
+      color = "red"
+    )
+  ) +
   ggplot2::labs(x = "Date", y = "Cases")
```
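
To see what a rolling average does to the numbers themselves, here is a minimal base R sketch. It assumes a trailing window (the current day plus the days before it), which may differ from the alignment `add_rolling_average()` uses, and the `daily_cases` values are made up for illustration.

```r
# Toy daily counts (assumed values for illustration).
daily_cases <- c(3, 5, 4, 8, 10, 9, 12, 15, 14, 18)

# Trailing window: each value averages the current day and the n - 1 before it.
n <- 3L
rolling_average <- as.numeric(
  stats::filter(daily_cases, rep(1 / n, n), sides = 1)
)

# The first n - 1 entries are NA because the window is incomplete there.
print(data.frame(day = seq_along(daily_cases), daily_cases, rolling_average))
```

A larger `n` smooths more aggressively but leaves more missing values at the start of the series and reacts more slowly to genuine changes in trend.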

@@ -126,9 +167,19 @@ Compute and visualize the monthly moving average of cases on `df_incid`?

```{r, warning=FALSE, message=FALSE}
 moving_Avg_mont <- i2extras::add_rolling_average(df_incid, n = 30L)
-base::plot(moving_Avg_mont, border_colour = "white", angle = 45) +
-  ggplot2::geom_line(ggplot2::aes(x = date_index, y = rolling_average,
-                                  color = "red")) +
+base::plot(
+  moving_Avg_mont,
+  border_colour = "white",
+  angle = 45
+) +
+  ggplot2::geom_line(
+    ggplot2::aes(
+      x = date_index,
+      y = rolling_average,
+      color = "red"
+    )
+  ) +
   ggplot2::labs(x = "Date", y = "Cases")
```
