This repository contains code to run the inference framework that estimates time-varying ascertainment rates of COVID-19 cases (Russell et al.). To do so, we use a Gaussian Process modelling framework, fit to the confirmed COVID-19 death time series for the country or region in question (see Russell et al. for more details on the methods and limitations involved).
To run the code, first clone this repository with the command
git clone https://github.com/thimotei/CFR_calculation
The time-varying estimates result from fitting a Gaussian Process model, which is implemented with the R libraries greta and greta.gp.
These libraries need to be run from a virtual environment, which is set up by the script the model is run from. First, run the following command to install the necessary packages:
install.packages(c("reticulate", "greta", "greta.gp"))
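Before installing, it can be useful to see which of these packages are already present. The following is a minimal sketch using base R only; missing_packages is a hypothetical helper introduced here for illustration and is not part of this repository.

```r
# Hypothetical helper (not part of this repository): return the subset of
# `pkgs` that cannot be loaded, i.e. that still needs to be installed.
missing_packages <- function(pkgs) {
  # requireNamespace() returns TRUE or FALSE without attaching the package
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("reticulate", "greta", "greta.gp")
to_install <- missing_packages(required)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
} else {
  message("All required packages are installed.")
}
```

Note that requireNamespace() loads a package's namespace as a side effect, which is harmless here but means the check is not entirely passive.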
reticulate provides the interface to a python virtual environment. greta requires this because it calls tensorflow from within that environment.
The user therefore needs to install the correct version of tensorflow for greta. This is done from R with the following commands (the same commands are in the main script, commented out, and need only be run once):
library(reticulate)
use_condaenv('r-reticulate', required = TRUE)
library(greta)
library(greta.gp)
greta::install_tensorflow(method = "conda",
                          version = "1.14.0",
                          extra_packages = "tensorflow-probability==0.7")
Once tensorflow is installed, the model can be run from the script
scripts/main_script_GP.R
which runs the model for a single country or region, specified by its 3-letter ISO code. The script downloads the latest data from the Johns Hopkins COVID-19 dataset and converts it into the required format using the function in
R/jhu_data_import.R
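To illustrate the kind of reshaping involved, here is a minimal sketch in base R. It assumes the wide layout of the Johns Hopkins global time-series files (one row per province or country, one column per date holding cumulative deaths) and converts it into a daily death series for one country. The function name jhu_wide_to_daily and the toy numbers are hypothetical; the actual logic in R/jhu_data_import.R may differ, for example by selecting on ISO code rather than country name.

```r
# Hypothetical sketch (not the repository's function): convert a JHU-style
# wide table of cumulative deaths into a daily series for one country.
jhu_wide_to_daily <- function(wide, country) {
  id_cols <- c("Province/State", "Country/Region", "Lat", "Long")
  date_cols <- setdiff(names(wide), id_cols)

  # Keep all rows (provinces) belonging to the requested country
  rows <- wide[wide[["Country/Region"]] == country, , drop = FALSE]

  # Sum provinces to obtain the country-level cumulative series
  cumulative <- unname(colSums(rows[, date_cols, drop = FALSE]))

  data.frame(
    date = as.Date(date_cols, format = "%m/%d/%y"),
    cumulative_deaths = cumulative,
    # Daily counts are first differences of the cumulative series
    new_deaths = c(cumulative[1], diff(cumulative))
  )
}

# Toy input with made-up numbers, mimicking the assumed column layout
toy <- data.frame(
  "Province/State" = c("A", "B", NA),
  "Country/Region" = c("X", "X", "Y"),
  Lat = c(0, 0, 0),
  Long = c(0, 0, 0),
  "1/22/20" = c(0, 1, 0),
  "1/23/20" = c(1, 1, 0),
  "1/24/20" = c(3, 2, 1),
  check.names = FALSE
)

daily_x <- jhu_wide_to_daily(toy, "X")
print(daily_x)
```

Real data can contain downward corrections to the cumulative counts, which would produce negative daily values under this simple differencing; the sketch does not handle that case.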
To run the model at scale, we use a high-performance computing (HPC) cluster, with the scripts found in
hpc_scripts