HMMScan: Development and Application of a Data-Driven Signal Detection Method for Surveillance of Adverse Event Variability Across Manufacturing Lots of Biologics
This repository implements the HMMScan method described in "Development and Application of a Data-Driven Signal Detection Method for Surveillance of Adverse Event Variability Across Manufacturing Lots of Biologics".
All file paths in this section are relative to the top level of this directory.
- Install Python 3.9 and make sure this python version is active.
- Clone this repo and navigate to it locally.
- Create a virtual environment named `venv`: `python -m venv venv`. It will be ignored by git.
- Activate this virtual environment: `source venv/bin/activate`.
- Install the required python packages from `requirements.txt`: `python -m pip install -r requirements.txt`.
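As a quick sanity check after these steps, the following minimal sketch reports whether the active interpreter is Python 3.9 and whether a virtual environment is active. The helper name `check_environment` is hypothetical and is not part of this repository; it relies only on the standard-library fact that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `python -m venv`.

```python
import sys

# Python version named in the setup steps above.
REQUIRED_PYTHON = (3, 9)


def check_environment(version_info=None, prefix=None, base_prefix=None):
    """Return a list of problems found; an empty list means the setup looks fine.

    Hypothetical helper for illustration. The arguments default to the values
    of the running interpreter and exist only to make the function testable.
    """
    version_info = sys.version_info if version_info is None else version_info
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor != REQUIRED_PYTHON:
        problems.append(
            "expected Python %d.%d but found %d.%d"
            % (REQUIRED_PYTHON + major_minor)
        )
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    if prefix == base_prefix:
        problems.append("no virtual environment appears to be active")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running this script with the `venv` environment activated should print nothing if both conditions hold.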
- Install `R` version 4.1. Make sure that this version is active (when called by `R` from the command line) if you have multiple R versions.
- Run R from the command line: `R`.
- Restore the virtual R environment by running the following command in `R`: `renv::restore()`.
Warning: this last step may take a long time, but you only need to run it once.
- Download the `ae-project` repository locally.
- Create `shared-path.txt` (in the top level of this directory) that contains only the absolute local path of the `ae-project` repository (e.g., `/Users/username/ae-project`). Do not include a newline character after the path.
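Many text editors append a newline when saving, so it can be safer to write `shared-path.txt` programmatically. The sketch below is one way to do that and to validate the result. The helper names `write_shared_path` and `read_shared_path` are hypothetical, and how this repository itself reads the file is not shown here.

```python
from pathlib import Path


def write_shared_path(repo_root, ae_project_dir):
    """Write the absolute ae-project path to shared-path.txt with no trailing newline.

    Hypothetical helper: repo_root is the top level of this repository,
    ae_project_dir is wherever the ae-project repository was downloaded.
    """
    ae_path = Path(ae_project_dir).expanduser().resolve()
    target = Path(repo_root) / "shared-path.txt"
    # write_text writes exactly the given string, so no newline is appended.
    target.write_text(str(ae_path), encoding="utf-8")
    return target


def read_shared_path(repo_root):
    """Read shared-path.txt and reject content that breaks the rules above."""
    raw = (Path(repo_root) / "shared-path.txt").read_text(encoding="utf-8")
    if raw != raw.strip():
        raise ValueError("shared-path.txt has a trailing newline or extra whitespace")
    path = Path(raw)
    if not path.is_absolute():
        raise ValueError("shared-path.txt must contain an absolute path")
    return path
```

For example, `write_shared_path(".", "/Users/username/ae-project")` run from the top level of this directory would create the file with the example path given above.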
The Mendeley Data directory contains intermediate numerical results required to recreate all figures in the paper. To replicate the paper figures only, please refer to the Paper Figures section below.
To recreate the intermediate results starting from the raw input data, follow the steps in the associated readme files referenced below.
The file paths referenced below assume the `ae-project` directory is downloaded to a directory called `ae-project`.
Refer to Use Case Scanner Documentation.
Refer to Simulation Validation Documentation.
Refer to Use Case Validation Documentation.
Refer to the New Use Case Documentation to ensure that the raw data is formatted correctly. Then, refer to the documentation files referenced above to execute HMMScan.
This section provides references for the scripts and files that are used to generate the figures in the paper and supplementary material.
All scripts are found in `hmmscan/scripts/viz`.
- Figure 2: `s2c1-sim-results.R`
- Figure 3: `bic.R`
- Figure 4: `best_model_dists_and_predictions.R`
- Figure 5: `best_model_dists_and_predictions.R`
- Figure 6: `use_case_validation.R`
- Online Resource 1, Table S1: `use_case_validation.R`
- Online Resource 1, Figure S1 & Table S2: `s1-sim-results.R`
- Online Resource 1, Table S3 & Figure S2: `s2c1-sim-results.R`
- Online Resource 1, Figure S3: `s2c2-sim-results.R`
- Online Resource 1, Figure S4 & Table S4: `s3-split-high-sim-results.R`
- Online Resource 1, Figure S5 & Table S5: `s3-split-low-sim-results.R`
- Online Resource 1, Figure S6 & Table S6: `s3-graduated-sim-results.R`
- Online Resource 1, Figure S7 & Table S7: `s3-equal-state-sim-results.R`
- Online Resource 1, Figure S8 & Table S8: `s4-sim-results.R`
- Online Resource 1, Section S5 (Figure S10-12, Table S9): `bic_ceiling.R`, `best_model_dists_and_predictions.R`, `ci.R`
- Online Resource 1, Section S6 (Figure S13, Table S10): `best_model_dists_and_predictions.R`
- Online Resource 1, Section S7 (Figure S14-16, Table S11): See the `hmmscan-pytorch` repo.
- Table 3: `summary_stats.R`
- Table 4: `best_model_dists_and_predictions.R` and `ci.R`
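The list above can also be kept as a small lookup table, for instance to find which script files to open for a given figure. The sketch below covers only the main-text figures and tables for brevity. The names `FIGURE_SCRIPTS` and `script_paths` are hypothetical, and the sketch only resolves paths; how each script is invoked is not specified here.

```python
from pathlib import PurePosixPath

# Directory that holds all visualization scripts (relative to the repo top level).
VIZ_DIR = PurePosixPath("hmmscan/scripts/viz")

# Main-text items only; the Online Resource 1 entries follow the same pattern.
FIGURE_SCRIPTS = {
    "Figure 2": ["s2c1-sim-results.R"],
    "Figure 3": ["bic.R"],
    "Figure 4": ["best_model_dists_and_predictions.R"],
    "Figure 5": ["best_model_dists_and_predictions.R"],
    "Figure 6": ["use_case_validation.R"],
    "Table 3": ["summary_stats.R"],
    "Table 4": ["best_model_dists_and_predictions.R", "ci.R"],
}


def script_paths(item):
    """Return the repo-relative script paths listed for a figure or table."""
    try:
        names = FIGURE_SCRIPTS[item]
    except KeyError:
        raise KeyError("no script listed for %r" % item) from None
    return [VIZ_DIR / name for name in names]


if __name__ == "__main__":
    for item in FIGURE_SCRIPTS:
        print(item, "->", ", ".join(str(p) for p in script_paths(item)))
```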