turbine-ai/PerturbSeqPredBenchmark

Benchmarking a foundational cell model for post-perturbation RNAseq prediction

This is the official GitHub repository for the paper Benchmarking a foundational cell model for post-perturbation RNAseq prediction. It is a fork of the scGPT repository.

Repository content

  • notebooks/

    • bulk_models.ipynb: trains the Random Forest (RF), Elastic Net, KNN Regressor and Train Mean baselines with GO, scGPT, scFoundation and scELMo features (Figures 1B-E and 2D)
    • data_analysis.ipynb: runs the data analysis and generates Figure 2A-C
    • Tutorial_PerturbationAdamson.ipynb: trains scGPT on the Adamson et al. dataset
    • Tutorial_PerturbationNorman.ipynb: trains scGPT on the Norman et al. dataset
    • Tutorial_PerturbationReplogle.ipynb: trains scGPT on the Replogle et al. (K562) dataset
    • Tutorial_PerturbationReplogleRPE1.ipynb: trains scGPT on the Replogle et al. (RPE1) dataset
    • embedding_eval.ipynb: embedding analysis
  • scFoundation training entry points at scFoundation/GEARS:

    • train_adamson.py: trains on the Adamson et al. dataset
    • train_norman.py: trains on the Norman et al. dataset
    • train_replogle_rp1.py: trains on the Replogle et al. (RPE1) dataset
    • train_replogle.py: trains on the Replogle et al. (K562) dataset
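
As a point of reference for the simplest baseline trained in bulk_models.ipynb, the following is a minimal sketch of a Train Mean predictor. It assumes that Train Mean predicts, for every held-out perturbation, the gene-wise mean of the training post-perturbation expression profiles. The function names and the toy data are hypothetical and are not taken from the notebook.

```python
from statistics import fmean


def fit_train_mean(train_profiles):
    """Return the gene-wise mean over all training perturbation profiles.

    train_profiles: list of equal-length lists, one expression profile
    per training perturbation (hypothetical input layout).
    """
    n_genes = len(train_profiles[0])
    return [fmean(profile[g] for profile in train_profiles) for g in range(n_genes)]


def predict_train_mean(mean_profile, n_test):
    """Predict the same mean profile for each of n_test unseen perturbations."""
    return [list(mean_profile) for _ in range(n_test)]


# Toy data (illustrative only): 3 training perturbations x 2 genes.
train = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
mean_profile = fit_train_mean(train)
preds = predict_train_mean(mean_profile, n_test=2)
```

Because the prediction ignores the identity of the perturbation, a baseline of this kind gives a floor against which the feature-based models can be compared.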

Reproducibility

To reproduce the results of the paper, follow these steps:

  1. Run git lfs pull to download the required data from Git Large File Storage (LFS). If Git LFS is not installed, please refer to this guide

  2. Run make setup to create the conda environment, install the IPython kernel and unzip the Replogle dataset

  3. Run scGPT trainings

    1. Select the scgpt_yml conda environment as the Python kernel for the notebooks
    2. Run the Tutorial notebooks to get the results of scGPT
  4. Run scFoundation trainings

    1. Create the conda environment for scFoundation by running conda env create -f scFoundation/conda.yaml
    2. Run conda activate scfoundation
    3. Start the trainings via the entry points in scFoundation/GEARS
  5. Run data_analysis.ipynb

  6. Run bulk_models.ipynb
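
The steps above can be sketched as a small driver script that lists the commands in order and, optionally, runs them. This is an illustration under several assumptions, not part of the repository: it assumes the script is run from the repository root, that jupyter is available in the scgpt_yml environment, that data_analysis.ipynb and bulk_models.ipynb also run in that environment, and that the entry points can be launched by path. It uses conda run in place of conda activate, since activation does not carry over between subprocesses, and it uses the conda env create -f form to build the environment from the YAML file.

```python
import shlex
import subprocess

# Notebooks and entry points as listed under "Repository content".
SCGPT_NOTEBOOKS = [
    "Tutorial_PerturbationAdamson.ipynb",
    "Tutorial_PerturbationNorman.ipynb",
    "Tutorial_PerturbationReplogle.ipynb",
    "Tutorial_PerturbationReplogleRPE1.ipynb",
]
SCFOUNDATION_ENTRY_POINTS = [
    "train_adamson.py",
    "train_norman.py",
    "train_replogle_rp1.py",
    "train_replogle.py",
]


def notebook_cmd(name, env="scgpt_yml"):
    """Command that executes a notebook in place (assumes jupyter is in `env`)."""
    return ["conda", "run", "-n", env, "jupyter", "nbconvert",
            "--to", "notebook", "--execute", "--inplace", f"notebooks/{name}"]


def build_commands():
    """Return the reproduction steps as a list of argument lists, in order."""
    cmds = [["git", "lfs", "pull"], ["make", "setup"]]
    cmds += [notebook_cmd(nb) for nb in SCGPT_NOTEBOOKS]
    cmds.append(["conda", "env", "create", "-f", "scFoundation/conda.yaml"])
    cmds += [
        ["conda", "run", "-n", "scfoundation", "python", f"scFoundation/GEARS/{script}"]
        for script in SCFOUNDATION_ENTRY_POINTS
    ]
    cmds += [notebook_cmd("data_analysis.ipynb"), notebook_cmd("bulk_models.ipynb")]
    return cmds


def run_all(dry_run=True):
    """Return the commands as shell strings; execute them when dry_run is False."""
    lines = []
    for cmd in build_commands():
        lines.append(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step
    return lines


if __name__ == "__main__":
    print("\n".join(run_all(dry_run=True)))
```

With dry_run=True the script only prints the commands, which makes it easy to review the order before starting the long-running trainings.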
