This code supplements the upcoming publication by Aguado Alvaro, Garitano, Esser-Skala et al.
(Not all of these folders appear in the git repository.)
data_generated
: output files generated by the scripts in this repository

data_raw
: raw input data

metadata
: additional required data

plots
: generated plots

renv
: R environment data

scripts_python
: Python scripts

scripts_r
: R scripts
Create a folder data_raw that will contain raw data in the following subfolders:

rna
: Download GSE261783_RAW.tar from GEO Series GSE261783 and extract all files.

signatures
:
    markers_Forte.xlsx
    : Table S3 from Forte et al. (https://doi.org/10.1016/j.celrep.2020.02.008)

    markers_Koenig.xlsx
    : Table S27 from Koenig et al. (https://doi.org/10.1038/s44161-022-00028-6)

    markers_Buechler.xlsx
    : Table S5 from Buechler et al. (https://doi.org/10.1038/s41586-021-03549-5)
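The layout described above can be checked before running any analysis. The following is a minimal sketch, not part of the repository: the function name check_raw_data is hypothetical, and it assumes that the archive contents go directly into data_raw/rna and that the three tables are saved under exactly the file names listed above.

```python
from pathlib import Path

# File names taken from the list above.
SIGNATURE_FILES = [
    "markers_Forte.xlsx",
    "markers_Koenig.xlsx",
    "markers_Buechler.xlsx",
]


def check_raw_data(root="data_raw"):
    """Create the expected subfolders and return a list of problems found.

    Hypothetical helper: it only checks that files exist, not their contents.
    """
    root = Path(root)
    rna_dir = root / "rna"
    sig_dir = root / "signatures"
    for folder in (rna_dir, sig_dir):
        folder.mkdir(parents=True, exist_ok=True)

    problems = []
    # The rna folder should hold the files extracted from GSE261783_RAW.tar.
    if not any(rna_dir.iterdir()):
        problems.append(f"{rna_dir} is empty; extract GSE261783_RAW.tar into it")
    # Each signature table must be present under its expected name.
    for name in SIGNATURE_FILES:
        if not (sig_dir / name).is_file():
            problems.append(f"missing {sig_dir / name}")
    return problems
```

Calling check_raw_data() from the repository root returns an empty list once all raw data are in place; otherwise each entry names one missing item.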
Optionally, obtain intermediate data: Extract the contents of data_generated.tgz from Zenodo repository https://doi.org/XXX into the folder data_generated.
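The extraction step can also be scripted. This is a minimal sketch under stated assumptions: the helper name extract_archive is hypothetical, the archive is assumed to have been downloaded manually, and its members are assumed to be laid out relative to the data_generated folder (if the archive already contains a top-level data_generated directory, extract into the repository root instead).

```python
import tarfile
from pathlib import Path


def extract_archive(archive, target="data_generated"):
    """Extract a (possibly gzip-compressed) tar archive into the target folder.

    Returns the sorted relative paths of all files found in the target afterwards.
    """
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:*") as tar:
        # Use the safer "data" extraction filter where this Python version supports it.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(target, filter="data")
        else:
            tar.extractall(target)
    return sorted(
        p.relative_to(target).as_posix() for p in target.rglob("*") if p.is_file()
    )
```

For example, extract_archive("data_generated.tgz") run from the repository root unpacks the archive and lists the extracted files.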
TBD
Run the following scripts in the folder scripts_r, in the order listed, to run the R analysis pipeline. utils.R contains auxiliary functions and definitions required by several other scripts.
create_sce.R
: interface to Python scripts; creates a SingleCellExperiment object with the same data and metadata as the AnnData object

plot_signatures.R
: fibroblast signatures (figures 3f, g and S3)

plot_ko_distribution.R
: UMAPs with knockout distribution (figures 4a and S6)

plot_ko_enrichment.R
: summary of knockout enrichment (figures 4b and S7)

run_gsea.R
: perform gene set enrichment analysis

plot_gsea.R
: plot GSEA results (figure 4d)

plot_genes.R
: volcano plot (figure 5f)

plot_ntc_depletion.R
: NTC depletion (figure S7x)
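Running the scripts one after another can be automated. The sketch below is an illustration rather than part of the repository: the names PIPELINE, build_commands and run_pipeline are hypothetical, and it assumes that Rscript is on the PATH and that each script can be run non-interactively from the repository root. utils.R is left out because it is described above as a helper used by the other scripts.

```python
import subprocess
from pathlib import Path

# Script order as listed above.
PIPELINE = [
    "create_sce.R",
    "plot_signatures.R",
    "plot_ko_distribution.R",
    "plot_ko_enrichment.R",
    "run_gsea.R",
    "plot_gsea.R",
    "plot_genes.R",
    "plot_ntc_depletion.R",
]


def build_commands(scripts_dir="scripts_r", rscript="Rscript"):
    """Return one command (as an argument list) per pipeline script."""
    return [[rscript, str(Path(scripts_dir) / name)] for name in PIPELINE]


def run_pipeline(scripts_dir="scripts_r", rscript="Rscript"):
    """Run all scripts in order; stop at the first one that fails."""
    for cmd in build_commands(scripts_dir, rscript):
        print("Running", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
```

Calling run_pipeline() from the repository root executes the scripts sequentially; because later scripts may depend on files written by earlier ones, the run stops as soon as one script fails.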