analysis-combined

scripts/analysis-combined/

This repository contains scripts to perform statistical analyses on combined datasets. These scripts were used to make the figures in the paper.

Note: we strongly recommend running 04a_LogRatios-Taxa.R, the scCODA/AGP part of 10_DA-analysis/DA_analysis_run.ipynb and the scCODA part of 10_DA-analysis/shared_ASV_run.ipynb on a high-performance computing cluster. A bash script to execute the log-ratio analysis is provided in the corresponding data directory. Code for executing the differential abundance analysis in a distributed fashion is provided in the data folder.

| Script | Paper figure(s) | Short description |
| --- | --- | --- |
| 01_TaxonomicTree.R | Fig1A, FigS2 | Taxonomic tree of all ASVs inferred across datasets |
| 02_LogRatio-FirmBact.R | Fig2, FigS5 | Log-ratio of Firmicutes:Bacteroidota abundance in healthy vs IBS samples |
| 03_Heatmaps.R | Fig3A, FigS6A | Heatmap of the relative abundances of microbial families |
| 04a_LogRatios-Taxa.R | Fig3B, FigS6B | Compute log-ratios between all combinations of microbial families, and save the sample x log-ratio dataframe (to be provided as input for UMAP) |
| 04b_UMAP.R | Fig3B, FigS6B | UMAP of log-ratios between microbial families across datasets |
| 05_Common-ASVs.R | Fig5A | Find how many ASVs are identical across datasets (common ASVs are expected between datasets that amplified the same variable regions) |
| 06_QCplot.R | FigS1 | Plot number of reads per sample before/after quality filtering with DADA2 preprocessing |
| 07_RelativAbund.R | FigS3 | Plot relative abundance of the 5 main phyla across datasets |
| 08_AlphaDiversity.R | FigS4 | Shannon and Simpson α-diversity indices in healthy vs IBS samples |
| 09_PCoA-BrayCurtis-BigDatasets.R | FigS7 | Compute Bray-Curtis dissimilarity in the AGP, Pozuelo and Hugerth datasets (the 3 biggest datasets) and perform PCoA |
| 10_DA-analysis/sccoda_reference_finding.ipynb | - | Find a suitable reference taxon for running scCODA in the other scripts of this chapter |
| 10_DA-analysis/DA_analysis_run.ipynb | - | Differential abundance analysis of all datasets individually (run models and create intermediate results) |
| 10_DA-analysis/DA_analysis_individual_data.ipynb | Fig4, FigS8, FigS9, Table2, TableS5 | Differential abundance analysis of all datasets individually (result analysis) |
| 10_DA-analysis/shared_ASV_run.ipynb | - | Differential abundance analysis of shared ASVs between the Nagel and Pozuelo datasets (preparation for next chapter) |
| 11_shared-classification-analysis/11a_shared_classification.R | - | Classification analysis of shared ASVs between the Nagel and Pozuelo datasets |
| 11_shared-classification-analysis/11b_additional_plots.R | - | Create supporting figures for the shared ASV analysis |
| 11_shared-classification-analysis/11c_combine_plots.R | Fig5 | Combine the figures from the previous scripts in the folder |
| 12_composiitonal_mean_test.R | | Apply a compositional mean test for IBS vs healthy within datasets and for similarity across datasets |
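
To illustrate the kind of computation that 04a_LogRatios-Taxa.R is described as performing (log-ratios between all combinations of microbial families, stored as a sample x log-ratio table), here is a minimal Python sketch. It is not the repository's R code. The function names, the family names and counts in the toy data, and the pseudocount of 0.5 used to avoid taking the log of zero are all assumptions made for illustration.

```python
from itertools import combinations
import math


def pairwise_log_ratios(abundances, pseudocount=0.5):
    """Return {"A/B": log(A/B)} for every unordered pair of taxa in one sample.

    `abundances` maps a taxon name to its count in the sample.
    `pseudocount` is a hypothetical choice to keep zero counts finite.
    """
    taxa = sorted(abundances)  # fixed order so every sample gets the same columns
    ratios = {}
    for a, b in combinations(taxa, 2):
        numerator = abundances[a] + pseudocount
        denominator = abundances[b] + pseudocount
        ratios[f"{a}/{b}"] = math.log(numerator / denominator)
    return ratios


def log_ratio_table(samples, pseudocount=0.5):
    """Build a nested dict: sample id -> {"A/B": log-ratio}."""
    return {
        sample_id: pairwise_log_ratios(counts, pseudocount)
        for sample_id, counts in samples.items()
    }


# Toy counts (made up for illustration, not taken from any dataset)
samples = {
    "sample_1": {"Bacteroidaceae": 120, "Lachnospiraceae": 80, "Ruminococcaceae": 0},
    "sample_2": {"Bacteroidaceae": 30, "Lachnospiraceae": 90, "Ruminococcaceae": 45},
}

table = log_ratio_table(samples)
for sample_id, row in table.items():
    print(sample_id, {pair: round(value, 3) for pair, value in row.items()})
```

With n families, each sample gets n(n-1)/2 log-ratio columns, so the table grows quadratically with the number of families. This is consistent with the recommendation above to run the log-ratio step on a cluster.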