Inferring a spatial code of cell-cell interactions across a whole animal body

How to cite:

Installation

All the analyses in this repository can be run on a CodeOcean capsule for reproducible results (estimated running time: 3:30 hr): https://doi.org/10.24433/CO.4688840.v2

If you are interested in running everything locally, follow the instructions below.

Installing Anaconda

Follow this tutorial to install Anaconda or Miniconda

Installing Git

Follow this tutorial to install Git

Installing cell2cell

Create a new conda environment:

conda create -n cell2cell -y python=3.7 jupyter

Activate that environment:

conda activate cell2cell

Then, install all dependencies:

pip install numba
pip install umap-learn
pip install 'matplotlib==3.2.0'
pip install 'cell2cell==0.6.0'
pip install git+https://github.com/BubaVV/Pyevolve
pip install tissue_enrichment_analysis
pip install matplotlib_venn
pip install 'xgboost==1.6.2'
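
Because results can vary with dependency versions, it may be useful to confirm that the pinned versions were actually installed. The following is a minimal sketch to run inside the activated environment; it is not part of this repository. The names `PINNED`, `installed_version`, and `find_mismatches` are illustrative, and the pins are copied from the pip commands above.

```python
# Pins copied from the pip commands above; the other packages are unpinned.
PINNED = {"matplotlib": "3.2.0", "cell2cell": "0.6.0", "xgboost": "1.6.2"}


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        from importlib import metadata  # available in Python >= 3.8
    except ImportError:
        import pkg_resources  # fallback for the Python 3.7 environment
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned=PINNED, lookup=installed_version):
    """Map each package that differs from its pin to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in find_mismatches().items():
        print("{}: expected {}, found {}".format(package, expected, found))
```

An empty output means every pinned package matches; a `found` value of `None` means the package is not installed in the active environment.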

Run/Explore the analyses

If the environment cell2cell is not active, activate it:

conda activate cell2cell

Most analyses below are provided as Jupyter notebooks; for the steps without a notebook, the instructions are given in the step itself.

Jupyter notebooks can be opened by running the following command from the folder of this repository:

jupyter notebook

Main analyses

  • Analyses must be run in the order below. However, we provide the results of each step, so any analysis can be run without repeating the previous ones:
  1. Generate a list of ligand-receptor interactions from orthologs. This list then requires manual curation. This step can be skipped, since we provide a manually curated list.

  2. Compute intercellular distances and classify cell pairs by ranges of distances.

  3. Compute cell-cell interactions and communication for the curated ligand-receptor interactions.

  4. Run the genetic algorithm to select important ligand-receptor pairs for obtaining a better correlation between CCI scores and intercellular distance:

    • From the main directory of this repository, run:
      python ./code/genetic_algorithm.py -s bray_curtis -o GA-Bray-Curtis -r 100 -c 10
      

    *Note: This step can take 1 to 2 days, depending on the number of iterations set in the nested for loops of the .py file. By default, it runs the GA 100 times, distributed across 10 cores (change -c 10 to use a different number of cores).

    This step can be skipped, since we provide the results of 100 runs of the genetic algorithm using the Bray-Curtis score.

  5. Examine results from the genetic algorithm, select ligand-receptor pairs and perform enrichment analysis on their functions.

  6. Compute cell-cell interactions and communication for the GA-selected ligand-receptor interactions.

  7. Perform permutation analyses on GA-selected ligand-receptor pairs.

  8. Evaluate active ligand-receptor interactions along the body of C. elegans. Assess enrichment/depletion.

  9. Evaluate enrichment/depletion of ligand-receptor pairs given their use in different ranges of distance.

  10. Evaluate enrichment of phenotypes on the genes in the GA-selected list of ligand-receptor pairs.

  11. Generate UMAP plots based on Jaccard distance of pairs of cells given active LR pairs.

  12. Run a similar analysis to the one in step 4, but this time using the LR Count score as the CCI score:

    • From the main directory of this repository, run:
      python ./Notebooks/genetic_algorithm.py -s count -o GA-LR-Count -r 100 -c 10
      

    *Note: This step can take 1 to 2 days, depending on the number of iterations set in the nested for loops of the .py file. By default, it runs the GA 100 times, distributed across 10 cores (change -c 10 to use a different number of cores).

    This step can be skipped, since we provide the results of 100 runs of the genetic algorithm using the LR Count score.

  13. Examine results from the genetic algorithm (LR count as CCI score), select ligand-receptor pairs and perform enrichment analysis on their functions.

  14. Run a similar analysis to the one in step 4, but this time using the ICELLNET score as the CCI score:

    • From the main directory of this repository, run:
      python ./Notebooks/genetic_algorithm.py -s icellnet -o GA-ICELLNET -r 100 -c 10
      

    *Note: This step can take 1 to 2 days, depending on the number of iterations set in the nested for loops of the .py file. By default, it runs the GA 100 times, distributed across 10 cores (change -c 10 to use a different number of cores).

    This step can be skipped, since we provide the results of 100 runs of the genetic algorithm using the ICELLNET score.

  15. Examine results from the genetic algorithm (ICELLNET as CCI score), select ligand-receptor pairs and perform enrichment analysis on their functions.

  16. Compare the GA-based selections of LR pairs obtained with the Bray-Curtis, LR Count, and ICELLNET scores.

  17. Analyze spatial properties associated with the location type of each LR pair.
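
Steps 2 and 11 above can be illustrated with a short, dependency-free sketch. It is an illustration under stated assumptions rather than the code in the notebooks: the coordinates, the range labels (`short`, `mid`, `long`), the cutoffs, and all function names are hypothetical. It computes Euclidean distances between cell positions, bins each cell pair into a distance range, and computes the Jaccard distance between two sets of active LR pairs.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Euclidean distance for every unordered pair of cells.

    coords maps a cell name to its (x, y, z) position (hypothetical input).
    """
    distances = {}
    for a, b in combinations(sorted(coords), 2):
        squared = sum((p - q) ** 2 for p, q in zip(coords[a], coords[b]))
        distances[(a, b)] = math.sqrt(squared)
    return distances


def classify_range(distance, short_max, mid_max):
    """Bin a distance into a range; the cutoffs are assumptions for this sketch."""
    if distance <= short_max:
        return "short"
    if distance <= mid_max:
        return "mid"
    return "long"


def jaccard_distance(active_a, active_b):
    """Return 1 - |A & B| / |A | B| for two collections of active LR pairs."""
    a, b = set(active_a), set(active_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - len(a & b) / len(union)


# Toy data, for illustration only.
coords = {"cell_A": (0, 0, 0), "cell_B": (3, 4, 0), "cell_C": (0, 0, 100)}
ranges = {
    pair: classify_range(d, short_max=10, mid_max=100)
    for pair, d in pairwise_distances(coords).items()
}
```

With a matrix of such Jaccard distances between cell pairs, a dimensionality reduction tool such as umap-learn (installed above) could then be applied, as step 11 describes.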

Benchmarking analyses

  1. Generate CCI scores from Bray-Curtis, LR Count, and Smillie scoring functions
  2. Generate CCI scores from ICELLNET scoring function
  3. Generate CCI scores from CellChat scoring function
  4. Benchmarking of threshold values for binary-based methods
  5. Benchmarking of all CCI scores: classifiers for distinguishing the distance range between cells
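
To make the idea of threshold benchmarking for binary-based methods concrete, here is a minimal sketch. It assumes that a gene counts as expressed when its value is at or above a threshold, and that a count-type score is the number of ligand-receptor pairs with the ligand expressed in one cell and the receptor in the other. The notebooks may define these differently; the gene names, expression values, and the functions `binarize` and `lr_count` are hypothetical.

```python
def binarize(expression, threshold):
    """Mark each gene as expressed (True) when its value is >= threshold."""
    return {gene: value >= threshold for gene, value in expression.items()}


def lr_count(sender, receiver, lr_pairs):
    """Count LR pairs with the ligand on in `sender` and the receptor on in `receiver`."""
    return sum(
        1
        for ligand, receptor in lr_pairs
        if sender.get(ligand, False) and receiver.get(receptor, False)
    )


# Toy expression values and LR pairs, for illustration only.
cell_a = {"lig-1": 12.0, "lig-2": 3.0}
cell_b = {"rec-1": 20.0, "rec-2": 8.0}
lr_pairs = [("lig-1", "rec-1"), ("lig-2", "rec-2"), ("lig-1", "rec-2")]

# Sweep a few candidate thresholds and record the resulting score.
scores = {
    threshold: lr_count(binarize(cell_a, threshold), binarize(cell_b, threshold), lr_pairs)
    for threshold in (1, 5, 10)
}
```

In this toy case the score falls as the threshold rises, which is why the choice of threshold is worth benchmarking.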

Disclaimer: Figures from the Jupyter notebooks may differ from those in the paper, depending on the installed versions of the dependencies of each analysis. The same may happen with results that depend on external tools. To ensure the figures look the same, use the CodeOcean capsule instead.