CRGEM is a framework built from existing tools and databases for performing cellular reprogramming computationally. Running each tool-specific task separately requires researchers to work through the tools one at a time. CRGEM eases this work by integrating the tools in one place, so that researchers can spend their time on biological inference and on experimentally verifying the key modulators and their effects.
- Install all R requirements.
- Clone the repository using the installation command given below.
- cd into the pipeline directory created by the clone command.
- Run the setup.sh file.
- Download the feather file into the data directory using the command given below (Point 8).
- Run the commands as required (refer to the commands section below).
git clone --depth 1 [email protected]:Avani7/Pipeline.git pipeline
Use this command to install the R requirements:
install.packages(c("gtools","Matrix","tibble","dplyr","stringr","purrr","Rcpp","reshape2","umap","pheatmap","igraph","GGally","ggplot2","BiocManager"))
BiocManager::install(c("RcisTarget","AUCell"))
4. Recommended Cytoscape version: 3.9.1. Cytoscape needs to be open in the background while the workflow runs.
brew install boost
The user needs to define CPATH and LD_LIBRARY_PATH.
brew install wget
The user needs to download the feather file from https://resources.aertslab.org/cistarget/databases/old/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/ and add it to the data folder.
curl -O https://resources.aertslab.org/cistarget/databases/old/homo_sapiens/hg19/refseq_r45/mc9nr/gene_based/hg19-tss-centered-10kb-10species.mc9nr.feather
The TRRUST file is already present in the data directory. To download the latest data, run curl -s 'https://www.grnpedia.org/trrust/data/trrust_rawdata.human.tsv' > trrust_rawdata_human.tsv and make sure that the downloaded file is in the data directory.
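Before running a stage, it can help to confirm that both downloaded files are where the pipeline expects them. The following is a minimal sketch, not part of CRGEM itself: the function name `missing_data_files` is hypothetical, and the two file names are taken from the download commands in this document.

```python
from pathlib import Path

# File names taken from the download commands in this README.
REQUIRED_DATA_FILES = (
    "hg19-tss-centered-10kb-10species.mc9nr.feather",
    "trrust_rawdata_human.tsv",
)


def missing_data_files(data_dir="data"):
    """Return the required file names that are absent from data_dir.

    Hypothetical helper for illustration; CRGEM does not ship it.
    """
    root = Path(data_dir)
    return [name for name in REQUIRED_DATA_FILES if not (root / name).is_file()]


# An empty list means both files were found in the data directory.
print(missing_data_files("data"))
```

Run it from the pipeline directory so that the relative path `data` resolves to the data folder.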
1. artefacts: Directory provided by the user, where all the results will be saved.
2. stage: The part of the tool the user wants to run.
3. params: Input arguments required by the stage.
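These three inputs combine into a single command line of the form `crgem run [stage] --artefacts [directory] --params [arguments]`, as used by the stage commands in this document. The sketch below assembles that command as an argument list from Python; `build_crgem_command` is a hypothetical helper name, not part of CRGEM, and it does nothing beyond arranging the arguments in that order.

```python
import shlex


def build_crgem_command(stage, artefacts, params):
    """Assemble `crgem run <stage> --artefacts <dir> --params <p1> <p2> ...`.

    Hypothetical helper for illustration; it only orders the arguments
    the way the commands in this README are written.
    """
    return [
        "crgem", "run", stage,
        "--artefacts", str(artefacts),
        "--params", *[str(p) for p in params],
    ]


cmd = build_crgem_command(
    "grn_inference", "./artefacts/temp", ["./data/terminal.csv"]
)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

To execute the command rather than print it, the list can be passed to `subprocess.run(cmd, check=True)`. Passing a list avoids shell quoting problems when a path contains spaces.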
All the input files should be saved in a folder called data. The user needs to create the following input files (gene expression files and an annotation file):
- Starting cell population: create a .txt file with gene expression data of starting cell population. The rows should be gene names and columns should be sample IDs. Eg: start.txt
- Starting cell and terminal cell population combined: create a .txt file with gene expression data of both starting cell and terminal cell population. The rows should be gene names and columns should be sample IDs. Eg: start_terminal.txt
- Terminal cell population: create a .csv file with gene expression data of the terminal cell population. The rows should be gene names and columns should be sample IDs. Eg: terminal.csv
- annotation.txt: create a .txt file with the label IDs of the samples in the combined starting cell and terminal cell expression data. The first row should hold the sample IDs, matching those in the combined starting cell and terminal cell population file, and the second row should hold the cluster IDs of the populations. Each sample ID should have a corresponding cluster ID.
The starting cell population and terminal cell population cluster IDs entered as parameters should match those in the annotation file.
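The consistency rules above can be checked before a run. The sketch below is a minimal illustration under stated assumptions: the files are tab-delimited (the delimiter is not specified in this document, so it is a parameter), the first row of the combined expression file contains the sample IDs, and `check_annotation` is a hypothetical helper that is not part of CRGEM.

```python
import csv


def read_rows(path, delimiter="\t"):
    """Read a delimited text file into a list of non-empty rows."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter=delimiter) if row]


def check_annotation(expression_path, annotation_path, cluster_ids, delimiter="\t"):
    """Return a list of problems; an empty list means all checks passed.

    Assumption: row 1 of the annotation file holds sample IDs and
    row 2 holds the cluster ID of each sample, in the same order.
    """
    problems = []
    # Any cell of the expression header may be a sample ID; a leading
    # gene-column label, if present, is simply an extra entry in the set.
    expression_samples = set(read_rows(expression_path, delimiter)[0])
    rows = read_rows(annotation_path, delimiter)
    if len(rows) < 2:
        return ["annotation file needs a sample ID row and a cluster ID row"]
    samples, clusters = rows[0], rows[1]
    if len(samples) != len(clusters):
        problems.append("sample ID row and cluster ID row differ in length")
    unknown = sorted(set(samples) - expression_samples)
    if unknown:
        problems.append("sample IDs not in expression file: " + ", ".join(unknown))
    for cluster_id in cluster_ids:
        if cluster_id not in clusters:
            problems.append(f"cluster ID {cluster_id!r} not found in annotation file")
    return problems
```

With the example files named in this document, a call would look like `check_annotation("data/start_terminal.txt", "data/annotation.txt", ["HPROGFPM", "HNES"])`, where the two cluster IDs are the ones that will be passed as parameters.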
- stage: all (TransSynW + PAGA + SIGNET + TRRUST + Cytoscape + Uniprot)
crgem run all --artefacts ./artefacts/[directory_name] --params [start_cell population] [start and terminal_cell population] [annotation file] [terminal cell cluster ID] [starting cell cluster ID] ./data/terminal.csv ./data/trrust_rawdata_human.tsv
Eg: crgem run all --artefacts ./artefacts/temp --params start.txt start_terminal.txt annotation.txt HPROGFPM HNES ./data/terminal.csv ./data/trrust_rawdata_human.tsv
- stage: generate_hypothesis (TransSynW)
crgem run generate_hypothesis --artefacts ./artefacts/[directory_name] --params [start_cell population] [start and terminal_cell population] [annotation file] [terminal cell cluster ID]
Eg: crgem run generate_hypothesis --artefacts ./artefacts/temp --params start.txt start_terminal.txt annotation.txt HPROGFPM
- stage: mechanistic_insights (TransSynW + PAGA)
crgem run mechanistic_insights --artefacts ./artefacts/[directory_name] --params [start_cell population] [start and terminal_cell population] [annotation file] [terminal cell cluster ID] [starting cell cluster ID]
Eg: crgem run mechanistic_insights --artefacts ./artefacts/temp --params start.txt start_terminal.txt annotation.txt HPROGFPM HNES
- stage: grn_inference (SIGNET)
crgem run grn_inference --artefacts ./artefacts/[directory_name] --params ./data/terminal.csv
Eg: crgem run grn_inference --artefacts ./artefacts/temp --params ./data/terminal.csv
- stage: trrust_analysis (TRRUST)
crgem run trrust_analysis --artefacts ./artefacts/[directory_name] --params ./data/trrust_rawdata_human.tsv
Eg: crgem run trrust_analysis --artefacts ./artefacts/temp --params ./data/trrust_rawdata_human.tsv
- stage: create_network (Cytoscape)
crgem run create_network --artefacts ./artefacts/[directory_name] --params ./artefacts/[directory_name]/Trrust_Analysis/trrust_analysis.csv
Eg: crgem run create_network --artefacts ./artefacts/temp --params ./artefacts/temp/Trrust_Analysis/trrust_analysis.csv
- stage: functional_analysis (Uniprot)
crgem run functional_analysis --artefacts ./artefacts/[directory_name] /Trrust_analysis/transsynw_genes.csv /Trrust_analysis/signet_genes.csv
Eg: crgem run functional_analysis --artefacts ./artefacts/temp /Trrust_analysis/transsynw_genes.csv /Trrust_analysis/signet_genes.csv