Our GitHub repo includes pipelines for large-scale SNP detection from RNA-seq data of different tissues per phenotype. The pipeline maps raw sequence reads against the genome using the STAR aligner (2-pass method). Uniquely mapped reads are then pre-processed using the GATK Best Practices pipeline for RNA-seq data, followed by variant detection and rigorous filtering of false-positive calls.
In this research, the STAR pipeline was written in Workflow Description Language (WDL).
- Dependencies
a. Docker
b. Cromwell
c. Java
❗ To run paired-star-align.wdl you also need to build a STAR index from the Gencode annotation file and save it in a separate folder.
- Run the workflow directly by executing the following commands on your terminal:
java -Dconfig.file=application.conf -jar cromwell-55.jar run paired-star-align.wdl -i paired-star-align.json
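The STAR index required by the note above can be built roughly as follows. This is a minimal sketch, not the exact command used in this repo: the FASTA and GTF file names, the `star_index` output folder, the thread count, and `--sjdbOverhang 100` (which suits 101 bp reads) are all assumptions to adapt to your data. The `run` helper is a hypothetical wrapper that only prints the command while `DRY_RUN=1`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder paths (assumptions), not files shipped with this repo.
GENOME_FASTA="${GENOME_FASTA:-GRCh38.primary_assembly.genome.fa}"
GENCODE_GTF="${GENCODE_GTF:-gencode.annotation.gtf}"
INDEX_DIR="${INDEX_DIR:-star_index}"
THREADS="${THREADS:-8}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_star_index() {
  # Keep the index in its own folder, as the note above requires.
  run mkdir -p "$INDEX_DIR"
  # --sjdbOverhang is usually read length minus 1; 100 is an assumed value.
  run STAR --runMode genomeGenerate \
    --runThreadN "$THREADS" \
    --genomeDir "$INDEX_DIR" \
    --genomeFastaFiles "$GENOME_FASTA" \
    --sjdbGTFfile "$GENCODE_GTF" \
    --sjdbOverhang 100
}

build_star_index
```

Set `DRY_RUN=0` to actually build the index, then point the index path in paired-star-align.json at the resulting folder before launching Cromwell.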
For the pre-processing step, use the Pre-processing-picard.sh pipeline.
- Required tools
a. picard.jar
- Required data
a. Bam files produced from STAR aligner
b. Homo_sapiens_assembly38.dict
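For orientation, a Picard pre-processing pass over the STAR output might look like the sketch below. It is not a copy of Pre-processing-picard.sh: the choice of steps (read groups, duplicate marking, reordering against the sequence dictionary), the read-group values, and the output file names are assumptions. It also assumes coordinate-sorted BAMs and a recent Picard release in which `ReorderSam` accepts `SEQUENCE_DICTIONARY`. The `run` helper is hypothetical and only prints commands while `DRY_RUN=1`.

```shell
#!/usr/bin/env bash
set -euo pipefail

PICARD_JAR="${PICARD_JAR:-picard.jar}"
SEQ_DICT="${SEQ_DICT:-Homo_sapiens_assembly38.dict}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

preprocess_bam() {
  local bam="$1"
  local sample
  sample="$(basename "$bam" .bam)"

  # Attach read-group tags (placeholder values) that GATK tools expect.
  run java -jar "$PICARD_JAR" AddOrReplaceReadGroups \
    I="$bam" O="${sample}.rg.bam" \
    RGID="$sample" RGLB=lib1 RGPL=ILLUMINA RGPU=unit1 RGSM="$sample"

  # Flag duplicate reads; assumes the BAM is coordinate-sorted.
  run java -jar "$PICARD_JAR" MarkDuplicates \
    I="${sample}.rg.bam" O="${sample}.dedup.bam" \
    M="${sample}.dup_metrics.txt" CREATE_INDEX=true

  # Reorder contigs to match the reference sequence dictionary.
  run java -jar "$PICARD_JAR" ReorderSam \
    I="${sample}.dedup.bam" O="${sample}.reordered.bam" \
    SEQUENCE_DICTIONARY="$SEQ_DICT" CREATE_INDEX=true
}

# Process every BAM passed on the command line.
for bam in "$@"; do
  preprocess_bam "$bam"
done
```

Saved under a hypothetical name such as `picard-sketch.sh`, it would be called as `bash picard-sketch.sh sampleA.bam sampleB.bam`.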
For the variant-calling step, use the Variant-calling-GATK.sh pipeline.
- Required tools
- Required data
a. Output of the last step of pre-processing pipeline
b. Homo_sapiens_assembly38.fasta
c. Homo_sapiens_assembly38.dbsnp138
d. Homo_sapiens_assembly38.dbsnp138.indexed
e. Homo_sapiens_assembly38.known_indels
f. Homo_sapiens_assembly38.known_indels.indexed
g. wgs_calling_regions.hg38.interval_list
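The inputs listed above map onto the usual GATK4 RNA-seq calling sequence. The sketch below illustrates that sequence under assumptions and does not reproduce Variant-calling-GATK.sh: the `.vcf` and `.vcf.gz` extensions on the known-sites files, the output names, and the calling threshold of 20 are placeholders. It also assumes `gatk` (GATK4) is on your `PATH`. The `run` helper is hypothetical and only prints commands while `DRY_RUN=1`.

```shell
#!/usr/bin/env bash
set -euo pipefail

REF="${REF:-Homo_sapiens_assembly38.fasta}"
# File extensions below are assumptions; use the names of your local copies.
DBSNP="${DBSNP:-Homo_sapiens_assembly38.dbsnp138.vcf}"
KNOWN_INDELS="${KNOWN_INDELS:-Homo_sapiens_assembly38.known_indels.vcf.gz}"
INTERVALS="${INTERVALS:-wgs_calling_regions.hg38.interval_list}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

call_variants() {
  local bam="$1"
  local sample
  sample="$(basename "$bam" .bam)"

  # Split reads that span splice junctions (N in the CIGAR string).
  run gatk SplitNCigarReads -R "$REF" -I "$bam" -O "${sample}.split.bam"

  # Model systematic base-quality errors using known variant sites.
  run gatk BaseRecalibrator -R "$REF" -I "${sample}.split.bam" \
    --known-sites "$DBSNP" --known-sites "$KNOWN_INDELS" \
    -O "${sample}.recal.table"

  # Apply the recalibration model to the reads.
  run gatk ApplyBQSR -R "$REF" -I "${sample}.split.bam" \
    --bqsr-recal-file "${sample}.recal.table" -O "${sample}.bqsr.bam"

  # Call variants within the calling intervals; threshold 20 is an assumption.
  run gatk HaplotypeCaller -R "$REF" -I "${sample}.bqsr.bam" \
    -L "$INTERVALS" --dont-use-soft-clipped-bases \
    --standard-min-confidence-threshold-for-calling 20 \
    -O "${sample}.vcf.gz"
}

# Process every BAM passed on the command line.
for bam in "$@"; do
  call_variants "$bam"
done
```

Running it with the default `DRY_RUN=1` first is a cheap way to check that each path resolves to the file you expect before committing compute time.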
For the VCF merging and filtering step, use the VCFs-merge-filter.sh pipeline.
- Required tools
a. bcftools
b. tabix
- Required data
a. A list of selected VCFs
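As an illustration of how bcftools and tabix fit together in this step, a merge-then-filter pass could look like the sketch below. It does not reproduce the filters in VCFs-merge-filter.sh: the list file name `selected_vcfs.txt`, the output names, the SNP-only selection, and the `QUAL<30` exclusion are all placeholder assumptions. The input VCFs are assumed to be bgzip-compressed and indexed, which `bcftools merge` requires. The `run` helper is hypothetical and only prints commands while `DRY_RUN=1`.

```shell
#!/usr/bin/env bash
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
# The printed line is for inspection only; shell quoting is not preserved.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

merge_and_filter() {
  # Text file with one bgzipped, indexed VCF path per line.
  local vcf_list="$1"

  # Merge the per-sample VCFs into one multi-sample file and index it.
  run bcftools merge --file-list "$vcf_list" -Oz -o merged.vcf.gz
  run tabix -p vcf merged.vcf.gz

  # Keep SNPs only.
  run bcftools view -v snps -Oz -o merged.snps.vcf.gz merged.vcf.gz

  # Drop low-quality sites; the QUAL<30 cutoff is a placeholder.
  run bcftools filter -e 'QUAL<30' -Oz -o merged.filtered.vcf.gz merged.snps.vcf.gz
  run tabix -p vcf merged.filtered.vcf.gz
}

merge_and_filter "${1:-selected_vcfs.txt}"
```

Splitting the SNP selection and the quality filter into separate commands keeps each intermediate file inspectable; they could also be piped together once the thresholds are settled.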