Skip to content

sekhwal/RNASeq-workflow

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 

Repository files navigation

#Prepare the data
#marge the parts of the samples
$cat *R1_001.fastq.gz >578_R1.fastq.gz
$cat *R2_001.fastq.gz >578_R2.fastq.gz
#Trim adaptors:


Method 1. Alignment based Method

1. Prepare downstream with downloading all required softwares and pipelines.
-List of pipelines
  -FASTQC
  -MultiQC
  -Hisat
  -Samtools
  -picard
  -subread
$Unzip command
$tar -jxvf <filename.bz2>
$tar -zxf <filename.zip>
$Unzip <filename.zip>
$gunzip <filename.gz>

2. RUN FastQC with all fastq files:

$fastqc -t 6 *.fastq.gz

3. RUN MultiQC-master
$multiqc .

4. STAR Alignment
-Download STAR larest version (https://github.com/alexdobin/STAR/releases)
-Extract the file and compile with following commands
$cd STAR/source
$make STAR
Add to the directory
$export PATH=$PATH:/Home/pipelines/
---Alternatively:
-Get latest STAR source from releases
wget https://github.com/alexdobin/STAR/archive/2.7.5a.tar.gz
tar -xzf 2.7.5a.tar.gz
cd STAR-2.7.5a

# Alternatively, get STAR source using git
git clone https://github.com/alexdobin/STAR.git

# Compile
cd STAR/source
make STAR

#RUN the STAR
$./STAR --runThreadN 8 --runMode genomeGenerate --genomeDir /DataAnalysis/data/test/ --genomeFastaFiles 
/DataAnalysis/data/test/Homo_sapiens.GRCh38.dna.toplevel.fa --sjdbGTFfile /DataAnalysis/data/test/Homo_sapiens.GRCh38.100.gtf 
--sjdbOverhang 100

$./STAR --runMode alignReads --outSAMtype BAM Unsorted --genomeDir /DataAnalysis/data/STAR/pipelines/ --outFileNamePrefix 
/DataAnalysis/data/STAR/pipelines/prefix --readFilesIn /DataAnalysis/data/STAR/pipelines/2020_278C_R1.fastq /DataAnalysis/data/STAR/pipelines/2020_278C_R2.fastq

###Method 2: Direct method 

 cd /DataAnalysis/data/pipelines/kallisto/
 $./kallisto index -i /DataAnalysis/data/kallisto_test/transcripts.idx /DataAnalysis/data/kallisto_test/Homo_sapiens.GRCh38.cdna.all.fa.gz 
 $./kallisto quant -i /DataAnalysis/data/kallisto_test/transcripts.idx -o /DataAnalysis/data/kallisto_test/ -b 100 
 /DataAnalysis/data/kallisto_test/2020_288_R1.fastq.gz /DataAnalysis/data/kallisto_test/2020_288_R2.fastq.gz 

###########
#Remove rRNA contamination
tagdust -t 8 -1 R:N -o /DataAnalysis/tagduest-analysis/ -ref /DataAnalysis/tagduest-analysis/rRNA.txt -fe 2 /DataAnalysis/tagduest-analysis/270_R1.fastq 

OR

tagdust  -1 O:N -2 B:ACAGAT,ATCGTG,CACGAT,CACTGA,CTGACG,GAGTGA,GTATAC,TCGAGC-o /DataAnalysis/tagduest-analysis/ -ref /DataAnalysis/tagduest-analysis/mart_export.txt -fe 2 /DataAnalysis/tagduest-analysis/270_R1.fastq


./bbduk.sh -Xmx64g in1=/DataAnalysis/test-star/output_READ1.fastq in2=/DataAnalysis/test-star/output_READ2.fastq out=/DataAnalysis/test-star/trimmed-ribo.fa ref=/DataAnalysis/test-star/mart_export.txt 

About

Descriptive manual for RNASeq workflow

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published