RNASeq

A pipeline for RNA-seq analysis of paired-end reads, implemented in Nextflow DSL2.

Workflow

FastQC - Quality check
Trim Galore - Adapter trimming followed by FastQC; the trimmed reads are used for the rest of the workflow
Salmon - Index building and quantification
HISAT2 - Index building and alignment
SAMtools - SAM to BAM conversion and alignment statistics report with flagstat
featureCounts - Counts for genes, mRNAs, and genes with multi-mapping reads
MultiQC - Combined MultiQC report
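To make the order of the steps concrete, the sketch below prints (without running anything) one plausible hand-run equivalent for a single sample. It is not taken from the pipeline's modules: the file names (cdna.fa, genome.fa, genome.gff, sampleA), the index names, and the featureCounts options are assumptions chosen for illustration, and the actual modules may use different flags.

```shell
#!/bin/sh
# Print, without executing, a hand-run approximation of the workflow steps
# for one sample. All file and index names here are hypothetical.
print_plan() {
    s=$1   # sample name prefix, e.g. "sampleA"
    cat <<EOF
fastqc ${s}_R1.fastq.gz ${s}_R2.fastq.gz
trim_galore --paired --fastqc ${s}_R1.fastq.gz ${s}_R2.fastq.gz
salmon index -t cdna.fa -i salmon_index
salmon quant -i salmon_index --libType=A --validateMappings -1 ${s}_R1_val_1.fq.gz -2 ${s}_R2_val_2.fq.gz -o ${s}_salmon
hisat2-build genome.fa hisat_index
hisat2 -x hisat_index -1 ${s}_R1_val_1.fq.gz -2 ${s}_R2_val_2.fq.gz -S ${s}.sam
samtools view -b -o ${s}.bam ${s}.sam
samtools flagstat ${s}.bam
featureCounts -p -t gene -g ID -a genome.gff -o ${s}_counts.txt ${s}.bam
multiqc .
EOF
}

# Show the plan for a hypothetical sample.
print_plan sampleA
```

The pipeline runs these stages for every read pair found and handles the intermediate files itself, so this listing is only a reading aid.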

Requirements

Nextflow
Either Singularity or Docker to run the pipeline in containers. Without containers, the following software/modules must be installed: fastqc, trimgalore, salmon, hisat2, samtools, subread, and multiqc.
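When running without containers, a quick check that the tools are on PATH can save a failed run. The sketch below is one way to do that; the executable names are assumptions derived from the package names above (for example, subread provides featureCounts and trimgalore provides trim_galore), so adjust them if your installation differs.

```shell
#!/bin/sh
# Print each command name from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names are assumptions based on the package names listed above.
missing=$(missing_tools nextflow git fastqc trim_galore salmon hisat2 samtools featureCounts multiqc)

if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing"
    echo 'The analysis tools are only needed when no container profile is used.'
else
    echo 'All required tools were found on PATH.'
fi
```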
Git

Usage

Clone the repository:

git clone git@github.com:SharuPaul/RNASeq.git

Then run this command to print the help statement:

nextflow run main.nf --help

Usage:
   nextflow run main.nf --indir <input data directory> -profile <nextflow profile(s)>

   Mandatory Arguments:         
    --indir                 Path to directory containing input data 

   Input data:      [Will look for data in directory specified in --indir by default, one or more of following 
                    need to be specified if in a different directory, a subdirectory, or in case of error in 
                    finding the data (glob pattern mismatch)]

    --reads                 Paired-end reads (glob pattern, e.g. "rawReads/*_{R1,R2}.fastq.gz")
    --cdna                  Reference cDNA file
    --fasta                 Reference genome fasta file
    --gff                   Reference genome GFF file
   
   Optional Arguments:    [default value]
    --threads               Number of threads [16]
    --outdir                Output directory name [RNAseq_Results]
    --trim_args             Additional arguments for trim_galore ["--fastqc"]
    --salmonindex           Path to salmon index. Provide directory containing prebuilt salmon index files 
                            [If not provided, index is built by default]
    --sal_quant_args        Additional arguments for salmon quant ["--libType=A --validateMappings"]
    --hisatindex            Path to hisat index. Provide directory containing prebuilt Hisat2 index files 
                            [If not provided, Hisat will build an index by default] 
    
   Nextflow Arguments: (notice single "-" instead of double "--") 
    -profile                Nextflow profiles available: singularity, docker, slurm
    -resume                 Resume last run

    --help                  Print this help statement

Run the pipeline using this command:

nextflow run main.nf --indir <input data directory> -profile <nextflow profile(s)>

Prebuilt indexes for Salmon and HISAT2 can be supplied, and additional Nextflow arguments can also be used. By default, the pipeline looks for input data in the directory specified by --indir. If some data is in a different folder or a subfolder and cannot be located automatically, specify its location with the appropriate argument (e.g. --reads or --cdna).
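If the reads are not found, it can help to confirm that the files follow the naming the glob pattern expects. The sketch below lists R1 files that have no matching R2 file. It assumes the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming from the --reads example in the help statement, and `rawReads` is only a placeholder directory.

```shell
#!/bin/sh
# List R1 files in a directory that lack a matching R2 file.
# Assumes names of the form <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
unpaired_reads() {
    dir=$1
    for r1 in "$dir"/*_R1.fastq.gz; do
        # If nothing matched, the pattern stays unexpanded and is not a file.
        [ -e "$r1" ] || continue
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
        [ -e "$r2" ] || printf '%s\n' "$r1"
    done
}

# Placeholder location; pass your reads directory as the first argument.
reads_dir=${1:-rawReads}
if [ -d "$reads_dir" ]; then
    unpaired=$(unpaired_reads "$reads_dir")
    if [ -n "$unpaired" ]; then
        printf 'R1 files without a matching R2:\n%s\n' "$unpaired"
    else
        echo "No unpaired R1 files found in $reads_dir."
    fi
else
    echo "Directory $reads_dir not found; pass the reads directory as the first argument."
fi
```

When passing a pattern through --reads, keep it in quotes as in the help statement so that the shell hands it to Nextflow unexpanded.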
