
Snakemake workflow for RNA-Seq differential transcript analysis using Salmon/DESeq2


niekwit/rna-seq-salmon-deseq2


Snakemake workflow: rna-seq-salmon-deseq2

Snakemake Tests DOI

A Snakemake workflow for wicked-fast paired-end RNA-seq analysis with Salmon and DESeq2.

If you use this workflow in a paper, please credit the authors by citing the URL of this (original) repository and its DOI (see above).

Software dependencies

Usage

Preparing data and code

Create a main analysis directory with the subdirectories config/, reads/, and workflow/.
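
The directory layout described above can be created by hand, or with a few lines of Python. The following is a minimal sketch, not part of the workflow itself; the helper name `create_analysis_dir` and the example directory name are hypothetical.

```python
from pathlib import Path

# Subdirectories named in this README.
SUBDIRS = ("config", "reads", "workflow")


def create_analysis_dir(root):
    """Create the analysis directory with its subdirectories and return their paths.

    `create_analysis_dir` is a hypothetical helper, not part of the workflow.
    """
    root = Path(root)
    created = []
    for name in SUBDIRS:
        path = root / name
        # parents=True also creates the root; exist_ok=True makes reruns harmless
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

For example, `create_analysis_dir("my_analysis")` would create `my_analysis/config`, `my_analysis/reads`, and `my_analysis/workflow`.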

Place all your paired-end fastq files in the reads/ directory. These should have the extensions _R1_001.fastq.gz and _R2_001.fastq.gz for read 1 and read 2, respectively.

The config/ directory should contain two files: config.yml and samples.csv.

Sample metadata is described in samples.csv:

sample             genotype  treatment  reference  batch
Control_1          WT        Normoxia   yes        1
Control_2          WT        Normoxia   yes        1
Control_Hypoxia_1  WT        Hypoxia    no         1
Control_Hypoxia_2  WT        Hypoxia    no         1
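
Before starting a run, it can help to check that the sample sheet has the expected columns. The sketch below assumes that samples.csv is comma-separated with a header row matching the column names shown above; the function `read_samples` and the duplicate-name check are illustrative additions, not something the workflow is known to do.

```python
import csv
import io

# Column names taken from the table above.
REQUIRED_COLUMNS = ["sample", "genotype", "treatment", "reference", "batch"]


def read_samples(handle):
    """Read sample metadata from an open CSV file and run basic sanity checks.

    Hypothetical helper; assumes a comma-separated file with a header row.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"samples.csv is missing columns: {missing}")
    rows = list(reader)
    names = [row["sample"] for row in rows]
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate sample names: {duplicates}")
    return rows


# The example table from this README, written as CSV (assumed format).
EXAMPLE_CSV = """sample,genotype,treatment,reference,batch
Control_1,WT,Normoxia,yes,1
Control_2,WT,Normoxia,yes,1
Control_Hypoxia_1,WT,Hypoxia,no,1
Control_Hypoxia_2,WT,Hypoxia,no,1
"""

samples = read_samples(io.StringIO(EXAMPLE_CSV))
```

With a real file, the same function could be called as `read_samples(open("config/samples.csv", newline=""))`.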

Important

The sample names should correspond to the file names, e.g. Control_1_R1_001.fastq.gz and Control_1_R2_001.fastq.gz for sample Control_1.
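
The naming rule above can be checked before running the workflow. This is a minimal sketch under the assumption that the read files sit directly in the reads directory; `expected_read_files` and `missing_read_files` are hypothetical helper names, not part of the workflow.

```python
from pathlib import Path

# File name suffixes required by this README.
R1_SUFFIX = "_R1_001.fastq.gz"
R2_SUFFIX = "_R2_001.fastq.gz"


def expected_read_files(sample, reads_dir="reads"):
    """Return the (read 1, read 2) paths expected for a sample name."""
    reads_dir = Path(reads_dir)
    return (reads_dir / f"{sample}{R1_SUFFIX}", reads_dir / f"{sample}{R2_SUFFIX}")


def missing_read_files(samples, reads_dir="reads"):
    """List the expected fastq files that do not exist in reads_dir."""
    missing = []
    for sample in samples:
        for path in expected_read_files(sample, reads_dir):
            if not path.is_file():
                missing.append(path)
    return missing
```

Calling `missing_read_files(["Control_1", "Control_2"])` returns an empty list when both files exist for every sample, and otherwise lists the paths that need to be added or renamed.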

Analysis settings and resources

genome: human # human or mouse
gencode_genome_build: 44
fdr_cutoff: 0.05 # adjusted p-value cutoff for volcano plots
fc_cutoff: 0.5 # log2 fold change cutoff for volcano plots
salmon-quant: 
  extra_params: "" # additional arguments to pass to Salmon
salmon-index:
  extra_params: "--gencode"
deseq2:
  # custom model for DESeq2
  design: "" 
resources: # computing resources
  trim:
    cpu: 8
    time: 60
  fastqc:
    cpu: 4
    time: 60
  mapping:
    cpu: 8
    time: 120
  deseq2:
    cpu: 6
    time: 60 
  plotting:
    cpu: 2
    time: 20
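
A few of these settings can be sanity-checked before launching a run. The sketch below assumes the configuration has already been loaded into a dictionary (for example with PyYAML's `yaml.safe_load`). The function `validate_config` is hypothetical, and the rules it applies (an FDR cutoff between 0 and 1, positive integer `cpu` and `time` values) are assumptions drawn from the comments and values above, not checks the workflow is known to perform.

```python
def validate_config(config):
    """Return a list of problems found in a loaded config dictionary.

    Hypothetical helper; the rules below are illustrative assumptions.
    """
    errors = []

    # The config comment allows "human" or "mouse".
    if config.get("genome") not in ("human", "mouse"):
        errors.append("genome must be 'human' or 'mouse'")

    # An adjusted p-value cutoff only makes sense in the interval (0, 1].
    fdr = config.get("fdr_cutoff")
    if not isinstance(fdr, (int, float)) or not 0 < fdr <= 1:
        errors.append("fdr_cutoff must be a number in (0, 1]")

    # Assumption: every resources entry needs positive integer cpu and time values.
    for rule, settings in config.get("resources", {}).items():
        for key in ("cpu", "time"):
            value = settings.get(key)
            if not isinstance(value, int) or value <= 0:
                errors.append(f"resources.{rule}.{key} must be a positive integer")

    return errors


# A subset of the settings shown above, written as a Python dictionary.
example_config = {
    "genome": "human",
    "fdr_cutoff": 0.05,
    "fc_cutoff": 0.5,
    "resources": {
        "mapping": {"cpu": 8, "time": 120},
        "plotting": {"cpu": 2, "time": 20},
    },
}

problems = validate_config(example_config)
```

An empty `problems` list means none of the assumed rules were violated; each entry otherwise names the offending setting.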