This Snakemake pipeline is designed for paired-end NGS DNA data. It takes the following inputs:
- Path of FASTQ files
- Name of output folder
- Reference genome
multiqc_data\
- Directory containing the summary results of all the tools, including multiqc.html

logs\
- Directory of log files for each job; check here first if you run into errors

working\
- Directory containing intermediate files for each job
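Since the log directory is the first place to look after a failure, a small shell helper can list the log files that mention an error. This is only a minimal sketch: the function name `find_error_logs` is hypothetical, and it assumes the logs are plain-text files under `logs/` inside your output folder, which may not match the pipeline's actual log layout exactly.

```shell
# Hypothetical helper: list log files under <output_dir>/logs that mention "error".
# Assumption: logs are plain text and live in <output_dir>/logs.
find_error_logs() {
    # -r recurse into subdirectories, -l print file names only, -i ignore case
    grep -rli "error" "$1/logs" 2>/dev/null | sort
}

# Usage (the path is a placeholder):
#   find_error_logs /path/to/output
```

The helper prints nothing when no log mentions an error or when the directory does not exist.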
- **QC**: FastQC
- **Trimming**: Trim Galore
- **QC**: FastQC
- **Align**: BWA
- **Sort**: samtools
- **Deduplicate**: Picard
- **Summary**: MultiQC
- Install conda

- Clone workflow into working directory

      git clone <repo> <dir>
      cd <dir>

- Create a new environment

      conda env create -n <project_name> --file environment.yaml

- Activate the environment

      conda activate <project_name>

- Enable the Bioconda channels

      conda config --add channels bioconda
      conda config --add channels conda-forge

- Install snakemake

      conda install snakemake

- Edit configuration files

  Change the paths for fastq_dir, output_dir, and reference_genome in "config.yaml".

- Execute the workflow

  The first time you execute this Snakemake pipeline, run it locally (a dry run with --dry-run is enough). Once that first run has finished, you can switch to running it on the cluster. Replace N with the number of cores to use.

      snakemake --configfile "config.yaml" --use-conda --cores N
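
The configuration and first-run steps can be sketched in shell as follows. The three keys are the ones named in this README; the function name `write_config`, the file name `config.example.yaml`, and all paths are placeholders chosen for illustration, and the repository's real "config.yaml" may contain additional keys or a different layout. `--dry-run` is Snakemake's flag for listing the planned jobs without executing them.

```shell
# Sketch: write a minimal config with the three keys named in this README.
# Arguments: <config_path> <fastq_dir> <output_dir> <reference_genome>
# Assumption: the real config.yaml may need more keys than these.
write_config() {
    cat > "$1" <<EOF
fastq_dir: "$2"
output_dir: "$3"
reference_genome: "$4"
EOF
}

# Usage (all paths are placeholders):
#   write_config config.example.yaml /path/to/fastq /path/to/output /path/to/reference.fa
#
# Preview the planned jobs without running them, then run on N cores:
#   snakemake --configfile config.example.yaml --use-conda --cores 1 --dry-run
#   snakemake --configfile config.example.yaml --use-conda --cores N
```

Writing to a separate example file keeps the repository's own "config.yaml" untouched until the values have been checked.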