Project nf-EXPLOR | Nextflow pipeline for EXPloring variation in LOng REad Sequencing

*Currently deployed for cattle SV and SNP discovery in the Bovine Long-Read Consortium (BovLRC)*

Initial setup:

Clone this GitHub repository

git clone https://github.com/tuannguyen8390/AgVic_CLRC.git

This pipeline was developed for use in the Bovine Long-Read Consortium (BovLRC). It deploys multiple bioinformatics tools for the detection of Single Nucleotide Polymorphisms (SNPs) and Structural Variants (SVs). The currently deployed version is 0.0.2. It is designed to handle data from both Oxford Nanopore (ONT) and PacBio, although so far it has only been tested with ONT data. It is written in Nextflow DSL2.

Obtain & install Docker/Shifter/Singularity

Installation guide for Docker can be found here
Installation guide for Shifter can be found here
Installation guide for Singularity can be found here

Pull assets (genome) and perform some initial setup

Run the following command to pull the assets (genome) and perform the initial setup. Choose exactly one of Shifter, Docker or Singularity for the profile.

nextflow run setup.nf -profile shifter/docker/singularity

Test run the pipeline (choose exactly one of Shifter/Docker/Singularity)

Edit nextflow.config to suit your local environment

nextflow run setup.nf -profile shifter/docker/singularity,test

Run the pipeline

nextflow run main.nf -profile shifter/docker/singularity

If you run on AWS, use the following command to run the pipeline instead

nextflow run main.nf -profile shifter/docker/singularity,awsbatch
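
The commands above write the profile as shifter/docker/singularity, which stands for exactly one of the three engines. As a minimal sketch of how that choice could be enforced before launching anything, the hypothetical helper `build_cmd` below (not part of the pipeline) validates the engine name, appends an optional second profile such as test or awsbatch, and only prints the resulting command.

```shell
# Hypothetical helper, not shipped with the pipeline.
# Usage: build_cmd <script.nf> <docker|shifter|singularity> [extra_profile]
build_cmd() {
  script="$1"
  engine="$2"
  extra="${3:-}"

  # Accept exactly one of the three container profiles named in this README.
  case "$engine" in
    docker|shifter|singularity) ;;
    *) echo "unknown container profile: $engine" >&2; return 1 ;;
  esac

  # Nextflow takes several profiles as a comma-separated list.
  profile="$engine"
  if [ -n "$extra" ]; then
    profile="$profile,$extra"
  fi

  echo "nextflow run $script -profile $profile"
}

build_cmd setup.nf docker
build_cmd main.nf singularity awsbatch
```

Printing the command instead of executing it keeps the sketch safe to try; once the output looks right, it can be run directly.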

The pipeline reads two metadata spreadsheets in the meta folder:

metadata_SR.csv : metadata for short-read data

metadata_LR.csv : metadata for long-read data

Use these files as templates when creating your own. To run with your own metadata files, pass --LR_MetaDir and/or --SR_MetaDir.
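
As an illustration, a run with custom metadata might be assembled as in the sketch below. The file names are hypothetical, and the sketch assumes that both parameters take the path to a CSV file; the parameter names suggest they could expect a directory instead, so check nextflow.config before relying on this. The command is printed rather than executed.

```shell
# Hypothetical file names: copy the shipped templates first, e.g.
#   cp meta/metadata_LR.csv meta/my_metadata_LR.csv
# and then edit the copies.
LR_META="meta/my_metadata_LR.csv"
SR_META="meta/my_metadata_SR.csv"

# Assumption: the parameters accept a path to the CSV file.
# docker is used here as the single chosen container profile.
CMD="nextflow run main.nf -profile docker --LR_MetaDir $LR_META --SR_MetaDir $SR_META"

echo "$CMD"
```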

Pipeline overview

QC :

FiltLong : QC for both long reads and short reads (DEFAULT)
NanoFilt + FMLRC2 : NanoFilt for QC of long-read samples, and FMLRC2 + NanoFilt for QC of short-read samples.

Mapping:

Minimap2 : (DEFAULT)
Winnowmap2
NGMLR

SNP Caller: All callers run in parallel and are deployed per chromosome (1 to 29 and X, since the pipeline is currently deployed in cattle)

Clair3 : (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
PEPPER - By default, data from flowcells < 10.4 are analyzed with PEPPER
DEEPVARIANT - By default, data from flowcells >= 10.4 and HiFi data are analyzed with DEEPVARIANT (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
Longshot
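
The per-chromosome deployment and the default flowcell rule listed above can be sketched in a few lines of shell. This is only an illustration of the stated rule, not the pipeline's own code: `pick_caller` is a hypothetical helper, and it assumes the flowcell version is supplied as a plain number such as 9.4 or 10.4.

```shell
# Cattle chromosome labels: autosomes 1..29 plus X (30 labels in total).
chromosomes="$(seq 1 29) X"

# Hypothetical helper mirroring the default rule:
#   flowcell <  10.4 -> PEPPER
#   flowcell >= 10.4 -> DEEPVARIANT
# Assumes a plain numeric version string (e.g. 9.4, 10.4).
pick_caller() {
  if awk -v v="$1" 'BEGIN { exit !(v + 0 >= 10.4) }'; then
    echo "DEEPVARIANT"
  else
    echo "PEPPER"
  fi
}

# One line per chromosome, as each caller is deployed per chromosome.
for chr in $chromosomes; do
  echo "chromosome $chr -> $(pick_caller 10.4)"
done
```

The numeric comparison is delegated to awk because the shell's built-in test only compares integers.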

SV Caller: All callers are run in parallel

Sniffles2 (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
DYSGU (RECOMMENDED FOR DOWNSTREAM ANALYSIS)
CuteSV2 (RECOMMENDED FOR DOWNSTREAM ANALYSIS)

Reporting

PRE/POST QC : NanoPlot
Alignment Depth : Mosdepth

Extra process for Nanopore

PorechopABI

I have no doubt that there will be some problems :). It runs on my end, but perhaps not on yours. If that is the case, please email Tuan Nguyen.
