nCovIllumina

This repository contains a pipeline for processing Illumina nCoV data. It runs an existing artic Nextflow pipeline, then validates the resulting iVar variant calls against variant calls from FreeBayes and against allele frequency thresholds.

Dependencies

This pipeline requires the following:

Conda, for both running the Nextflow pipeline and installing other software for validation
The artic Nextflow pipeline linked above installed in the working directory, or root access so that it can be installed as the first step of the pipeline
Java or openjdk
freebayes
samtools

The included environment.yml file defines a conda environment that includes openjdk, freebayes, samtools, and their dependencies.
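Before running the pipeline, it can help to confirm that the tools listed above are on the PATH. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper, and the executable names passed to it (`conda`, `java`, `freebayes`, `samtools`) are assumed from the dependency list.

```shell
# Hypothetical helper: print the name of every listed command
# that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names are assumptions based on the dependency list.
missing=$(missing_tools conda java freebayes samtools)
if [ -n "$missing" ]; then
    echo "Missing dependencies:" $missing >&2
fi
```

If anything is reported missing, activating the conda environment defined in environment.yml should supply openjdk, freebayes, and samtools.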

Usage

To run the entire pipeline:

./pipeline.sh <datadir>

Here, <datadir> is a directory containing the sequencing reads as zipped FASTQ files.
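A quick sanity check on the input directory can catch an empty or mistyped path before the pipeline starts. This is a sketch under stated assumptions: `count_fastq_gz` and `check_datadir` are hypothetical helpers, and the `.fastq.gz` and `.fq.gz` suffixes, as well as the files sitting directly inside the directory, are guesses about how the reads are named and laid out.

```shell
# Hypothetical helper: count compressed FASTQ files directly inside a directory.
# The *.fastq.gz and *.fq.gz suffixes are assumptions about file naming.
count_fastq_gz() {
    n=0
    for f in "$1"/*.fastq.gz "$1"/*.fq.gz; do
        # An unmatched glob stays literal, so confirm the file exists.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical helper: fail if the directory is missing or has no reads.
check_datadir() {
    if [ ! -d "$1" ]; then
        echo "Not a directory: $1" >&2
        return 1
    fi
    if [ "$(count_fastq_gz "$1")" -eq 0 ]; then
        echo "No compressed FASTQ files found in $1" >&2
        return 1
    fi
}
```

One possible use is `check_datadir "$datadir" && ./pipeline.sh "$datadir"`, so the pipeline only starts when reads are present.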

To run only the iVar Nextflow pipeline (a subroutine of the overall pipeline):

./ivar.sh <datadir>

Outputs

The iVar Nextflow pipeline produces a directory named "results" within the working directory. This contains the following outputs (as well as other subfolders with intermediate files):

ncovIllumina_sequenceAnalysis_makeConsensus contains a FASTA file giving the iVar consensus genome sequences for all individuals (prefix.primertrimmed.consensus.fa)
ncovIllumina_sequenceAnalysis_callVariants contains the iVar variant calls in TSV format (prefix.variants.tsv)
ncovIllumina_sequenceAnalysis_trimPrimerSequences contains the read alignments to the reference after primer trimming is performed (prefix.mapped.primertrimmed.sorted.bam)

In addition, the next steps of the pipeline augment this results folder with the following subfolders:

freebayes contains the variant calls from freebayes (prefix.freebayes.vcf)
samtools contains the mpileup file (prefix.mpileup) as well as the variant calls based on an mpileup allele frequency cutoff of 0.15 (prefix.samtools.vcf)
merging contains a single VCF that merges the calls from all three callers, annotated with allele frequencies (prefix.all_caller_freqs.vcf)

This enables flagging of iVar variants that have lower than expected allele frequencies or that lack support from the other variant callers.
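As an illustration of the allele frequency check, the sketch below lists variants from an iVar TSV whose frequency falls below a threshold. It is not the repository's implementation: `flag_low_freq` is a hypothetical helper, it assumes the TSV header contains an `ALT_FREQ` column, and the default of 0.15 simply mirrors the samtools cutoff mentioned above.

```shell
# Hypothetical helper: print the header plus every variant whose ALT_FREQ
# is below a threshold.
# Usage: flag_low_freq <variants.tsv> [threshold]
flag_low_freq() {
    awk -F '\t' -v min="${2:-0.15}" '
        NR == 1 {
            # Locate the allele frequency column by name, not by position.
            for (i = 1; i <= NF; i++) if ($i == "ALT_FREQ") col = i
            if (!col) { print "ALT_FREQ column not found" > "/dev/stderr"; exit 1 }
            print
            next
        }
        # Force numeric comparison on both sides.
        ($col + 0) < (min + 0) { print }
    ' "$1"
}
```

For example, `flag_low_freq results/ncovIllumina_sequenceAnalysis_callVariants/prefix.variants.tsv 0.15` would print the low-frequency calls for one sample, where `prefix` stands for the sample name as in the output list above.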

Docker container

A Dockerized version of this pipeline can be run using the following commands.

First build the Docker image with:

docker image build . -t ncovillumina

Because Nextflow requires a connection to Docker, we enable sibling containers by binding the Docker socket into our container.

The command below also mounts an Illumina run folder at /data within the container:

docker run --rm -it -v /var/run/docker.sock:/var/run/docker.sock --mount type=bind,source=/home/idies/workspace/covid19/illumina/200421_run1,target=/data ncovillumina bash

Once in the container, run the pipeline on the mounted folder with:

/home/idies/workspace/covid19/code/nCovIllumina/pipeline.sh /data

This writes a results folder to the current working directory, where it can be inspected inside the container.

Once the results have been checked, move them to the mounted data partition so they can be accessed outside the container:

mv results /data
