martinghunt edited this page Aug 9, 2022 · 13 revisions

Installation

Docker

Get a Docker image of the latest release:

docker pull ghcr.io/iqbal-lab-org/minos:latest

All Docker images are listed in the packages page.

Alternatively, build your own Docker image:

sudo docker build --network=host .

Singularity

Releases include a Singularity image to download (from version 0.12.1 onwards).

Alternatively, build your own Singularity image:

singularity build minos.simg Singularity.def

From source

Dependencies:

  • Python 3 (tested on version 3.6.9)
  • gramtools commit 8af53f6c8c0d72ef95223e89ab82119b717044f2
  • bcftools
  • vt
  • vcflib. Either the full vcflib package, or all three of the tools vcfbreakmulti, vcfallelicprimitives, and vcfuniq, must be installed.
  • Optionally, nextflow and ivcfmerge if you want to use the pipeline to regenotype a large number of samples.
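Before installing from source, it can help to check which of the dependencies above are already on your PATH. The following is a minimal sketch, not part of minos: the function name missing_tools is made up for this example, and the executable names passed to it are assumptions that may differ on your system.

```shell
# Report which of the given command names are not found on PATH.
# Prints one missing name per line; prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The executable names below are assumptions; adjust them to your installation.
missing=$(missing_tools python3 gramtools bcftools vt vcfbreakmulti vcfallelicprimitives vcfuniq)
if [ -n "$missing" ]; then
    printf 'Not found on PATH:\n%s\n' "$missing" >&2
fi
```

This only tests that a command with each name exists. It does not check versions or the gramtools commit.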

Install by cloning this repository (or downloading the latest release), and running:

pip3 install .

Quick start

Basic instructions are below. For more detail, please see the help pages on running on one sample and on joint genotyping.

See the Minos test data page for how to test the installation using toy data.

Run on a single sample

To run on one sample, you will need:

  • A FASTA file of the reference genome.
  • One or more VCF files of variant calls. The only requirement of these files is that they must contain the genotype field GT, and correspond to the reference FASTA file. All variants with a non-reference genotype call will be used (both alleles are considered for diploid calls).
  • Illumina reads in FASTQ file(s).
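The GT requirement can be checked before running minos. The sketch below is an illustration only, with a made-up function name vcf_has_gt. It assumes an uncompressed VCF file, and relies on the VCF convention that FORMAT is column 9 and that GT, when present, is the first FORMAT key.

```shell
# Succeed if the VCF has at least one record and every record lists GT
# as the first key of its FORMAT column (column 9). Uncompressed VCF only.
vcf_has_gt() {
    awk -F '\t' '
        /^#/ { next }    # skip header lines
        {
            n++
            if ($9 != "GT" && $9 !~ /^GT:/) bad++
        }
        END { exit ((n > 0 && bad == 0) ? 0 : 1) }
    ' "$1"
}

# Usage with a file name from the example command:
#   vcf_has_gt calls1.vcf || echo "calls1.vcf has records without GT" >&2
```

This does not check that the VCF matches the reference FASTA file, which you still need to confirm yourself.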

For example, if you have two call sets in the files calls1.vcf and calls2.vcf, then run:

minos adjudicate --reads reads1.fq --reads reads2.fq out ref.fasta calls1.vcf calls2.vcf

where reads1.fq and reads2.fq are FASTQ files of the reads and ref.fasta is a FASTA of the reference corresponding to the two input VCF files. The final call set will be out/final.vcf.

You can supply one or more VCF files as input. Minos uses every VCF file listed at the end of the command, so the number does not have to be exactly two as in the example.
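To illustrate how the VCF files are simply appended to the end of the command, here is a small dry-run sketch. The helper adjudicate_cmd is hypothetical and only prints the command instead of running it. The file names, including calls3.vcf, are placeholders.

```shell
# Print (rather than run) a minos adjudicate command.
# Arguments: output directory, reference FASTA, then any number of VCF files.
adjudicate_cmd() {
    outdir=$1
    ref=$2
    shift 2
    printf 'minos adjudicate --reads reads1.fq --reads reads2.fq %s %s' "$outdir" "$ref"
    for vcf in "$@"; do
        printf ' %s' "$vcf"
    done
    printf '\n'
}

# Three call sets instead of two; the VCF files always come last.
adjudicate_cmd out ref.fasta calls1.vcf calls2.vcf calls3.vcf
```

Once the printed command looks right, run it directly in your shell.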

Joint genotype many samples

For each sample, you will need a name, a VCF file of calls, and a sorted, indexed BAM file of reads. Put this information in a tab-delimited file with the column names name, vcf, and reads. This file is called manifest.tsv in the example command below.
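One way to write such a manifest from the shell is sketched below. The sample names and file paths are placeholders, and the final awk check is an optional extra that only confirms each line has three tab-separated fields.

```shell
# Write a tab-delimited manifest with the required header: name, vcf, reads.
# Sample names and file paths are placeholders; replace them with your own.
manifest=manifest.tsv
printf 'name\tvcf\treads\n' > "$manifest"
printf '%s\t%s\t%s\n' sample1 /path/to/sample1.vcf /path/to/sample1.bam >> "$manifest"
printf '%s\t%s\t%s\n' sample2 /path/to/sample2.vcf /path/to/sample2.bam >> "$manifest"

# Sanity check: every line should have exactly three tab-separated fields.
awk -F '\t' '
    NF != 3 { print "line " NR ": expected 3 fields, found " NF; bad = 1 }
    END { exit (bad ? 1 : 0) }
' "$manifest"
```

Using printf rather than typing the file by hand avoids accidentally separating columns with spaces instead of tabs.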

With minos installed, use the nextflow file nextflow/regenotype.nf and the nextflow config file nextflow/config.nf, both found in the minos repository. The command is:

nextflow run \
  -c nextflow/config.nf \
  -profile medium \
  nextflow/regenotype.nf \
  --ref_fasta <PATH/TO/REFERENCE.FASTA> \
  --manifest manifest.tsv \
  --outdir <PATH/TO/OUTPUT/DIRECTORY>

Replace the value of --ref_fasta with the path to the reference FASTA file, and the value of --outdir with the name of the output directory. The pipeline creates this directory, so it should not already exist.
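Because the output directory should not already exist, a quick pre-flight check can save a failed run. This is a minimal sketch with a made-up helper name, outdir_is_free, and a hypothetical output path.

```shell
# Succeed only if the given output directory path does not exist yet.
outdir_is_free() {
    if [ -e "$1" ]; then
        printf 'Output directory already exists: %s\n' "$1" >&2
        return 1
    fi
    return 0
}

# Hypothetical output path; replace it with your own.
outdir=regenotype_out
if outdir_is_free "$outdir"; then
    echo "OK to pass --outdir $outdir to the pipeline"
fi
```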

To use Singularity instead of a local minos installation, add the option -with-singularity minos_container.simg to the nextflow command.

Citation

Minos: variant adjudication and joint genotyping of cohorts of bacterial genomes.

Martin Hunt, Brice Letcher, Kerri M Malone, Giang Nguyen, Michael B Hall, Rachel M Colquhoun, Leandro Lima, Michael Schatz, Srividya Ramakrishnan, The CRyPTIC Consortium, Zamin Iqbal. Genome Biology 23, 147 (2022).

doi: https://doi.org/10.1186/s13059-022-02714-x