From 5a7ee7996dfe6527f7665e09718a141436e8e7d3 Mon Sep 17 00:00:00 2001
From: gbouras13
Date: Sun, 28 Apr 2024 14:41:17 +0930
Subject: [PATCH] colab

---
 README.md   | 45 ++++++++++++++++++++++++++++++++++++++-------
 docs/run.md |  4 ++++
 2 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index bd44979..809ea97 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,5 @@
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gbouras13/plassembler/blob/main/run_plassembler.ipynb)
+
 [![Paper](https://img.shields.io/badge/paper-Bioinformatics-teal.svg?style=flat-square&maxAge=3600)](https://doi.org/10.1093/bioinformatics/btad409)
 [![CI](https://github.com/gbouras13/plassembler/actions/workflows/ci.yaml/badge.svg)](https://github.com/gbouras13/plassembler/actions/workflows/ci.yaml)
 [![BioConda Install](https://img.shields.io/conda/dn/bioconda/plassembler.svg?style=flag&label=BioConda%20install)](https://anaconda.org/bioconda/plassembler)
@@ -10,19 +12,15 @@
 [![Downloads](https://static.pepy.tech/badge/plassembler)](https://pepy.tech/project/plassembler)
 [![DOI](https://zenodo.org/badge/514596389.svg)](https://zenodo.org/doi/10.5281/zenodo.10035954)

-
 # plassembler

 ## Automated Bacterial Plasmid Assembly Program

-`plassembler` is a program that is designed for automated & fast assembly of plasmids in bacterial genomes that have been hybrid sequenced with long read & paired-end short read sequencing. It was originally designed for Oxford Nanopore Technologies long reads, but will also work with Pacbio reads. As of v1.3.0, it should also work well for long-read only assembled genomes (although we would still recommend getting short reads too if you can).
+`plassembler` is a program that is designed for automated & fast assembly of plasmids in bacterial genomes that have been hybrid sequenced with long read & paired-end short read sequencing. It was originally designed for Oxford Nanopore Technologies long reads, but it will also work with PacBio reads. As of v1.3.0, it also works well for long-read only assembled genomes.

-If you are assembling a small number of bacterial genomes manually, I would recommend starting by using [Trycycler](https://github.com/rrwick/Trycycler) to recover the chromosome before using Plassembler to recover plasmids, especially the small ones. If you have more genomes or want to assemble your genomes in a more automated way, try [dragonflye](https://github.com/rpetit3/dragonflye), especially if you are used to Shovill, or even better my own pipeline [hybracter](https://github.com/gbouras13/hybracter) that is more appropriate for large datasets and implemented Plassembler in it.
+If you are assembling a small number of bacterial genomes manually, I would recommend starting by using [Trycycler](https://github.com/rrwick/Trycycler) to recover the chromosome before using Plassembler to recover plasmids, especially the small ones.

-Additionally, I would recommend reading the following guides to bacterial genome assembly regardless of whether you want to use Plassembler:
-* [Trycycler](https://github.com/rrwick/Trycycler/wiki/Guide-to-bacterial-genome-assembly)
-* [Perfect Bacterial Assembly Tutorial](https://github.com/rrwick/Perfect-bacterial-genome-tutorial)
-* [Perfect bacterial assembly Paper](https://doi.org/10.1371/journal.pcbi.1010905)
+Otherwise, I recommend you _don't_ actually use Plassembler by itself. If you have more genomes or want to assemble your genomes in a more automated way, **I would recommend [Hybracter](https://github.com/gbouras13/hybracter)**. If you use Hybracter, you will not need to use Plassembler separately, as it is built in. But please still [cite](#citations) Plassembler.
 ## Quick Start

@@ -40,6 +38,33 @@ And finally run `plassembler`:

 Please read the [Installation](#installation) section for more details, especially if you are an inexperienced command line user.

+### Container
+
+Alternatively, a Docker/Singularity Linux container image is available for Plassembler (starting from v1.6.2) [here](https://quay.io/repository/gbouras13/plassembler). This will likely be useful for running Plassembler in HPC environments.
+
+To install and run v1.6.2 with Singularity:
+
+```bash
+
+IMAGE_DIR=""
+singularity pull --dir $IMAGE_DIR docker://quay.io/gbouras13/plassembler:1.6.2
+
+containerImage="$IMAGE_DIR/plassembler_1.6.2.sif"
+
+# example command with test fastqs
+singularity exec $containerImage plassembler download -d plassembler_db
+singularity exec $containerImage plassembler run -l test_data/Fastqs/test_long_reads.fastq.gz \
+  -1 test_data/Fastqs/test_short_reads_R1.fastq.gz -2 test_data/Fastqs/test_short_reads_R2.fastq.gz -d plassembler_db \
+  -o output_test_singularity -t 4 -c 50000
+```
+
+### Google Colab Notebook
+
+If you don't want to install `plassembler` locally, you can run it without any code using the Colab notebook [https://colab.research.google.com/github/gbouras13/plassembler/blob/main/run_plassembler.ipynb](https://colab.research.google.com/github/gbouras13/plassembler/blob/main/run_plassembler.ipynb)
+
+This is only recommended if you have one or a few samples to assemble (it takes a while per sample due to the limited nature of Google Colab resources - probably an hour or two a sample). If you have more than this, a local install is recommended.
+
 ## Manuscript

 `plassembler` has been recently published in *Bioinformatics*:
@@ -57,6 +82,8 @@ The full documentation for Plassembler can be found [here](https://plassembler.r
 - [plassembler](#plassembler)
 - [Automated Bacterial Plasmid Assembly Program](#automated-bacterial-plasmid-assembly-program)
 - [Quick Start](#quick-start)
+ - [Container](#container)
+ - [Google Colab Notebook](#google-colab-notebook)
 - [Manuscript](#manuscript)
 - [Documentation](#documentation)
 - [Table of Contents](#table-of-contents)
@@ -146,6 +173,10 @@ Please see [here](docs/multiple_chromosomes.md) for more details and an example.
 * If you have sufficient hybrid sequencing data, Plassembler will theoretically recover assemblies of all non-chromosomal replicons, including phages and phage-plasmids
 * A good example of this is the _Vibrio campbellii DS40M4_ example, where Plassembler recovered the assembly of phage phiX174, albeit it was from sequencing spike-in contamination in that case.

+5. Plasmid Only Assembly
+
+* You can also use Plassembler for plasmid-only assembly by passing `--no_chromosome`. Use this if your reads only contain plasmids that you would like to assemble.
+
 ## Quality Control

 * `plassembler` can also be used for quality control to test whether your long and short read sets come from the same isolate, even within the same species.
diff --git a/docs/run.md b/docs/run.md
index c0673a9..7698828 100644
--- a/docs/run.md
+++ b/docs/run.md
@@ -53,6 +53,10 @@ To use assembled mode to calculate plasmid copy numbers, you need to use `plasse

 `plassembler assembled -d -l -o -1 < short read R1 fastq> -2 < short read R2 fastq> -c -t -a --input_chromosome --input_plasmids `

+You can also use Plassembler for plasmid-only assembly by passing `--no_chromosome`. Use this if your reads only contain plasmids that you would like to assemble.
+
+`plassembler run -d -l -o -1 < short read R1 fastq> -2 < short read R2 fastq> -t --no_chromosome`
+
 ```
 Usage: plassembler run [OPTIONS]