GitHub - ablab/metaGT: Pipeline for metagenome and metatranscriptome joint assembly

Assembly and quantification of metatranscriptomes using metagenome data.

Version: see VERSION

Introduction

MetaGT is a bioinformatics analysis pipeline for improving and quantifying metatranscriptome assemblies using metagenome data. The pipeline supports Illumina sequencing data as well as complete metagenome and metatranscriptome assemblies. It aligns the metatranscriptome assembly to the metagenome assembly and then extracts the CDSs that are covered by transcripts.

The pipeline is built using Nextflow, a workflow tool for running tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process, which makes it much easier to maintain and update software dependencies.

Quick Start

Install Nextflow
Install Conda for full pipeline reproducibility
Download the pipeline, e.g. by cloning the metaGT GitHub repository:
```
git clone git@github.com:ablab/metaGT.git
```
Test it on a minimal dataset by running:
```
nextflow run metaGT -profile test,conda
```

Start running your own analysis!
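
Before running the steps above, it can help to confirm that the required tools are on your `PATH`. The following is a minimal sketch, not part of metaGT itself: the helper name `check_tool` is hypothetical, and the tool list (`git`, `nextflow`, `conda`) is simply taken from the Quick Start steps.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with metaGT): report whether a command is on PATH.
check_tool() {
    # `command -v` is the POSIX way to test whether a command is available.
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Tools named in the Quick Start steps.
status=0
for tool in git nextflow conda; do
    check_tool "$tool" || status=1
done
echo "prerequisite check finished (status $status)"
```

A final status of 0 means all three tools were found; a status of 1 means at least one of them still needs to be installed.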

Typical command for analysis using reads:

nextflow run metaGT -profile <conda> --dna_reads '*_R{1,2}.fastq.gz' --rna_reads '*_R{1,2}.fastq.gz'

Typical command for analysis using multiple files with reads:

nextflow run metaGT -profile <conda> --dna_reads '*.yaml' --rna_reads '*.yaml' --yaml

Typical command for analysis using assemblies:

nextflow run metaGT -profile <conda> --genome '*.fasta' --transcriptome '*.fasta'
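
The file patterns in the commands above are wrapped in single quotes. Without the quotes, the shell would expand the pattern itself and pass a list of file names instead of a single pattern. The sketch below illustrates only this shell behaviour, using hypothetical file names in a temporary directory. It assumes, as is common for Nextflow pipelines, that metaGT resolves the quoted pattern on its own.

```shell
#!/bin/sh
# Create two hypothetical assembly files in a scratch directory.
workdir=$(mktemp -d)
touch "$workdir/a.fasta" "$workdir/b.fasta"

# Print the number of arguments received.
count_args() { echo "$#"; }

# Unquoted: the shell expands the glob, so two arguments are passed.
unquoted=$(count_args "$workdir"/*.fasta)

# Quoted: the pattern is passed through as one literal argument.
quoted=$(count_args "$workdir/*.fasta")

echo "unquoted pattern -> $unquoted argument(s)"
echo "quoted pattern   -> $quoted argument(s)"

rm -rf "$workdir"
```

With an unquoted pattern, only the first expanded file name would be treated as the option value, which is why the quoted form is used in the commands above.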

Pipeline Summary

Optionally, if raw reads are used:

Sequencing quality control (FastQC)
Assembly of the metagenome or metatranscriptome (metaSPAdes, rnaSPAdes)

By default, the pipeline currently performs the following:

Annotating the metagenome (Prokka)
Aligning the metatranscriptome to the metagenome (minimap2)
Annotating unaligned transcripts (TransDecoder)
Clustering covered CDSs and CDSs from unaligned transcripts (MMseqs2)
Quantifying transcript abundances (kallisto)

Citation

MetaGT was developed by Daria Shafranskaya and Andrey Prjibelski. If you use it in your research please cite:

MetaGT: A pipeline for de novo assembly of metatranscriptomes with the aid of metagenomic data

Feedback and bug report

If you have any questions, please open an issue on our GitHub page.
