GitHub - Geize/BactGenomeAnalysis: Bacterial genome assembly and prediction - snakemake

Bacterial genome assembly and prediction - snakemake

Hi Folks! 😀

First things first. Our mantra 🕉️: This repository is not a tutorial. It exists to reproduce my work. However, you are more than welcome to use this workflow. And if you find an error, you are also welcome to complain (but not too much 😊). I'll be glad to fix it.

The workflow performs the following steps:

Quality control of Illumina MiSeq paired-end (PE) reads - FastQC.
Trimming of the raw reads - Trimmomatic.
De novo assembly of the quality-filtered paired-end reads - SPAdes.
Quality assessment of the assembled genome - QUAST.
Detection of chimerism or contamination - GUNC.
Gene prediction and annotation - Prokka.
Each output folder is named after the tool that produced it.

Important points:

Create a folder named reads/ and move your "fastq.gz" files into it. Then rename them to {dadada}_1.fastq.gz and {dadada}_2.fastq.gz, where {dadada} stands for the sample name and _1 and _2 mark the two files of each read pair.
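
Before running the workflow, it can help to confirm that every sample in reads/ has both files of its pair. The snippet below is a minimal sketch of such a check, not part of the workflow itself: the helper name `find_paired_samples` is hypothetical, and it assumes only the `{sample}_1.fastq.gz` / `{sample}_2.fastq.gz` naming described above.

```python
from pathlib import Path
import re

# Assumption: reads follow the <sample>_1.fastq.gz / <sample>_2.fastq.gz naming.
READ_PATTERN = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fastq\.gz$")


def find_paired_samples(reads_dir):
    """Return (paired, unpaired) lists of sample names found in reads_dir."""
    mates = {}
    for path in Path(reads_dir).iterdir():
        match = READ_PATTERN.match(path.name)
        if match:
            # Record which mates (1 and/or 2) exist for each sample name.
            mates.setdefault(match["sample"], set()).add(match["mate"])
    paired = sorted(s for s, m in mates.items() if m == {"1", "2"})
    unpaired = sorted(s for s, m in mates.items() if m != {"1", "2"})
    return paired, unpaired


if __name__ == "__main__":
    # Only report when a reads/ folder exists in the current directory.
    if Path("reads").is_dir():
        paired, unpaired = find_paired_samples("reads")
        print("Complete pairs:", paired)
        print("Missing a mate:", unpaired)
```

Files that do not match the naming pattern are ignored, so a sample that is missing from both lists has probably not been renamed yet.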

From my repository, download the file with all PE adapter sequences from the adapters folder (needed for the trimming step) and the GenomeAnalysis.yaml file from the env folder, which recreates the same environment that I use to process my data.

$ conda env create -n snake -f GenomeAnalysis.yaml

$ conda activate snake

Now, everything is ready to run the workflow.

**Additional information:**🔥

SPAdes is still the best assembler for bacterial genome assembly (assuming you are using PE reads). That's why you won't find another assembler as a second option. However, if you still want to try another assembler, it is very easy to add a new rule or replace the current one in the workflow (but you'll be in charge of doing it 😄).

QUAST - Gives an idea of how good your assembly is. However, QUAST is not set up in this workflow to compare genome assemblies. I guess you can easily get a better comparison by going directly to the NCBI genome.

GUNC - This is a new tool for detecting and quantifying chimerism. In my opinion, it is better than CheckM. Don't forget to specify the path to the GUNC database installed on your computer/server.

All the best for us.

Latest commit: v 1.0.1 by Geize (f2dba99), Jan 15, 2023. History: 12 commits.

Name	Last commit message	Last commit date
adapters	v 1.0.0	Nov 16, 2022
env	v 1.0.0	Nov 16, 2022
workflow	v 1.0.1	Jan 15, 2023
.DS_Store	v 1.0.1	Jan 15, 2023
.Rhistory	v 1.0.0	Nov 16, 2022
README.html	Typos	Jan 2, 2023
README.md	v 1.0.1	Jan 15, 2023
