Nextflow pipeline for preprocessing of TnSeq data

TnPrep

Description

TnPrep is a Nextflow pipeline for quality control (QC), mapping, and counting of Himar1 mariner transposon insertion sequencing (Tn-seq) reads at positions within a supplied bacterial reference genome, following the schema outlined in .

The pipeline outputs individual count files in .wig format, containing the insertion counts mapped to every TA site found in the reference genome, along with QC information in the form of a MultiQC report. The .wig count files are compatible with TRANSIT and other tools for downstream Tn-seq data processing and analysis.
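As a quick sanity check on an output file, the sketch below summarizes a .wig file with awk. It assumes the TRANSIT-style layout (comment and `variableStep` header lines followed by `position count` rows, one row per TA site); the function name `wig_summary` and the file name in the usage comment are illustrative and not part of TnPrep.

```shell
# Summarize a .wig count file: number of TA sites, sites with at least one
# insertion count, and the sum of all counts.
# Assumption: data rows look like "<position> <count>"; any line whose first
# field is not an integer (comments, variableStep header) is skipped.
wig_summary() {
    awk '
        $1 !~ /^[0-9]+$/ || NF < 2 { next }
        { sites++; total += $2; if ($2 > 0) occupied++ }
        END { printf "ta_sites=%d occupied=%d total_count=%d\n", sites, occupied, total }
    ' "$1"
}

# Usage (hypothetical file name):
#   wig_summary results/sample1.wig
```

The fraction `occupied / ta_sites` gives a rough insertion density for the library, which is often worth checking before moving on to downstream analysis.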

Requirements

1. A POSIX-compatible system (Linux, OS X, WSL (tested on Ubuntu), etc.)

2. Java 11 or later (up to 18)

3. Install Nextflow (>=22.10.7). Older versions may work but are untested. (This tutorial can be helpful for setting up an environment to run Nextflow on Windows; just skip the dev tool installations.)

4. Install any of Docker, Podman, or Singularity (tutorial can be found here)

5. Tn-seq FASTA files in gzip-compressed .fa.gz format
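The requirements above can be checked from a terminal before the first run. This is a minimal sketch that only tests whether each tool is on the `PATH`; the helper name `check_tool` is illustrative, and the script does not verify version numbers.

```shell
# Return success if the named command is available on PATH.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Java and Nextflow are both required.
for tool in java nextflow; do
    check_tool "$tool" || echo "missing: $tool" >&2
done

# Any one container engine is sufficient.
if check_tool docker || check_tool podman || check_tool singularity; then
    echo "container engine found"
else
    echo "missing: install one of docker, podman, or singularity" >&2
fi
```

To compare the installed versions against the list above, run `java -version` and `nextflow -version`.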

Running TnPrep

TnPrep can be automatically fetched or updated directly using the following command:

nextflow pull MDHowe4/TnPrep

The pipeline is also fetched automatically the first time it is run on a directory containing Tn-seq data in a compatible format. Running TnPrep requires an input directory, an output directory, and a reference genome in FASTA format:

nextflow run MDHowe4/TnPrep -profile <docker|singularity|podman> \
                            --input </path/to/input_file_directory> \
                            --genome </path/to/fasta_DNA_reference> \
                            --output </path/to/output_directory>

Parameters:

--input: Path to the input files directory

--genome: Absolute path to the DNA reference file in FASTA format

--output: Path to the output file directory
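Because `--genome` expects an absolute path, a small helper can convert a relative path before launching the pipeline. The sketch below uses only POSIX utilities; the helper name `abs_path` and all directory and file names in the usage comment are hypothetical.

```shell
# Print the absolute path of an existing file, given a relative or absolute path.
abs_path() {
    printf '%s/%s\n' "$(cd "$(dirname "$1")" && pwd)" "$(basename "$1")"
}

# Usage (hypothetical paths; choose the profile that matches your container engine):
#   nextflow run MDHowe4/TnPrep -profile docker \
#       --input ./reads \
#       --genome "$(abs_path ./reference/genome.fa)" \
#       --output ./results
```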

NOTE: All files in the input file directory should be in the same file format for compatibility with this pipeline.
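One way to confirm this before a run is to list any file in the input directory that does not carry the expected suffix. This is a minimal sketch; the function name `non_matching_files` is illustrative, and the `.fa.gz` suffix in the usage comment is taken from the requirements above.

```shell
# Print every regular file in a directory whose name does not end with the
# given suffix. No output means all files share the suffix.
non_matching_files() {
    dir=$1
    suffix=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        case "$f" in
            *"$suffix") ;;                  # expected format, nothing to report
            *) printf '%s\n' "$f" ;;        # unexpected file, print its path
        esac
    done
}

# Usage (hypothetical directory):
#   non_matching_files ./reads .fa.gz
```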

NOTE: The first time you execute this pipeline, it may take some time to fetch TnPrep from the GitHub repository and download the container image with the dependencies needed to run TnPrep.

Pipeline Schema

tba

Software

| Program | Version |
| --- | --- |
| fastqc | 0.11.9 |
| cutadapt | 4.1 |
| bowtie2 | 2.5.1 |
| multiqc | 1.14 |
| biopython | 1.81 |
