Q2_ITSxpress: A Qiime2 plugin to rapidly trim the internally transcribed spacer (ITS) region from FASTQ files
- Adam R. Rivers, US Department of Agriculture, Agricultural Research Service
- Kyle C. Weber, US Department of Agriculture, Agricultural Research Service
Rivers AR, Weber KC, Gardner TG et al. ITSxpress: Software to rapidly trim internally transcribed spacer sequences with quality scores for marker gene analysis. F1000Research 2018, 7:1418. doi: 10.12688/f1000research.15704.1
The internally transcribed spacer (ITS) is a region between the small subunit and large subunit rRNA genes. It is a commonly used phylogenetic marker for Fungi and other Eukaryotes. The ITS contains the 5.8S gene and two variable-length spacer regions. In amplicon sequencing studies it is common practice to trim off the conserved (SSU, 5.8S or LSU) regions. Bengtsson-Palme et al. (2013) published a software package, ITSx, to do this.
Q2_ITSxpress extends this work by rapidly trimming FASTQ sequences within Qiime2. Q2_ITSxpress is the Qiime2 plugin version of the standalone command line utility ITSxpress. It is designed to support the calling of exact sequence variants rather than OTUs. This newer method of sequence error correction requires quality score data from each sequence, so each input sequence must be trimmed. ITSxpress makes this possible by taking FASTQ data, de-replicating the sequences, then identifying the start and stop sites using HMMSearch. Results are parsed and the trimmed files are returned. The ITS1, ITS2 or the entire ITS region including the 5.8S rRNA gene can be selected. ITSxpress uses the HMM models from ITSx, so results are nearly identical.
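The de-replication step mentioned above can be illustrated with a toy sketch. This is not how ITSxpress implements it. It only shows, using standard Unix tools and a made-up three-read FASTQ file, why collapsing identical sequences reduces the number of sequences that have to be searched. The file names and reads are hypothetical.

```shell
# Work in a temporary directory so nothing in the current folder is touched.
workdir=$(mktemp -d)

# Toy FASTQ file (hypothetical data): three reads, two with the same sequence.
cat > "$workdir/reads.fastq" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
ACGTACGT
+
IIIIIIII
@read3
TTGACCAA
+
IIIIIIII
EOF

# In a 4-line FASTQ record, line 2 is the sequence.
# Count how often each distinct sequence occurs, most abundant first.
awk 'NR % 4 == 2' "$workdir/reads.fastq" | sort | uniq -c | sort -k1,1nr > "$workdir/derep_counts.txt"
cat "$workdir/derep_counts.txt"

# Compare the number of reads with the number of unique sequences.
total=$(awk 'NR % 4 == 2 {n++} END {print n+0}' "$workdir/reads.fastq")
unique=$(awk 'END {print NR}' "$workdir/derep_counts.txt")
echo "$total reads collapse to $unique unique sequences"
```

Only the unique sequences would need to be searched for start and stop sites; the positions found can then be applied to every read that shares that sequence.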
- Qiime2 is required to run Q2_ITSxpress (for standalone software see ITSxpress)
- To install Qiime2 follow these instructions: https://docs.qiime2.org/2019.10/install/
- Activate the Qiime2 conda environment
source activate qiime2-2019.10
- Install Q2_ITSxpress using BioConda. Be sure to install Q2_ITSxpress in the Qiime2 environment.
conda install -c bioconda itsxpress
pip install q2-itsxpress
- In your Qiime2 environment, refresh the plugins.
qiime dev refresh-cache
- Check to see if the ITSxpress plugin is installed. The command below should print the plugin's help text and its list of commands.
qiime itsxpress
Within Qiime2 you can trim paired-end or single-end reads using these commands
qiime itsxpress trim-pair
qiime itsxpress trim-pair-output-unmerged
qiime itsxpress trim-single
- qiime itsxpress trim-single
This command takes single-end data and returns trimmed reads. The sequences may have been merged previously or may have been generated by a long-read technology such as PacBio. Merged and long reads trimmed by this function can be used by Deblur, but only long reads (not merged reads) trimmed by this function should be passed to Dada2, because Dada2's statistical model for estimating error rates was not designed for pre-merged reads.
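Command-requirement | Description |
--i-per-sample-sequences | The input artifact (.qza) containing the sequences to trim |
--p-region | The region to keep: ITS1, ITS2 or the entire ITS region |
--p-taxa | The taxonomic group sequenced, given as one of the codes in the taxa table below |
--p-threads | The number of CPU threads to use |
--o-trimmed | The output artifact (.qza) of trimmed sequences |
--cluster-id | The percent identity used when de-replicating (clustering) reads |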
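A minimal sketch of a trim-single call, using only the options listed above. The artifact names single.qza and trimmed_single.qza, the wrapper function trim_single and the choice of region, taxa and thread count are assumptions made for illustration. The guard skips the call when qiime or the input file is not available.

```shell
# Hypothetical artifact names; replace them with your own files.
input="single.qza"
output="trimmed_single.qza"

# Illustrative wrapper: trim the ITS2 region of fungal reads with two threads.
# $1 = input artifact, $2 = output artifact.
trim_single() {
  qiime itsxpress trim-single \
    --i-per-sample-sequences "$1" \
    --p-region ITS2 \
    --p-taxa F \
    --p-threads 2 \
    --o-trimmed "$2"
}

# Only run when Qiime2 is on the PATH and the input artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f "$input" ]; then
  trim_single "$input" "$output"
else
  echo "Skipping: qiime or $input not found"
fi
```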
- qiime itsxpress trim-pair
This command takes paired-end data and returns merged, trimmed reads. The merged reads trimmed by this function can be used by Deblur but not Dada2, because Dada2's statistical model for estimating error rates was not designed for pre-merged reads. For Dada2, use qiime itsxpress trim-pair-output-unmerged instead.
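Command-requirement | Description |
--i-per-sample-sequences | The input artifact (.qza) containing the paired-end sequences to trim |
--p-region | The region to keep: ITS1, ITS2 or the entire ITS region |
--p-taxa | The taxonomic group sequenced, given as one of the codes in the taxa table below |
--p-threads | The number of CPU threads to use |
--o-trimmed | The output artifact (.qza) of merged, trimmed sequences |
--cluster-id | The percent identity used when de-replicating (clustering) reads |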
- qiime itsxpress trim-pair-output-unmerged
This command takes paired-end data and returns unmerged, trimmed reads. The unmerged reads trimmed by this function can be used by Dada2 but not Deblur. For Deblur, use qiime itsxpress trim-pair.
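Command-requirement | Description |
--i-per-sample-sequences | The input artifact (.qza) containing the paired-end sequences to trim |
--p-region | The region to keep: ITS1, ITS2 or the entire ITS region |
--p-taxa | The taxonomic group sequenced, given as one of the codes in the taxa table below |
--p-threads | The number of CPU threads to use |
--o-trimmed | The output artifact (.qza) of unmerged, trimmed sequences |
--cluster-id | The percent identity used when de-replicating (clustering) reads |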
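A sketch of how the unmerged output might be passed on to Dada2, as described above. The artifact names, the output directory dada2out and both wrapper functions are assumptions for illustration. Setting both Dada2 truncation lengths to 0 (no truncation) is also an assumption, made here because trimmed ITS reads vary in length; choose values that suit your data.

```shell
# Hypothetical names; replace them with your own files.
input="paired.qza"
trimmed="trimmed_unmerged.qza"
dada2_out="dada2out"

# Illustrative wrapper: trim ITS2 from fungal paired-end reads, keep them unmerged.
# $1 = input artifact, $2 = output artifact.
trim_pair_unmerged() {
  qiime itsxpress trim-pair-output-unmerged \
    --i-per-sample-sequences "$1" \
    --p-region ITS2 \
    --p-taxa F \
    --p-threads 2 \
    --o-trimmed "$2"
}

# Illustrative wrapper: denoise the unmerged reads with Dada2.
# Truncation length 0 disables truncation (an assumption, see the text above).
# $1 = trimmed artifact, $2 = output directory.
denoise_with_dada2() {
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs "$1" \
    --p-trunc-len-f 0 \
    --p-trunc-len-r 0 \
    --output-dir "$2"
}

# Only run when Qiime2 is on the PATH and the input artifact exists.
if command -v qiime >/dev/null 2>&1 && [ -f "$input" ]; then
  trim_pair_unmerged "$input" "$trimmed" && denoise_with_dada2 "$trimmed" "$dada2_out"
else
  echo "Skipping: qiime or $input not found"
fi
```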
A | Alveolata | |
B | Bryophyta | |
C | Bacillariophyta | |
D | Amoebozoa | |
E | Euglenozoa | |
F | Fungi | |
G | Chlorophyta (green algae) | |
H | Rhodophyta (red algae) | |
I | Phaeophyceae (brown algae) | |
L | Marchantiophyta (liverworts) | |
M | Metazoa | |
O | Oomycota | |
P | Haptophyceae (prymnesiophytes) | |
Q | Raphidophyceae | |
R | Rhizaria | |
S | Synurophyceae | |
T | Tracheophyta (higher plants) | |
U | Eustigmatophyceae | |
ALL | All |
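When the taxa code comes from a script variable, it can help to check it against the table above before starting a long run. The helper below is a hypothetical convenience and not part of the plugin; it simply accepts the codes listed in the table.

```shell
# Succeeds only for the taxa codes listed in the table above.
# Note that the codes are upper case and that J, K and N are not used.
is_valid_taxa() {
  case "$1" in
    A|B|C|D|E|F|G|H|I|L|M|O|P|Q|R|S|T|U|ALL) return 0 ;;
    *) return 1 ;;
  esac
}

taxa="F"
if is_valid_taxa "$taxa"; then
  echo "Taxa code $taxa is in the table"
else
  echo "Unknown taxa code: $taxa" >&2
fi
```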
Use case: Trimming the ITS2 region from a fungal amplicon sequencing dataset stored as a SampleData[PairedEndSequencesWithQuality] qza, using two CPU threads. The example file used is paired.qza in the Tests folder.
qiime itsxpress trim-pair --i-per-sample-sequences ~/paired.qza --p-region ITS2 \
--p-taxa F --p-threads 2 --o-trimmed ~/Desktop/out.qza
This software is a work of the United States Department of Agriculture, Agricultural Research Service and is released under a Creative Commons CC0 public domain attribution.