Martin Asser Hansen edited this page Oct 2, 2015 · 5 revisions

Biopiece: invert_align

Description

invert_align helps locate mismatches and other differences between the reference sequence (the first sequence in the alignment) and the remaining sequences. Inversion can be 'hard', where matching residues are shown as -, or 'soft', where matching residues are shown in lower case. In both cases, mismatches are shown as capital letters, and gaps or missing sequence are shown as _.
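The inversion rule can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the Biopieces implementation: the function name invert_rows and its soft parameter are hypothetical. It assumes that all rows have equal length and that a column where both the reference and the other sequence have a gap counts as a match, which agrees with the example output in the Examples section.

```python
def invert_rows(seqs, soft=False):
    """Invert aligned sequences against the first (reference) sequence.

    Hypothetical sketch, not the Biopieces code. Assumes equal-length rows.
    """
    # Gaps and missing sequence are rendered as '_' in every row.
    rows = [s.replace("-", "_") for s in seqs]
    ref = rows[0]
    out = [ref]  # the reference row is kept, apart from the gap symbol
    for row in rows[1:]:
        chars = []
        for r, c in zip(ref, row):
            if r.upper() == c.upper():
                # Match: lower case for soft inversion, '-' for hard inversion.
                chars.append(c.lower() if soft else "-")
            else:
                # Mismatching residues become upper case; a gap stays '_'.
                chars.append(c.upper())
        out.append("".join(chars))
    return out


alignment = ["CTAGC-TTCGACT", "--AGCTTTCGA--"]
print(invert_rows(alignment))             # ['CTAGC_TTCGACT', '__---T-----__']
print(invert_rows(alignment, soft=True))  # ['CTAGC_TTCGACT', '__agcTttcga__']
```

Converting every gap to _ before comparing keeps the rule simple: a shared gap is a match like any other column, so it appears as - under hard inversion and remains _ under soft inversion.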

Usage

... | invert_align [options]

Options

[-?         | --help]               #  Print full usage description.
[-s         | --soft]               #  Use soft inversion instead of hard inversion.
[-I <file!> | --stream_in=<file!>]  #  Read input from stream file  -  Default=STDIN
[-O <file>  | --stream_out=<file>]  #  Write output to stream file  -  Default=STDOUT
[-v         | --verbose]            #  Verbose output.

Examples

Consider the alignment in the file aln.fna in FASTA format:

>test1
CTAGC-TTCGACT
>test2
--AGC-TTCGA--
>test3
--AGCTTTCGA--
>test4
--AG--CTCGA--
>test5
--AG--TTCGAC-

Reading the alignment with read_fasta and displaying it with write_align -x results in:

read_fasta -i aln.fna | write_align -x

                          .   
test1            CTAGC-TTCGACT
test2            --AGC-TTCGA--
test3            --AGCTTTCGA--
test4            --AG--CTCGA--
test5            --AG--TTCGAC-
Consensus: 50%   --AG--TTCGA--

However, if we insert invert_align into the pipeline, the sequence differences stand out:

read_fasta -i aln.fna | invert_align | write_align -x

                          .   
test1            CTAGC_TTCGACT
test2            __---------__
test3            __---T-----__
test4            __--_-C----__
test5            __--_-------_
Consensus: 50%   -------------

If, instead of hard inversion, we use the -s switch of invert_align to obtain a soft-inverted alignment, where matching residues are shown in lower case instead of as -, we get:

read_fasta -i aln.fna | invert_align -s | write_align -x

                          .   
test1            CTAGC_TTCGACT
test2            __agc_ttcga__
test3            __agcTttcga__
test4            __ag__Ctcga__
test5            __ag__ttcgac_
Consensus: 50%   --AG--TTCGA--
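
Once an alignment has been hard inverted, the differences can also be collected programmatically. The sketch below is a hypothetical helper, not part of Biopieces: it takes one hard-inverted row as a plain string (as printed above, without the sequence name) and reports mismatches and gap positions, using 1-based column numbers as an assumption.

```python
def summarize_hard_row(row):
    """Return (mismatches, gaps) for one hard-inverted alignment row.

    Hypothetical helper: mismatches is a list of (column, residue) pairs,
    gaps is a list of columns shown as '_'. Columns are 1-based.
    """
    # Letters are mismatches against the reference sequence.
    mismatches = [(i, ch) for i, ch in enumerate(row, start=1) if ch.isalpha()]
    # '_' marks a gap or missing sequence that the reference does not share.
    gaps = [i for i, ch in enumerate(row, start=1) if ch == "_"]
    return mismatches, gaps


# Row for test4 from the hard-inverted output above.
print(summarize_hard_row("__--_-C----__"))  # ([(7, 'C')], [1, 2, 5, 12, 13])
```

This only makes sense for non-reference rows of hard-inverted output; in the reference row every residue is a letter, so it would be reported as a mismatch.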

See also

read_fasta

write_align

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

August 2007

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

invert_align is part of the Biopieces framework.

http://www.biopieces.org
