indel_seq
Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions
indel_seq introduces indels (insertions and deletions) into sequences in the stream, based on either an exact number of indels or a percentage of the sequence length. Insertions are introduced at random positions as duplications of the adjacent residues. Deletions are made at random positions.
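The behaviour described above can be illustrated with a minimal Python sketch. This is not the Biopieces implementation: the function name `indel_seq_sketch`, its parameters, the choice to duplicate a single residue per insertion, and the order (insertions first, then deletions) are all assumptions made for illustration.

```python
import random


def indel_seq_sketch(seq, insertions=0, deletions=0, rng=None):
    """Hypothetical sketch: duplicate residues at random positions,
    then delete residues at random positions."""
    rng = rng or random.Random()
    residues = list(seq)

    for _ in range(insertions):
        if not residues:
            break  # nothing to duplicate in an empty sequence
        pos = rng.randrange(len(residues))
        # Assumption: an insertion duplicates the single residue at pos.
        residues.insert(pos, residues[pos])

    # Never try to delete more residues than are available.
    for _ in range(min(deletions, len(residues))):
        pos = rng.randrange(len(residues))
        del residues[pos]

    return "".join(residues)


if __name__ == "__main__":
    # Output varies from run to run because positions are random.
    print(indel_seq_sketch("ATGTGCACATTCGACTAGCA", insertions=2))
```

Passing a seeded `random.Random` instance as `rng` makes the sketch reproducible, which is convenient when comparing results by hand.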
... | indel_seq [options]
[-? | --help] # Print full usage description.
[-i <uint> | --insertions=<uint>] # Number of insertions - Default=0
[-P <float> | --insertions_percent=<float>] # Percent residues to insert - Default=0.0
[-d <uint> | --deletions=<uint>] # Number of deletions - Default=0
[-D <float> | --deletions_percent=<float>] # Percent residues to delete - Default=0.0
[-I <file!> | --stream_in=<file!>] # Read input from stream file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
Consider the following entry in the FASTA file test.fna:
>test
ATGTGCACATTCGACTAGCA
Read in the sequence with read_fasta and introduce two insertions using the -i switch:
read_fasta -i test.fna | indel_seq -i 2
SEQ: ATGTGCACCATTCGACTAGGCA
SEQ_NAME: test
SEQ_LEN: 22
---
Or, in the same go, delete 30% of the residues using the -D switch:
read_fasta -i test.fna | indel_seq -i 2 -D 30
SEQ: TGTGACATTGACAGCC
SEQ_NAME: test
SEQ_LEN: 16
---
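The sequence lengths reported in the two examples above can be checked with a little arithmetic. The sketch below assumes that the percentage is taken of the original sequence length and truncated to a whole number of residues. That rounding rule is an assumption, not something the page states, and `expected_length` is a hypothetical helper name.

```python
def expected_length(seq_len, insertions=0, deletions_percent=0.0):
    """Hypothetical helper: predict SEQ_LEN after indel_seq.

    Assumption: the deletion count is the truncated percentage
    of the original sequence length.
    """
    deletions = int(seq_len * deletions_percent / 100)
    return seq_len + insertions - deletions


# First example: 20 residues plus 2 insertions.
print(expected_length(20, insertions=2))                        # 22
# Second example: 20 residues plus 2 insertions minus 30% (6 residues).
print(expected_length(20, insertions=2, deletions_percent=30))  # 16
```

Both values agree with the SEQ_LEN fields shown in the output records.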
And you can always re-read the original sequence and compare like this:
read_fasta -i test.fna | indel_seq -i 2 -D 30 | read_fasta -i test.fna | align_seq | write_align -x
.
test  -TGTG-ACATTC-ACTA-CA
       |||| |||||| |||| ||
test  ATGTGCACATTCGACTAGCA
. .
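To read the alignment above programmatically, the gap columns can be listed with a few lines of Python. This is only a sketch for inspecting the two aligned rows as plain strings. `gap_columns` is a hypothetical helper and not part of Biopieces.

```python
def gap_columns(aligned, reference):
    """Return the 1-based columns where `aligned` has a gap ('-')
    opposite a residue in `reference`."""
    return [
        i + 1
        for i, (a, r) in enumerate(zip(aligned, reference))
        if a == "-" and r != "-"
    ]


mutated = "-TGTG-ACATTC-ACTA-CA"
original = "ATGTGCACATTCGACTAGCA"

cols = gap_columns(mutated, original)
print(cols)                                 # [1, 6, 13, 18]
print([original[c - 1] for c in cols])      # ['A', 'C', 'G', 'G']
```

The four gap columns account for the net loss of four residues, from 20 in the original to 16 in the mutated sequence.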
Martin Asser Hansen - Copyright (C) - All rights reserved.
June 2009
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
indel_seq is part of the Biopieces framework.