split_vals
Martin Asser Hansen edited this page Oct 2, 2015 · 5 revisions
split_vals splits the value of a given key into multiple values that are added to the
record. By default, the keys for the new values are formed from the given key plus an
index suffix (_0, _1, ...). The -K switch lets you specify a list of keys to use instead.
... | split_vals [options]
[-? | --help] # Print full usage description.
[-k <string> | --key=<string>] # Key with value to split.
[-K <list> | --keys=<list>] # List of keys to use with split values.
[-d <string> | --delimit=<string>] # Delimiter to split values by - Default='_'
[-I <file!> | --stream_in=<file!>] # Read input from stream file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
Consider the following record:
SEQ: ACUGCUACGUACGUACGUACG
SEQ_LEN: 21
SEQ_NAME: test_chr4_24477_24515_+
---
To split the value of SEQ_NAME into multiple key/value pairs, use split_vals like this:
... | split_vals -k SEQ_NAME
SEQ: ACUGCUACGUACGUACGUACG
SEQ_LEN: 21
SEQ_NAME: test_chr4_24477_24515_+
SEQ_NAME_0: test
SEQ_NAME_1: chr4
SEQ_NAME_2: 24477
SEQ_NAME_3: 24515
SEQ_NAME_4: +
---
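The default behaviour shown above can be sketched in Python. This is only an illustration and not the Biopieces implementation: the function name `split_vals_indexed` and the dict-based record are hypothetical, and the sketch assumes the delimiter is treated as a literal string.

```python
def split_vals_indexed(record, key, delimiter="_"):
    """Return a copy of record with the value of key split into indexed keys.

    Hypothetical helper mimicking `split_vals -k KEY`; new keys are named
    KEY_0, KEY_1, ... and the original key is left in place.
    """
    result = dict(record)  # work on a copy so the input record is untouched
    if key not in result:
        return result
    # Assumption: the delimiter is a literal string, not a pattern.
    for index, value in enumerate(str(result[key]).split(delimiter)):
        result[f"{key}_{index}"] = value
    return result


record = {
    "SEQ": "ACUGCUACGUACGUACGUACG",
    "SEQ_LEN": 21,
    "SEQ_NAME": "test_chr4_24477_24515_+",
}

split_record = split_vals_indexed(record, "SEQ_NAME")

# Print in the same key/value layout the records above use.
for name, value in split_record.items():
    print(f"{name}: {value}")
print("---")
```

Passing a different `delimiter` argument corresponds to the -d switch in the usage listing.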
Now, to specify a list of key names, use the -K switch like this:
... | split_vals -k SEQ_NAME -K SEQ_NAME,CHR,CHR_BEG,CHR_END,STRAND
SEQ: ACUGCUACGUACGUACGUACG
SEQ_LEN: 21
SEQ_NAME: test
CHR: chr4
CHR_BEG: 24477
CHR_END: 24515
STRAND: +
---
Note that this overwrites the original SEQ_NAME value, because SEQ_NAME is the first key in the list. To keep the original value, use a different first key, such as SEQ_NAME2:
... | split_vals -k SEQ_NAME -K SEQ_NAME2,CHR,CHR_BEG,CHR_END,STRAND
SEQ: ACUGCUACGUACGUACGUACG
SEQ_LEN: 21
SEQ_NAME: test_chr4_24477_24515_+
SEQ_NAME2: test
CHR: chr4
CHR_BEG: 24477
CHR_END: 24515
STRAND: +
---
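The effect of the key list, including why reusing SEQ_NAME overwrites the original value, can be sketched in Python. This is a minimal illustration and not the Biopieces implementation: `split_vals_named` and the dict-based record are hypothetical. This page does not describe what happens when the number of keys and values differ, so the sketch simply pairs them in order and ignores any extras, which is an assumption.

```python
def split_vals_named(record, key, keys, delimiter="_"):
    """Return a copy of record with the value of key split onto named keys.

    Hypothetical helper mimicking `split_vals -k KEY -K k1,k2,...`.
    """
    result = dict(record)  # work on a copy so the input record is untouched
    if key not in result:
        return result
    values = str(result[key]).split(delimiter)
    # Assumption: keys and values are paired in order; extras are ignored.
    for name, value in zip(keys, values):
        result[name] = value  # overwrites an existing key of the same name
    return result


record = {
    "SEQ": "ACUGCUACGUACGUACGUACG",
    "SEQ_LEN": 21,
    "SEQ_NAME": "test_chr4_24477_24515_+",
}

# Reusing SEQ_NAME as the first key overwrites the original value.
replaced = split_vals_named(
    record, "SEQ_NAME", ["SEQ_NAME", "CHR", "CHR_BEG", "CHR_END", "STRAND"]
)

# Using SEQ_NAME2 as the first key keeps the original value intact.
kept = split_vals_named(
    record, "SEQ_NAME", ["SEQ_NAME2", "CHR", "CHR_BEG", "CHR_END", "STRAND"]
)

print(replaced["SEQ_NAME"])
print(kept["SEQ_NAME"], kept["SEQ_NAME2"])
```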
Martin Asser Hansen - Copyright (C) - All rights reserved.
May 2009
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
split_vals is part of the Biopieces framework.