Ryan Wick edited this page Sep 7, 2015 · 11 revisions

In Bandage, a path is a means of specifying a sequence which extends through multiple nodes.

Syntax

Path syntax

Note the following:

  • The node names must be exact and end with a '+' or '-' (see single vs double node style).
  • Node positions use a 1-based index. I.e. position 1 is the first base in a node's sequence and the position of a node's last base is equal to the length of its sequence.
  • A path is only valid if the necessary edges exist in the graph to connect the sequences in the specified order.

Examples:

  • 9+, 12-
    • The entirety of node 9+, followed by the entirety of node 12-
  • (51) 9+, 12-
    • From position 51 to the end of node 9+, followed by the entirety of node 12-
  • (51) 9+, 12- (87)
    • From position 51 to the end of node 9+, followed by the first 87 bases of node 12-
  • 9+, 12-, 8+, 12-, 3-
    • This path contains a loop and includes the sequence for node 12- twice.
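
The examples above can be mimicked with a short Python sketch. This is not Bandage's own parser: the names `parse_path` and `path_sequence`, the regular expression, and the toy sequences are all hypothetical, and the sketch ignores both edge validation and overlap removal. It only illustrates how the optional start and end positions map onto 1-based slicing.

```python
import re

# Hypothetical pattern: optional "(start)", comma-separated node names, optional "(end)".
PATH_RE = re.compile(r"^\s*(?:\((\d+)\))?\s*(.+?)\s*(?:\((\d+)\))?\s*$")


def parse_path(text):
    """Split a path string into (start_pos, node_names, end_pos)."""
    match = PATH_RE.match(text)
    if match is None:
        raise ValueError(f"cannot parse path: {text!r}")
    start, body, end = match.groups()
    nodes = [name.strip() for name in body.split(",")]
    for name in nodes:
        # Node names must carry an explicit strand sign.
        if not name or name[-1] not in "+-":
            raise ValueError(f"node name must end with '+' or '-': {name!r}")
    return (int(start) if start else None, nodes, int(end) if end else None)


def path_sequence(text, sequences):
    """Concatenate node sequences, honouring 1-based start/end positions."""
    start, nodes, end = parse_path(text)
    parts = [sequences[name] for name in nodes]
    if end is not None:
        parts[-1] = parts[-1][:end]        # keep the first `end` bases of the last node
    if start is not None:
        parts[0] = parts[0][start - 1:]    # position `start` (1-based) to the node's end
    return "".join(parts)


# Toy sequences, invented for illustration only.
toy = {"9+": "ACGTAC", "12-": "GGTTA"}
print(path_sequence("(3) 9+, 12- (2)", toy))
```

With these toy sequences, `(3) 9+, 12- (2)` takes bases 3 to 6 of node 9+ and the first 2 bases of node 12-.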

Exporting path sequences

Simple paths

In Bandage, you can easily output path sequences for selected nodes.

Unambiguous path selection

If the selected nodes form an unambiguous path, then you can copy the sequence to the clipboard or save it to a file using the options in Bandage's 'Output' menu.

Copy/save node path sequence

The resulting path sequence will contain the entirety of the constituent nodes.

Complex paths

If you wish to export the sequence for a more complex path (loops, start/end positions, etc.), the above approach will not work. Instead, you must select 'Specify exact path for copy/save' from the 'Output' menu.

Specify exact path

This will open a new window where you can define a path using the syntax described above. As a shortcut, you can double-click on a node in the visualisation to add it to the path. Bandage will show your specified path by shading it in the visualisation.

Complex path

Overlaps

In graphs made by some assemblers, nodes connected by an edge have overlapping sequences. Bandage will remove this overlap when creating a path sequence. Therefore, a path sequence may be shorter than the combined length of its constituent nodes' sequences. See assembler differences.
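
To make the length difference concrete, here is a minimal Python sketch of joining node sequences that share a fixed-length overlap. It assumes every edge has the same overlap size, which is a simplification. The function name `join_with_overlap` and the toy sequences are hypothetical, and this is not a description of how Bandage itself implements the step.

```python
def join_with_overlap(node_sequences, overlap):
    """Join sequences in path order, keeping each shared overlap only once."""
    if not node_sequences:
        return ""
    joined = node_sequences[0]
    for seq in node_sequences[1:]:
        # Sanity check: the end of the path so far must match the start of the next node.
        if overlap and joined[-overlap:] != seq[:overlap]:
            raise ValueError("nodes do not share the expected overlap")
        joined += seq[overlap:]   # skip the bases already present in the path
    return joined


# Toy sequences with a 2-base overlap, invented for illustration only.
nodes = ["ACGTT", "TTGCA", "CAGG"]
path = join_with_overlap(nodes, 2)
print(path, len(path), sum(len(s) for s in nodes))
```

Here the three nodes total 14 bases, but the joined path is 10 bases long because each of the two overlaps is counted once.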

Circular paths

In the 'Specify exact path' window, there is a tick box for 'Circular path'. A circular path forms a loop where the sequence at the end directly leads into the sequence at the beginning. This is useful for extracting circular sequences from an assembly graph, such as bacterial chromosomes or plasmids.
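
As a rough illustration, path validity can be checked in Python against a set of directed edges. The sketch assumes that a circular path additionally needs an edge from its last node back to its first. The function `path_is_valid`, the edge representation, and the toy graph are all hypothetical and do not reflect Bandage's internals.

```python
def path_is_valid(nodes, edges, circular=False):
    """Return True if every consecutive pair of nodes is joined by an edge.

    `edges` is a set of (from_node, to_node) tuples. For a circular path the
    step from the last node back to the first is checked as well (assumption).
    """
    steps = list(zip(nodes, nodes[1:]))
    if circular and nodes:
        steps.append((nodes[-1], nodes[0]))
    return all(step in edges for step in steps)


# Toy graph, invented for illustration only.
edges = {("9+", "12-"), ("12-", "8+"), ("8+", "12-"), ("12-", "3-")}

print(path_is_valid(["9+", "12-", "8+", "12-", "3-"], edges))                  # linear path
print(path_is_valid(["9+", "12-", "8+", "12-", "3-"], edges, circular=True))   # no edge 3- to 9+
print(path_is_valid(["12-", "8+"], edges, circular=True))                      # closed loop
```

In this toy graph the looped path from the syntax examples is valid as a linear path but not as a circular one, whereas the two-node loop 12-, 8+ closes on itself.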