This repository was archived by the owner on Mar 20, 2020. It is now read-only.
Backend Graph Specification #4
Closed
((Updated by Josiah 2019-07-10))
((Updated by Josiah 2019-08-15: Removed PathIndex and Slices))
Version 1 - Basics
- GraphGenome is the top-level object, which holds references to:
  - The set of all Nodes in the graph, by uuid
  - The set of all Paths in the graph, by accession name
    - One path per accession (contiguous). Each chromosome = ?
  - A method for topological sort, to present graph nodes in a consistent order every time
- Paths contain:
  - One unique accession (a.k.a. specimen) name
  - List of NodeTraversals in order (reverse lookup)
    - Can visit the same node multiple times
    - Each visit is tracked by its own NodeTraversal, and annotations (epigenetic state, etc.) are tied to a NodeTraversal, not to the Node itself.
    - Ex. 5+ 9+ 10+ 50+ 23- 78+
- Node contains:
  - Sequence (optional)
  - uuid to distinguish it from other nodes with the same sequence
  - Set of NodeTraversals intersecting the node (reverse lookup)
    - You can ask each NodeTraversal for its next "downstream" or "upstream" node, so you can build the set of neighboring nodes, which includes the node itself for duplications.
    - Ex. { (Path5, Order 67, +), (Path10, Order 160, -), (Path5, Order 2756, +) }
- NodeTraversal contains:
  - Node: the node being traversed
  - Path: the specimen doing the traversing
  - Strand: either + or -; - represents reverse complements and inversions
  - Order: the position of this traversal within the path
    - e.g. Order 5 is the 5th node visited by the Path; for a traversal at Order N, Order N+1 is the downstream node
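
To make the relationships above concrete, here is a minimal Python sketch of the four objects and the reverse lookup from a Node to its neighbors. It is an illustration under assumptions, not the project's implementation: the method names (`add_path`, `append`, `at`, `neighbors`) are hypothetical, short labels such as `"5"` stand in for real uuids, Order is treated as 1-based, and "downstream" is taken to mean Order + 1 along the path regardless of strand. Topological sort is left out because the spec does not say how it should work.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(eq=False)  # identity-based equality: same sequence does not mean same node
class Node:
    uuid: str                       # short label standing in for a real uuid (assumption)
    sequence: Optional[str] = None  # sequence is optional per the spec
    traversals: Set["NodeTraversal"] = field(default_factory=set)  # reverse lookup

    def neighbors(self, downstream: bool = True) -> Set["Node"]:
        """Collect the nodes one step after (or before) every visit to this node."""
        step = 1 if downstream else -1
        found: Set[Node] = set()
        for t in self.traversals:
            adjacent = t.path.at(t.order + step)
            if adjacent is not None:
                found.add(adjacent.node)  # may include self for duplications
        return found


@dataclass(eq=False)
class NodeTraversal:
    node: Node
    path: "Path"
    strand: str  # "+" or "-"
    order: int   # 1-based position within the path (assumption)


@dataclass(eq=False)
class Path:
    accession: str
    traversals: List[NodeTraversal] = field(default_factory=list)

    def append(self, node: Node, strand: str) -> NodeTraversal:
        if strand not in ("+", "-"):
            raise ValueError(f"strand must be '+' or '-', got {strand!r}")
        t = NodeTraversal(node, self, strand, order=len(self.traversals) + 1)
        self.traversals.append(t)
        node.traversals.add(t)  # keep the node's reverse lookup in sync
        return t

    def at(self, order: int) -> Optional[NodeTraversal]:
        """Return the traversal at a 1-based order, or None if out of range."""
        if 1 <= order <= len(self.traversals):
            return self.traversals[order - 1]
        return None


@dataclass
class GraphGenome:
    nodes: Dict[str, Node] = field(default_factory=dict)  # by uuid
    paths: Dict[str, Path] = field(default_factory=dict)  # by accession name

    def add_path(self, accession: str, spec: str) -> Path:
        """Build a path from text such as '5+ 9+ 10+', creating nodes as needed."""
        path = Path(accession)
        for token in spec.split():
            label, strand = token[:-1], token[-1]
            node = self.nodes.setdefault(label, Node(label))
            path.append(node, strand)
        self.paths[accession] = path
        return path


graph = GraphGenome()
graph.add_path("specimen_a", "5+ 9+ 10+ 50+ 23- 78+")
graph.add_path("specimen_b", "5+ 10+ 10+ 23-")  # visits node 10 twice

node10 = graph.nodes["10"]
print(sorted(n.uuid for n in node10.neighbors(downstream=True)))  # ['10', '23', '50']
```

Because `specimen_b` visits node 10 twice in a row, node 10 appears in its own downstream set, which is the duplication case described above. Annotations would hang off `NodeTraversal` in this layout, so two visits to the same node can carry different epigenetic state.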