Genotype Specification
This document describes the logic used by gnomic to compute full inherited genotypes from differential genotypes.
- Mutations are applied sequentially from left to right.
- A genotype with a parent genotype is functionally equivalent to a genotype without a parent genotype but with all the changes from the other genotype applied first.
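
The two rules above can be sketched as a flattening step: walk the parent chain, then concatenate the change lists from the oldest ancestor to the genotype itself. This is a minimal illustration only. The `DifferentialGenotype` class and the `flatten_changes` function are hypothetical names made up for this sketch and are not part of the gnomic API; changes are kept as plain strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DifferentialGenotype:
    """Hypothetical container: an ordered list of changes plus an optional parent."""
    changes: List[str]
    parent: Optional["DifferentialGenotype"] = None


def flatten_changes(genotype: DifferentialGenotype) -> List[str]:
    """Return all changes in application order, ancestors first."""
    chain = []
    node = genotype
    while node is not None:
        chain.append(node)
        node = node.parent
    ordered: List[str] = []
    # The chain was collected child-first, so reverse it to apply parents first.
    for node in reversed(chain):
        ordered.extend(node.changes)
    return ordered


parent = DifferentialGenotype(["+A", "+B"])
child = DifferentialGenotype(["-A", "B>C"], parent=parent)
flat = flatten_changes(child)

# A parentless genotype carrying the flattened list is equivalent by construction.
standalone = DifferentialGenotype(flat)
```

Resolving the flattened list into a final genotype is a separate step governed by the tables in this document.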
Two inverse insertion and deletion mutations result in a return to the original state. If one or both of the mutations are applied with the multiple flag, the change resolution becomes more complex.
Change 1 | Change 2 | Effect |
---|---|---|
+A | +A | |
-A | -A | |
+A | -A | |
-A | +A | |
B>A | B>A | |
+B | B>A | A |
-B | B>A | 'B>A' or '' [ambiguous; potentially invalid; needs review] |
+B | B>>A | B>>A |
-B | B>>A | 'B>>A' or '' [ambiguous; potentially invalid; needs review] |
-A | ++A | ++A [provisional] |
+A | --A | --A [provisional] |
--A | +A | +A |
++A | -A | [ambiguous; invalid; needs review] |
--A | ++A | ++A [provisional] |
++A | --A | [provisional] |
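
As an illustration of the simplest rows only (`+A` followed by `-A`, and `-A` followed by `+A`), the sketch below cancels inverse pairs of single insertions and deletions. It deliberately refuses replacements and changes carrying the multiple flag, since those rows are marked provisional or ambiguous. Raising `ValueError` on a repeated change is an assumption made for this sketch, not settled behaviour. The function names are hypothetical.

```python
from typing import List


def inverse(change: str) -> str:
    """Turn '+A' into '-A' and vice versa."""
    sign, feature = change[0], change[1:]
    return ("-" if sign == "+" else "+") + feature


def resolve_simple(changes: List[str]) -> List[str]:
    """Apply single insertions/deletions left to right, cancelling inverse pairs."""
    net: List[str] = []
    for change in changes:
        if change[:2] in ("++", "--") or ">" in change or change[0] not in "+-":
            # Multiple-flag and replacement changes are out of scope for this sketch.
            raise NotImplementedError(f"unsupported change: {change}")
        if inverse(change) in net:
            net.remove(inverse(change))  # return to the original state
        elif change in net:
            # Assumption: treat a repeated change as invalid.
            raise ValueError(f"repeated change: {change}")
        else:
            net.append(change)
    return net
```

For example, `resolve_simple(["+A", "-A"])` returns an empty list, while `resolve_simple(["+A", "-B"])` returns both changes untouched.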
Fusions and feature sets are compound elements of the genotype. Mutations are able to change these features partially.
Fusions can contain features, feature sets and implicit sub-fusions (e.g. A:B is a sub-fusion of A:B:C, but A:C is not). These sub-elements can be cut out from a fusion or replaced with another element.
Change 1 | Change 2 | Effect |
---|---|---|
+A:B:C | -A:B | +C |
-A:B:C | +A:B | -C |
+A:B:C | -B | +A:C |
-A:B:C | +B | -A:C |
+A:{B C} | -{B C} | +A |
-A:{B C} | +{B C} | -A |
+A:B:C | B>D | +A:D:C |
+A:B:C | B>D:E | +A:D:E:C |
+A:B:C | B:C>D:E | +A:D:E |
+A:B:C | B:C>{D E} | +A:{D E} |
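
The fusion rows above amount to finding a contiguous run of elements and splicing something else (possibly nothing) in its place. The sketch below models a fusion as a Python list of element names and treats a feature set such as `{D E}` as a single opaque string. The names `find_run` and `replace_in_fusion` are hypothetical, and leaving the fusion unchanged when the target is not a sub-fusion is an assumption of this sketch.

```python
from typing import List, Sequence


def find_run(fusion: Sequence[str], target: Sequence[str]) -> int:
    """Return the start index of the contiguous run `target` in `fusion`, or -1."""
    n = len(target)
    for i in range(len(fusion) - n + 1):
        if list(fusion[i:i + n]) == list(target):
            return i
    return -1


def replace_in_fusion(
    fusion: Sequence[str],
    target: Sequence[str],
    replacement: Sequence[str] = (),
) -> List[str]:
    """Cut `target` out of `fusion` and splice `replacement` in its place.

    An empty replacement models a deletion. If `target` is empty or is not
    a sub-fusion, the fusion is returned unchanged (sketch assumption).
    """
    if not target:
        return list(fusion)
    start = find_run(fusion, target)
    if start < 0:
        return list(fusion)
    end = start + len(target)
    return list(fusion[:start]) + list(replacement) + list(fusion[end:])
```

With this model, removing `A:B` from `A:B:C` is `replace_in_fusion(["A", "B", "C"], ["A", "B"])`, and `B>D:E` is `replace_in_fusion(["A", "B", "C"], ["B"], ["D", "E"])`. Because the match must be contiguous, `A:C` is not found in `A:B:C`.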
Feature sets can contain features, fusions or implicit feature subsets (e.g. {A B} and {B C} are feature subsets of {A B C}, but {A D} and {A C} are not). Currently, elements in a feature set are assumed to be ordered ([provisional]). Unlike one-element fusions, one-element feature sets are not converted into single features.
Change 1 | Change 2 | Effect |
---|---|---|
+{A B C} | -B | +{A C} |
-{A B C} | +B | -{A C} |
+{A B} | -B | +{A} |
-{A B} | +B | -{A} |
+{A B C} | -{A B} | +{C} [provisional] |
-{A B C} | +{A B} | -{C} [provisional] |
+{A B C} | -{B} | +{A C} [provisional] |
-{A B C} | +{B} | -{A C} [provisional] |
+{A B:C} | -B:C | +{A} |
-{A B:C} | +B:C | -{A} |
+{A B} | B>C | +{A C} |
+{A B} | B>C:D | +{A C:D} |
+{A B:C} | B:C>D | +{A D} |
+{A B} | B>{C D} | +{A C D} [provisional] |
+{A B C} | {B C}>{D E} | +{A D E} [provisional] |
- defining feature sets as an unordered collection of elements. Under such a definition, the following operations would make sense: +{A B C} -{A B} is +{C}; +{A B C} -{C A} is +{B}.
- distinguishing between -A and -{A} operations
- feature set in a feature set issue: +{A B} B>C:{D E} = ?. How to resolve? {A C:{D E}} is not allowed by grammar currently.
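
To make the difference between the current ordered reading and the unordered alternative concrete, the sketch below implements both kinds of subset removal on plain lists of feature names. Both function names are hypothetical, and returning `None` when the subset does not match is an assumption of this sketch.

```python
from typing import List, Optional


def remove_ordered(feature_set: List[str], subset: List[str]) -> Optional[List[str]]:
    """Ordered reading: `subset` must appear as a contiguous run."""
    n = len(subset)
    for i in range(len(feature_set) - n + 1):
        if feature_set[i:i + n] == subset:
            return feature_set[:i] + feature_set[i + n:]
    return None  # not a feature subset under the ordered reading


def remove_unordered(feature_set: List[str], subset: List[str]) -> Optional[List[str]]:
    """Unordered reading: only membership matters."""
    wanted = set(subset)
    if not wanted <= set(feature_set):
        return None  # some element of `subset` is missing
    return [feature for feature in feature_set if feature not in wanted]
```

Under the ordered reading, removing `{C A}` from `{A B C}` does not match, whereas the unordered reading leaves `{B}`. Removing `{A B}` gives `{C}` under both.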
If a feature set contains a fusion or vice versa, the changes are also applied to that inner compound element.
Change 1 | Change 2 | Effect |
---|---|---|
+{A B:C} | -B | +{A C} |
+{A B:C} | B>D:E | +{A D:E:C} |
+A:{B C} | C>D | +A:{B D} |
+A:{B C} | C>D:E | +A:{B D:E} |
+A:{B}:C | -B | +A:{}:C [provisional] |
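
One way to picture changes reaching into nested compound elements is a recursive substitution over a small tree. The sketch below is a hedged illustration, not gnomic's implementation: `Fusion`, `FeatureSet` and `substitute` are hypothetical names, only single-feature targets are handled, and a replacement of the same kind as its container is spliced in rather than nested. A fusion left with one element collapses to that element, while a feature set never collapses. Dropping a fusion that becomes empty is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

Element = Union[str, "Fusion", "FeatureSet"]


@dataclass(frozen=True)
class Fusion:
    parts: Tuple[Element, ...]


@dataclass(frozen=True)
class FeatureSet:
    parts: Tuple[Element, ...]


def substitute(element: Element, target: str,
               replacement: Optional[Element]) -> Optional[Element]:
    """Replace feature `target` everywhere inside `element`.

    `replacement=None` models a deletion.
    """
    if isinstance(element, str):
        return replacement if element == target else element
    kind = type(element)
    new_parts = []
    for part in element.parts:
        new = substitute(part, target, replacement)
        if new is None:
            continue  # the part was deleted
        if isinstance(new, kind):
            new_parts.extend(new.parts)  # splice same-kind compound in place
        else:
            new_parts.append(new)
    if kind is Fusion and len(new_parts) <= 1:
        # One-element fusions collapse; an empty fusion is dropped (assumption).
        return new_parts[0] if new_parts else None
    return kind(tuple(new_parts))
```

For instance, applying `-B` to `{A B:C}` is `substitute(FeatureSet(("A", Fusion(("B", "C")))), "B", None)`, which yields `FeatureSet(("A", "C"))`.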
Repeated changes such as `+A +A` or `-A -A` are ambiguous and may be invalid, raising an exception. [needs review]
Each integrative mutation may optionally have a locus. (Mutations that insert plasmids or set a phenotype are non-integrative and may not have a locus.)
A locus acts as a namespace controlling interactions between mutations.
Consider these examples:
- `A>B B>C`: The loci of the two mutations in this genotype are identical, as they are both `None`. Therefore, both mutations are allowed to interact with each other and the final genotype becomes `A>C`. [potentially ambiguous; needs review]
- `A>B B>>C`: The loci of the two mutations in this genotype are identical, as they are both `None`. Therefore, both mutations are allowed to interact with each other. As the second mutation is a multiple insertion mutation, it is also applied on its own. The final genotype becomes `A>C B>>C`. [potentially ambiguous; needs review]
- `A@locus-1>B B@locus-2>C`: Since the loci of the two mutations are different, they are not allowed to interact with each other and the final genotype is `A@locus-1>B B@locus-2>C`.
- `A@>B B@locus-2>C`: Since the loci of the two mutations are different (`None` and `'locus-2'`), they are not allowed to interact with each other and the final genotype is `A@>B B@locus-2>C`.
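
The namespace behaviour in these examples can be sketched as follows: a later replacement is chained onto an earlier one only when both share the same locus (including both being `None`) and the earlier result is the later target. A replacement with the multiple flag is additionally kept on its own. This is a rough illustration of the examples, several of which are still marked as needing review. The `Replacement` class and `resolve` function are hypothetical names, not gnomic's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replacement:
    """Hypothetical model of `before@locus>after`; `multiple` stands for `>>`."""
    before: str
    after: str
    locus: Optional[str] = None
    multiple: bool = False


def resolve(changes: List[Replacement]) -> List[Replacement]:
    """Chain replacements left to right, but only within the same locus."""
    result: List[Replacement] = []
    for change in changes:
        merged = False
        for i, earlier in enumerate(result):
            if earlier.locus == change.locus and earlier.after == change.before:
                # Same namespace and matching feature: A>B then B>C becomes A>C.
                result[i] = Replacement(earlier.before, change.after,
                                        earlier.locus, earlier.multiple)
                merged = True
                break
        if not merged or change.multiple:
            # Unmerged changes are kept; multiple-flag changes also apply on their own.
            result.append(change)
    return result
```

With this model, `resolve([Replacement("A", "B"), Replacement("B", "C")])` gives a single `A>C`, while giving the two mutations different loci leaves both unchanged.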
...
- integrative mutation: A mutation with a specified change on the genome. This includes non-integrated plasmids (e.g. `(p1)`), feature variants (e.g. `A+`), and markers (e.g. `+x::A+`).
- multiple insertion mutation and multiple deletion mutation: A mutation with the multiple flag (e.g. `++A` [provisional], `--A` [provisional], `A>>B`).