Removing samples not involved in recombination #2703
-
Probably an easy question, but has anyone thought of an algorithm for simplifying a tree sequence to remove only those samples not involved in recombination? Another way to think about this would be to remove samples but retain the same number of trees and breakpoints (that's not quite the same, but is a strict requirement of such an algorithm). I'm unsure whether this would require keeping (some) unary nodes in the tree sequence.
Replies: 1 comment 4 replies
-
You could do something like this:

    tree = ts.first()
    untouched_by_recomb = []
    for sample in ts.samples():
        u = sample
        affected_by_recomb = False
        # Walk up until the root; the root has no edge above it
        # (tree.edge(root) is -1), so it must not be passed to ts.edge().
        while tree.parent(u) != -1:
            e = ts.edge(tree.edge(u))
            # An edge that does not span the whole sequence was cut by a breakpoint.
            if e.left != 0 or e.right != ts.sequence_length:
                affected_by_recomb = True
                break
            u = tree.parent(u)
        if not affected_by_recomb:
            untouched_by_recomb.append(sample)

The idea is that a sample is not involved in recombination if it has a complete tree path all the way back to the root, i.e. every edge on that path spans the whole sequence. That seems like a reasonable definition, but perhaps it's too strong? I'm sure you could make this implementation more efficient, but this should be a good starting point.
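To make the path-to-root criterion concrete without needing a real tree sequence, here is a minimal sketch on a hand-made toy example. It is only an illustration under stated assumptions: edges are plain `(left, right, parent, child)` tuples standing in for the edge table, and `untouched_samples`, `EDGES`, `SAMPLES` and `SEQUENCE_LENGTH` are hypothetical names, not part of tskit. It is also slightly stricter than a walk through the first tree only, because it looks at every edge above a node across the whole sequence.

```python
def untouched_samples(edges, samples, sequence_length):
    """Return samples whose whole path to the top uses only full-span edges."""
    # Group the edges by child: (left, right, parent) for every edge above a node.
    edges_above = {}
    for left, right, parent, child in edges:
        edges_above.setdefault(child, []).append((left, right, parent))

    untouched = []
    for sample in samples:
        u = sample
        full_path = True
        # Climb until we reach a node with no edge above it anywhere.
        while u in edges_above:
            above = edges_above[u]
            left, right, parent = above[0]
            # More than one edge, or a partial edge, means a breakpoint
            # changed this node's parent somewhere along the sequence.
            if len(above) != 1 or left != 0 or right != sequence_length:
                full_path = False
                break
            u = parent
        if full_path:
            untouched.append(sample)
    return untouched


# Toy data (made up for illustration): samples 0-3, internal nodes 4-7,
# one breakpoint at position 5 on a sequence of length 10.
SEQUENCE_LENGTH = 10
SAMPLES = [0, 1, 2, 3]
EDGES = [
    (0, 10, 4, 0),
    (0, 10, 4, 1),
    (0, 10, 6, 4),
    (0, 10, 5, 2),
    (0, 5, 6, 5),   # node 5 changes parent at the breakpoint
    (5, 10, 7, 5),
    (0, 5, 6, 3),   # sample 3 changes parent at the breakpoint
    (5, 10, 7, 3),
    (5, 10, 6, 7),
]

untouched = untouched_samples(EDGES, SAMPLES, SEQUENCE_LENGTH)
print(untouched)
```

Sample 2 is reported as touched even though its own edge spans the full sequence, because its parent (node 5) changes parent at the breakpoint. With a real tree sequence, the same tuples could presumably be built from `ts.edges()`, and the samples you want to keep could then be passed to `ts.simplify(samples=...)`; whether that retains every tree and breakpoint is not something this sketch checks.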