Add "pack_untracked_polytomies" ability to draw_svg() #3012

Merged
merged 1 commit into from
Oct 8, 2024

@hyanwong hyanwong commented Oct 8, 2024

This allows us to visualise trees with large polytomies (such as those produced by tsinfer or sc2ts) by using the "tracked_samples" functionality of a constructed tskit.Tree. Specifically, we pack all lineages in a polytomy that contain entirely "untracked" nodes, which lets us focus on only the tracked nodes.

The drawing._postorder_tracked_minlex_traversal() function provides a way to create a postorder traversal that puts all the lineages without tracked nodes on the right-hand side of the polytomy, which is required for packing. At the moment I have kept this as a private function, and the new (undocumented) functionality is triggered using e.g.

import os

import tskit
import tszip

ts_dir = "../data"
filename = os.path.join(ts_dir, "biotite_nj_long_arg_v7_clustloc-mrm_2-rw_10-mgs_10-2020-10-29")
ts = tszip.decompress(filename + ".ts.tsz")

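def mutation_label(m):
    # Label format: <inherited state><site position><derived state>
    if m.parent == tskit.NULL:
        inherited = ts.site(m.site).ancestral_state
    else:
        inherited = ts.mutation(m.parent).derived_state
    return f"{inherited}{int(ts.site(m.site).position)}{m.derived_state}"

mut_labels = {m.id: mutation_label(m) for m in ts.mutations()}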

pango = "B.1.240"
tracked_nodes = [u for u in ts.samples() if ts.node(u).metadata.get("Viridian_pangolin") == pango] # could also use ti.pango_lineage_samples[pango]
tree = ts.first(tracked_samples=tracked_nodes)
order = list(tskit.drawing._postorder_tracked_minlex_traversal(tree, collapse_tracked=True))
print(
    f"{len(order)} nodes in subtree.",
    f"{len(tracked_nodes)} nodes in cyan are {pango}.",
    f"Showing first of {ts.simplify(order).num_trees} tree(s)."
)
tree.draw_svg(
    time_scale="rank",
    y_axis=True,
    order=order,
    canvas_size=(650, 600),
    size=(600, 600),
    node_labels={u: ts.node(u).metadata.get("Viridian_pangolin", "") for u in order if u not in tracked_nodes},
    mutation_labels=mut_labels,
    all_edge_mutations=True,
    symbol_size=4,
    pack_untracked_polytomies=True,
    style=(
        "".join(f".n{u} > .sym {{fill: cyan}}" for u in tracked_nodes + [39]) +
        ".lab.summary {font-size: 9px}" +
        ".polytomy {font-size: 10px}" +
        ".mut .lab {font-size: 10px}" +
        ".plotbox {transform: translateX(40px)}" +
        ".y-axis .lab {font-size: 12px}"
    ),
)
[Image: Screenshot 2024-10-08 at 13 52 57]
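
To illustrate the ordering idea only, here is a minimal pure-Python sketch on a toy tree. It is not the tskit implementation: the `children` dict, the `tracked` set and the function name `tracked_first_postorder` are all hypothetical, and the minlex tie-breaking and the `collapse_tracked` option of the real private function are ignored. It shows a postorder traversal in which, at every node, child lineages containing at least one tracked node are visited before lineages containing none, so the untracked lineages end up together on the right.

```python
from typing import Dict, List, Set


def tracked_first_postorder(
    children: Dict[int, List[int]], tracked: Set[int], root: int
) -> List[int]:
    """Postorder over a toy tree, visiting tracked lineages first.

    Hypothetical helper for illustration; not part of tskit.
    """
    counts: Dict[int, int] = {}

    def count(u: int) -> int:
        # Number of tracked nodes at or below u.
        counts[u] = int(u in tracked) + sum(count(v) for v in children.get(u, []))
        return counts[u]

    count(root)
    order: List[int] = []

    def visit(u: int) -> None:
        # Stable sort: lineages with no tracked nodes (key True) go last,
        # and the original child order is otherwise preserved.
        for v in sorted(children.get(u, []), key=lambda v: counts[v] == 0):
            visit(v)
        order.append(u)

    visit(root)
    return order


# Toy tree: node 6 is a polytomy with children 0, 1, 4, 5; node 4 has children 2, 3.
children = {6: [0, 1, 4, 5], 4: [2, 3]}
tracked = {1, 3}
order = tracked_first_postorder(children, tracked, root=6)
print(order)  # [1, 3, 2, 4, 0, 5, 6]
```

In the toy output, the two wholly untracked children of the polytomy (0 and 5) sit last among its children, which is the arrangement that packing relies on.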

@hyanwong hyanwong changed the title from Add "pack_polytomy" ability to draw_svg() to Add "pack_untracked_polytomies" ability to draw_svg() on Oct 8, 2024

hyanwong commented Oct 8, 2024

Fixes #3011. I'm not sure what the API should be yet. E.g. when you specify pack_untracked_polytomies=True, it only really makes sense to set the node order to that given by _postorder_tracked_minlex_traversal (or some version thereof). However, there are some parameters you might want to pass to this function, e.g. to start at a different root, or to specify collapse_tracked=True, so it's not quite as easy as

order="tracked_minlex"

(and in addition, it's not clear to me whether this makes sense as an ordering for a whole tree sequence, or just for a single tree).

codecov bot commented Oct 8, 2024

Codecov Report

Attention: Patch coverage is 97.05882% with 2 lines in your changes missing coverage. Please review.

Project coverage is 89.84%. Comparing base (586a81e) to head (51150ab).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
python/tskit/drawing.py 97.05% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3012      +/-   ##
==========================================
+ Coverage   89.82%   89.84%   +0.01%     
==========================================
  Files          29       29              
  Lines       32030    32093      +63     
  Branches     6207     6230      +23     
==========================================
+ Hits        28772    28833      +61     
  Misses       1859     1859              
- Partials     1399     1401       +2     
Flag Coverage Δ
c-tests 86.69% <ø> (ø)
lwt-tests 80.78% <ø> (ø)
python-c-tests 89.05% <ø> (ø)
python-tests 99.00% <97.05%> (-0.02%) ⬇️


Files with missing lines Coverage Δ
python/tskit/drawing.py 99.23% <97.05%> (-0.13%) ⬇️

@jeromekelleher
Member

Let's not worry too much about final API and get something in there for now that we can start playing with.


hyanwong commented Oct 8, 2024

> Let's not worry too much about final API and get something in there for now that we can start playing with.

Yep, 100% agree, which is why I've left it all undocumented.

@hyanwong hyanwong force-pushed the polytomy branch 2 times, most recently from ff4860e to 2c2c502, October 8, 2024 12:39

hyanwong commented Oct 8, 2024

After discussion with Jerome, it seemed useful to be able to collapse a subtree if (say) more than 99% of the samples within it were tracked samples.

It would be relatively easy to change the collapse_tracked parameter to expect a proportion (e.g. 0.99), specifying the collapsing threshold. By default, None would mean "never collapse" and 1 (or True) would mean "collapse only if all samples under a node are tracked samples".
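
As a rough sketch of that threshold rule (not the code that was merged), the decision for a single node could look like the following. The function name `should_collapse` and its arguments are hypothetical. Using `>=` rather than `>` for the comparison is an assumption, chosen so that a threshold of 1 means "all samples tracked". Treating `False` like `None` is also an assumption.

```python
from typing import Optional, Union


def should_collapse(
    num_tracked: int,
    num_samples: int,
    collapse_tracked: Optional[Union[bool, float]] = None,
) -> bool:
    """Decide whether the subtree under a node should be collapsed.

    Hypothetical helper for illustration; not part of tskit.
    """
    # None means "never collapse"; False is treated the same way (an assumption).
    if collapse_tracked is None or collapse_tracked is False:
        return False
    if num_samples == 0:
        return False
    threshold = float(collapse_tracked)  # True becomes 1.0
    # Assumption: collapse when the tracked proportion reaches the threshold.
    return num_tracked / num_samples >= threshold


print(should_collapse(100, 100, True))    # True: every sample is tracked
print(should_collapse(99, 100, True))     # False: one sample is untracked
print(should_collapse(995, 1000, 0.99))   # True: 99.5% of samples are tracked
print(should_collapse(995, 1000))         # False: default never collapses
```

With a real tskit.Tree built with tracked_samples, the two counts for a node u would presumably come from tree.num_tracked_samples(u) and tree.num_samples(u).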

@jeromekelleher
Member

SGTM


hyanwong commented Oct 8, 2024

Cool, done. Thanks.

@jeromekelleher jeromekelleher added the AUTOMERGE-REQUESTED Ask Mergify to merge this PR label Oct 8, 2024
@mergify mergify bot merged commit 7320290 into tskit-dev:main Oct 8, 2024
21 checks passed
@mergify mergify bot removed the AUTOMERGE-REQUESTED Ask Mergify to merge this PR label Oct 8, 2024