Hi,
I am assembling a collection of primate genomes and have run into what looks to be a phasing issue, perhaps similar to #245.
My assemblies use HiFi, ONT, and Hi-C data, and the first has run to completion. When I view unitigs.hpc.noseq.gfa, I see many regions that look like the figure below. I have loaded hicverkko.colors.tsv and coloured the graph as maternal/paternal. The coverage within bubbles looks good, but many bubbles are short (~3 Kbp). Given that we have 64x HiFi and long ONT reads (N50 of 81.73 Kbp at 50x coverage), does this graph look correct? Shouldn't the ONT data resolve the many small ~3 Kbp bubbles?
The graph is correct. The issue is not the size of the heterozygous bubbles but the homozygous nodes surrounding them. In this case, all the homozygous (2x coverage) nodes are very large: over 100 kb in HPC space, and mostly over 200 kb. No ONT read can tell you how to connect the pairs of short het nodes across that distance, so they remain unphased. I expect that if you looked at the Hi-C paths, the assembly would include these nodes with a more or less random selection of haplotype. Given how short and similar in length the bubbles are, I think that is the correct assembly strategy, as it'd potentially introduce small phasing errors while resolving the chromosomes.
The only large unphased pieces I see are utig4-1101 and utig4-5 but I suspect these are the X and Y, respectively. Again, in verkko v2.2.1 I expect these to have been assigned to the haplotypes in the final output.
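To check the reasoning above against a graph, a rough sketch like the following lists the nodes that are at least as long as a typical ONT read, which therefore cannot be bridged by such a read. It assumes the GFA S-lines carry a standard `LN:i:` length tag; the function names and the node names in the demo are hypothetical, and the 81,730 bp threshold is just the ONT N50 quoted in the question. Node lengths in the HPC graph are homopolymer-compressed, so the uncompressed distance is, if anything, longer.

```python
def segment_lengths(gfa_lines):
    """Return {segment_name: length} for GFA S-lines.

    Uses the LN:i: tag when present; otherwise falls back to the length
    of the sequence field, unless that field is the '*' placeholder.
    """
    lengths = {}
    for line in gfa_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != "S":
            continue  # skip links, paths, headers, and malformed lines
        for tag in fields[3:]:
            if tag.startswith("LN:i:"):
                lengths[fields[1]] = int(tag[5:])
                break
        else:
            # No LN tag found: use the sequence itself if it is present.
            if fields[2] != "*":
                lengths[fields[1]] = len(fields[2])
    return lengths


def unspannable(lengths, read_length):
    """Names of nodes at least as long as read_length, sorted.

    A read has to cover the whole node plus some sequence on both sides
    to link the bubbles flanking it, so this is a conservative lower bound.
    """
    return sorted(name for name, n in lengths.items() if n >= read_length)


# Hypothetical toy graph: one long homozygous node next to a short bubble.
demo = [
    "S\thom_flank\t*\tLN:i:210000",
    "S\thet_a\t*\tLN:i:3000",
    "S\thet_b\t*\tLN:i:3100",
    "L\thom_flank\t+\thet_a\t+\t0M",
]
lengths = segment_lengths(demo)
print(unspannable(lengths, 81_730))
```

On a real graph, the lines could be read with `open("unitigs.hpc.noseq.gfa")` and passed to `segment_lengths` directly.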
Thanks.
232.zip