Implement subgraph splitting for overflow resolution #388

cmyr · 2023-05-05T20:32:42Z

A port of the harfbuzz implementation; when overflows occur in subgraphs with multiple 32-bit roots (e.g. extension subtables sharing coverage tables or similar) we will iteratively select some subset of those roots and move them to another subgraph, duplicating and dependencies.

~~based on #386, which should go in first.~~

rsheeter · 2023-05-10T15:51:37Z

write-fonts/src/graph.rs

+
+        assert!(!overflows.is_empty());
+        self.debug_overflows(&overflows);
+        //FIXME: return some kind of error here


I suggest we might err on the side of issues over comments for this sort of thing, with possible exception for cases where we're going to fix it in O(days)

Oops this comment is out of date; we check the return and convert to an error in the dump_table fn.

rsheeter · 2023-05-10T15:54:35Z

write-fonts/src/graph.rs

-        n_overflows == 0
+
+        // now isolate spaces in a loop, until there are no more left:
+        let overflows = loop {


Can you write this as while !overflows.is_empty() instead of sticking a return in the middle? Hm. Or structure it as iteration?

The problem is that we aren't always going to continue until there are no overflows left; sometimes overflow is unavoidable.

The control flow here is:

check if there are overflows

if there are none, return

if there are some, try to make progress on resolving them

if we don't make progress, break the loop and print diagnostics

This should always terminate, since we exit the loop if either there are no overflows, or if there are overflows but there are no splittable subgraphs remaining.

The comment above reads as if we are going to continue until there are none so that might need a fix then. IMHO while primary-exit-condition is still nicer, even if there is another path toward failure.

What about using Result<(), SomeErrType> so you can use normal control flow, e.g. try_splitting_subgraphs(...)??

rsheeter · 2023-05-10T15:55:13Z

write-fonts/src/graph.rs

                }
            }
        }
        result
    }

+    fn debug_overflows(&self, overflows: &[Overflow]) {
+        let (parents, children): (HashSet<_>, HashSet<_>) =


if debug is not enabled early exit

also should be behind log feature per other conversation? - np if that's out of scope here though.

yea we don't have a log feature yet, will revisit that when I make that change (opened #397 to track this)

It looks like the early exit for log disabled - avoiding the iterations and zips to produce logs we never write - was not done?

We don't have a log feature, so there's nothing to use to gate this code. In any case this exists right now as a utility for me to use while I am implementing the splitting/promotion stuff.

rsheeter · 2023-05-10T15:57:21Z

write-fonts/src/graph.rs

+        for overflow in overflows {
+            let child_space = self.nodes[&overflow.child].space;
+            // we only isolate subgraphs in wide-space
+            if !child_space.is_custom() || self.num_roots_per_space[&child_space] <= 1 {
+                continue;
+            }


Optional, looks like this would fit very naturally as .iter().filter()?

It doesn't feel like a clear improvement, because i want to reuse the child_space binding outside of the filter. 🤷

for overflow in overflows.iter().filter(|overflow| { let child_space = self.nodes[&overflow.child].space; // we only isolate subgraphs in wide-space !(!child_space.is_custom() || self.num_roots_per_space[&child_space] <= 1) }) { let child_space = self.nodes[&overflow.child].space; let root = self.find_root_of_space(overflow.child); debug_assert_eq!(self.nodes[&root].space, child_space); to_isolate .entry(child_space) .or_insert_with(HashSet::new) .insert(root); }

rsheeter · 2023-05-10T15:59:35Z

write-fonts/src/graph.rs

+
+        if to_isolate.is_empty() {
+            return false;
+        }


At a glance it looks like you could write this as to_isolate = overflows.iter.filter.map.collect?

The reason I do this in a loop is because I want to use the 'entry' method on HashMap, which doesn't work with collect. Basically I'm finding multiple roots per space.

rsheeter · 2023-05-10T16:02:10Z

write-fonts/src/graph.rs

+        //  │ └─8─┘ │                     / \     / \
+        //  │       │    (cov tables)    7'  8'  7   8
+        //  └───7───┘
+        //


Diagram appreciated, without it the test would be inscrutable

Agreed, I found it impossible to come up with good test cases without drawing them out first.

In the case that a subgraph in 32-bit space contains an overflow, we will now attempt to select some set of roots in that subgraph and move them to a new space, duplicating any children as required.

dfrg · 2023-05-10T16:35:45Z

QQ: ObjectIds are allocated with a global usize counter. Should we change this to an explicit u64 (or allocate ids locally in the graph) to avoid potential overflows if someone does decide to run this on a 32-bit system?

cmyr · 2023-05-10T16:45:30Z

QQ: ObjectIds are allocated with a global usize counter. Should we change this to an explicit u64 (or allocate ids locally in the graph) to avoid potential overflows if someone does decide to run this on a 32-bit system?

Hmm... I would expect the process to only be running for a specific compilation, which even in the case of a very large multi-access font it's hard to imagine there being 4.2e9 unique objects? And in that case I would definitely expect us to be on a 64-bit system. That said, it wouldn't hurt...

cmyr force-pushed the hb-repacker branch from 794d8dc to 916a6f2 Compare May 8, 2023 15:20

Base automatically changed from hb-repacker to main May 8, 2023 15:49

cmyr force-pushed the isolate-subgraph branch 4 times, most recently from a122fdf to 357be56 Compare May 9, 2023 14:46

rsheeter reviewed May 10, 2023

View reviewed changes

rsheeter approved these changes May 10, 2023

View reviewed changes

Implement subgraph splitting

b472110

In the case that a subgraph in 32-bit space contains an overflow, we will now attempt to select some set of roots in that subgraph and move them to a new space, duplicating any children as required.

cmyr force-pushed the isolate-subgraph branch from 357be56 to b472110 Compare May 10, 2023 16:27

cmyr mentioned this pull request May 10, 2023

Use a u64 for ObjectId #398

Closed

cmyr merged commit 812ceb6 into main May 10, 2023

cmyr deleted the isolate-subgraph branch May 10, 2023 17:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement subgraph splitting for overflow resolution #388

Implement subgraph splitting for overflow resolution #388

cmyr commented May 5, 2023 •

edited

Loading

rsheeter May 10, 2023

cmyr May 10, 2023

rsheeter May 10, 2023

cmyr May 10, 2023

rsheeter May 12, 2023

rsheeter May 10, 2023

rsheeter May 10, 2023

cmyr May 10, 2023

rsheeter May 12, 2023

cmyr May 22, 2023

rsheeter May 10, 2023

cmyr May 10, 2023

rsheeter May 10, 2023

cmyr May 10, 2023

rsheeter May 10, 2023

cmyr May 10, 2023

dfrg commented May 10, 2023

cmyr commented May 10, 2023

Implement subgraph splitting for overflow resolution #388

Implement subgraph splitting for overflow resolution #388

Conversation

cmyr commented May 5, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dfrg commented May 10, 2023

cmyr commented May 10, 2023

cmyr commented May 5, 2023 •

edited

Loading