Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repeated builds are not deterministic #647

Closed
anthrotype opened this issue Dec 13, 2023 · 22 comments · Fixed by googlefonts/fontations#756
Closed

repeated builds are not deterministic #647

anthrotype opened this issue Dec 13, 2023 · 22 comments · Fixed by googlefonts/fontations#756
Assignees
Labels
bug Something isn't working

Comments

@anthrotype
Copy link
Member

anthrotype commented Dec 13, 2023

ok the head.modified changes every time (and checksums would also change accoridngly), but I would not expect other things to differ when I compare ttx dumps of font files that were freshly built from the same source file using the same copy of fontc.

It turns out there are some differences in things like GDEF and GPOS (as noted in #644 (comment), this one possible regression introduced with #642), as well as in GSUB (in the FeatureParamsStylisticSet nameIDs).

Also when doing ttx -l to show the table directory, the gvar table seems to have different checksums each time round, even though the length is the same.. But diffing its ttx repesentations gives no differences, so it might be the bytes themselves are serialized differently each time, for some reasons.

@anthrotype
Copy link
Member Author

anthrotype commented Dec 13, 2023

forgot to mention - I was able to notice these diffs on repeated builds of GoogleSans.designspace, I didn't try other fonts. I could not detect them, for example, when building Oswald.glyphs

@cmyr
Copy link
Member

cmyr commented Dec 13, 2023

In the past I have seen things like this arise when using collections that provide a non-deterministic iteration order (e.g. hashmaps) so that would be one thing I would look into first. I have done some work in the past in this regard (for instance #561) and it's very possible there are other places where we've missed this.

@anthrotype
Copy link
Member Author

I think here it's also to do with the fact that we split some tasks in parallel threads that may complete at different times and then we gather their result in some other collection where the order does matter

@cmyr
Copy link
Member

cmyr commented Dec 13, 2023

yes that is totally possible as well.

@rsheeter
Copy link
Contributor

Good catch. Can we reproduce in a test with our test files? - that should make it easy to fix and keep fixed.

@anthrotype
Copy link
Member Author

I don't think our test files contain enough kerning pairs to trigger this..

@rsheeter
Copy link
Contributor

Let's fix this after we land #644. We can add a test file with lots of kerns or a knob to help it trigger with fewer kerns if necessary.

@rsheeter rsheeter self-assigned this Dec 13, 2023
@rsheeter
Copy link
Contributor

In IM @anthrotype noted "the GPOS vary in size between 106684 bytes and 4384362, an order of magnitude different.."; that's mildly alarming

@rsheeter
Copy link
Contributor

rsheeter commented Dec 13, 2023

Reproduction:

$ rm *.ttxl && for i in {0..9}; do rm -rf build/ && cargo run ../googlesans/source/GoogleSans/GoogleSans.designspace && ttx -l build/font.ttf > run$i.ttxl; done
$ grep -H GPOS run?.ttxl
run0.ttxl:    GPOS  0xE96ECCD8   4384230     18876
run1.ttxl:    GPOS  0x45C3186D   4384236     18880
run2.ttxl:    GPOS  0xF17589DD   4384230     18876
run3.ttxl:    GPOS  0xB16C1453    106684     15152
run4.ttxl:    GPOS  0xA721CEE8   4384236     18876
run5.ttxl:    GPOS  0x1E2BB2BF   4384230     18880
run6.ttxl:    GPOS  0xB8849B8C   4384236     18876
run7.ttxl:    GPOS  0xB16C1453    106684     15152
run8.ttxl:    GPOS  0x2B12F05E   4384194     18876
run9.ttxl:    GPOS  0x0E7DA5D9   4384188     18876

@rsheeter
Copy link
Contributor

rsheeter commented Dec 14, 2023

#647 (comment) repro is fixed, new repro:

f=../googlesans/source/GoogleSans/GoogleSans.designspace; \
export SOURCE_DATE_EPOCH=$(date +%s); \
cargo build --release && rm -rf build/ \
&& target/release/fontc $f && cp build/font.ttf build/first.ttf \
&& target/release/fontc $f && cp build/font.ttf build/second.ttf \
&& diff -u <(fonttools ttx -o - -l build/first.ttf | tail -n +2) <(fonttools ttx -o - -l build/second.ttf | tail -n +2)

# GSUB and gvar report same length and offset but different contents

$ diff -u <(fonttools ttx -o - -t GSUB build/first.ttf | tail -n +2) <(fonttools ttx -o - -t GSUB build/font.ttf | tail -n +2)
# oh no, we're assigning inconsistent name ids, e.g.
-            <UINameID value="263"/>  <!-- Text Medium -->
+            <UINameID value="260"/>  <!-- Medium -->

@rsheeter
Copy link
Contributor

The UINameID diff is fixed, table is now the same length, same offset, but binary different run to run. Steps to reproduce:

f=../googlesans/source/GoogleSans/GoogleSans.designspace; \
export SOURCE_DATE_EPOCH=$(date +%s); \
cargo build --release && rm -rf build/ \
&& target/release/fontc $f && cp build/font.ttf build/first.ttf \
&& target/release/fontc $f && cp build/font.ttf build/second.ttf \
&& diff -u <(fonttools ttx -o - -l build/first.ttf | tail -n +2) <(fonttools ttx -o - -l build/second.ttf | tail -n +2)

To dig further I have a fonttools hack. To view the code see fonttools/fonttools@main...rsheeter:fonttools:main. To use it:

# Example assumes your currend directory is the root of fontc

# onetime setup
$ (cd .. && git clone [email protected]:rsheeter/fonttools.git fonttools_rs)
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -e ../fonttools_rs/
$ pip install absl-py

# Run the steps to reproduce (start of this comment). Should report GSUB mismatch, checksums differ, same length and offset
$ python ../fonttools_rs/Snippets/compare_table.py build/first.ttf build/second.ttf --noprint_eq
# should print a bunch of things with different offsets, abbreviated example follows
font["GSUB"].table.LookupList.Lookup[1383].SubTable[0].ExtSubTable.SubstLookupRecord[0].offset Value 86398 != 114938
font["GSUB"].table.LookupList.Lookup[1391].SubTable[1].ExtSubTable.SubstLookupRecord[0].offset Value 86838 != 98408
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[0].ChainSubRule[2].SubstLookupRecord[0].offset Value 94840 != 113000
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[0].ChainSubRule[3].SubstLookupRecord[0].offset Value 94854 != 113014
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[1].ChainSubRule[0].SubstLookupRecord[0].offset Value 94784 != 112944
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[1].ChainSubRule[1].SubstLookupRecord[0].offset Value 94798 != 112958
font["GSUB"].table.LookupList.Lookup[1408].SubTable[0].ExtSubTable.ChainSubClassSet[1].ChainSubClassRule[22].SubstLookupRecord[0].offset Value 79246 != 87712

@rsheeter rsheeter assigned cmyr and unassigned rsheeter Dec 15, 2023
@rsheeter
Copy link
Contributor

Passing to @cmyr to dig into packing

@anthrotype
Copy link
Member Author

googlefonts/fontations#738 fixed indeterminism in GSUB table, however the gvar table is still changing each time fontc is run on GS (same length, different bytes/checksum) see #656

@anthrotype
Copy link
Member Author

to clarify my previous comment, only the gvar table bytes are different, its high-level content as dumped with ttx is the same at each re-run.

anthrotype added a commit to googlefonts/fontations that referenced this issue Dec 20, 2023
Meant to reproduce the issue discussed at googlefonts/fontc#647

If we are lucky enough the CI should fail, otherwise flip the coin again and you'll see...
anthrotype added a commit that referenced this issue Jan 12, 2024
@anthrotype anthrotype reopened this Jan 12, 2024
@anthrotype
Copy link
Member Author

looks like GDEF and GPOS still are not deterministic... :(

https://github.com/googlefonts/fontc/actions/runs/7501084544/job/20421031213?pr=674#step:12:32

they appeared to have changed length (and thus checksum) between the first and second build run

@anthrotype
Copy link
Member Author

oh dear.. I can't reproduce locally. No idea what is going on :(

@anthrotype
Copy link
Member Author

baging my head, still can't repro, but I can say that main also fails with the same issue, not just after #674 so the clippy fixes are unrelated to the GS repeatable build fail.

I just re-run the CI jobs that had previously passed on main after #673 got merged and now they failed:
https://github.com/googlefonts/fontc/actions/runs/7501018748/job/20428708156

@anthrotype
Copy link
Member Author

This is the shell script I'm using to attempt to reproduce locally what the CI is doing, I tried on both my glinux workstation and on my macbook pro, both are using the latest rust toolchain 1.75.0, I did cargo update, even cargo clean, I still can't reproduce, this script always ends without error and the files are identical.
WTH is going on?!

#!/bin/bash

# clean up
rm -rf first-roman.ttf second-roman.ttf first-italic.ttf second-italic.ttf
rm -rf build

echo "Installing fontc"
(cd fontc && cargo install --path .)
which fontc

echo "Building the first roman"
time fontc ../googlesans/source/GoogleSans/GoogleSans.designspace
mv build/font.ttf ./first-roman.ttf

echo "Building the first italic"
time fontc ../googlesans/source/GoogleSans/GoogleSans-Italic.designspace
mv build/font.ttf ./first-italic.ttf

echo "Building the second roman"
time fontc ../googlesans/source/GoogleSans/GoogleSans.designspace
mv build/font.ttf ./second-roman.ttf

echo "Building the second italic"
time fontc ../googlesans/source/GoogleSans/GoogleSans-Italic.designspace
mv build/font.ttf ./second-italic.ttf

echo "diff {first,second}-roman.ttf"
diff -u <(ttx -l first-roman.ttf | tail -n +2) <(ttx -l second-roman.ttf | tail -n +2)
cmp first-roman.ttf second-roman.ttf

if [ $? -ne 0 ]; then
  echo "error: non repeatable build :("
fi

echo "diff {first,second}-italic.ttf"
diff -u <(ttx -l first-italic.ttf | tail -n +2) <(ttx -l second-italic.ttf | tail -n +2)
cmp first-italic.ttf second-italic.ttf

if [ $? -ne 0 ]; then
  echo "error: non repeatable build :("
  exit 1
else
  echo "success! :)"
  exit 0
fi

@anthrotype
Copy link
Member Author

i'm going crazy chasing this heisenbug 🤯

In #676 I added a shell script (so that I can try to run the same commands locally and on CI) that builds a font twice and diff the ttx -l and cmp the binary fonts but now I can't seem to reproduce the failure any more on the CI even (locally, I never managed to repro).

@anthrotype
Copy link
Member Author

and to add insult to injury, my repeatable-builds.sh script is now failing with Oswald, complains the gvar bytes changed!

$ resources/scripts/repeatable-builds.sh ../OswaldFont/sources/Oswald.glyphs
$ which fontc
/usr/local/google/home/clupo/.cargo/bin/fontc
$ fontc ../OswaldFont/sources/Oswald.glyphs -o build/Oswald.ttf
$ fontc ../OswaldFont/sources/Oswald.glyphs -o build/Oswald#1.ttf
$ ttcmp build/Oswald.ttf build/Oswald#1.ttf
build/Oswald.ttf build/Oswald#1.ttf differ: byte 177, line 3
--- /dev/fd/63	2024-01-12 16:03:31.330910933 +0000
+++ /dev/fd/62	2024-01-12 16:03:31.330910933 +0000
@@ -10,7 +10,7 @@
     cmap  0x8D57590D      6212     38860
     fvar  0x8D346992        84     45072
     glyf  0xDEF86C47     43674     45156
-    gvar  0x50120D8C     71016     88832
+    gvar  0xD0128D8C     71016     88832
     head  0x1ACDE23F        54    159848
     hhea  0x08AE0978        36    159904
     hmtx  0xFFDF6CD1      3532    159940
Error: build/Oswald.ttf and build/Oswald#1.ttf are different

But only sometimes.. 😢

$ resources/scripts/repeatable-builds.sh ../OswaldFont/sources/Oswald.glyphs
Success: build/Oswald.ttf and build/Oswald#1.ttf are identical

@rsheeter rsheeter self-assigned this Jan 12, 2024
@rsheeter
Copy link
Contributor

I can reproduce GS build instability locally as follows:

rm -f *.ttxl threads*.svg && for i in {0..4}; do rm -rf build/ && cargo run ../googlesans/source/GoogleSans/GoogleSans.designspace --emit-timing && cp build/threads.svg ./threads_$i.svg && ttx -l build/font.ttf > run$i.ttxl && grep GPOS *.ttxl; done

Looking at the threads_#.svg for a failure (small GPOS) shows we are running gather BE kerning before we generate it:

0ms (0.00%) Be(GatherBeKerning) run at 7527ms done at 7528ms

195ms (1.81%) Be(KernFragment(0)) run at 7690ms done at 7886ms
...etc...

@anthrotype
Copy link
Member Author

anthrotype commented Jan 15, 2024

I managed to isolate the gvar indeterminism, it's the shared tuples array that may end up being sorted differently each time because we are using a HashMap to count how many times a given tuple is shared among glyph tuple variations, and then use sort_unstable_by_key to take the max, but the sorting method does not preserve the order of equal elements (and HashMap also is inherently unsorted so iteration order isn't guaranteed either). If I use IndexMap::sort_by_key the order is stable and builds are reproducible again. I'll send a PR shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants