[glyphdata] handle production names for ligatures with script suffixes #771

khaledhosny · 2022-01-31T21:56:05Z

We had them handled in category detection but not in production names.
This change generalizes this and adds a few more conditions:

Don’t add a script suffix to a ligature part that already has one.
Don’t add it if the part has an entry in GlyphData (to handle cases like "brevecomb_Dboldscript-math").

Should fix #770
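
As a rough illustration of the two conditions above (not the code in this PR), the splitting could be sketched as follows. `KNOWN_NAMES` and `split_ligature` are hypothetical names, the set is a tiny stand-in for GlyphData, and the sketch assumes names without dot suffixes.

```python
# Hypothetical stand-in for the names that have an entry in GlyphData.
KNOWN_NAMES = {"brevecomb", "Dboldscript-math", "lam-ar", "alef-ar"}


def split_ligature(name, known_names=KNOWN_NAMES):
    """Split a ligature name into parts and propagate the script suffix.

    Illustrative only: assumes no dot suffixes such as ".fina".
    """
    parts = name.split("_")
    # The script suffix, if any, is written once, on the last part.
    _, sep, script = parts[-1].partition("-")
    if not sep:
        return parts  # no script suffix to propagate
    suffix = "-" + script
    result = []
    for part in parts:
        if "-" in part:
            # Condition 1: the part already has a script suffix.
            result.append(part)
        elif part in known_names:
            # Condition 2: the bare part has its own entry.
            result.append(part)
        else:
            result.append(part + suffix)
    return result


print(split_ligature("lam_alef-ar"))
print(split_ligature("brevecomb_Dboldscript-math"))
```

With this stand-in data, "lam_alef-ar" gets the suffix propagated to "lam", while "brevecomb" stays bare because it has its own entry.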

khaledhosny · 2022-01-31T21:57:03Z

@madig feel free to add more tests. This probably does not fix all the differences you are seeing.

khaledhosny · 2022-01-31T22:08:10Z

The regression test failure is because we now detect proper names for more ligatures.

madig · 2022-02-01T12:00:32Z

That was fast. Thanks! A ripgrep and sort through the diffs shows all new names are uni-names now, so that's good.

I added more test cases and found several diffs, though. I'm not entirely sure whether some of them are due to outdated info in GlyphData.xml. I used Glyphs 2.6.6 to generate the expected values.

khaledhosny · 2022-02-01T12:42:01Z

With ef6fed2 the failures are down to 16 (was 70).

khaledhosny · 2022-02-01T13:10:00Z

The last commit should have fixed a bunch of failures, but it does not. We now generate uni names for the affected glyphs, but they are different from Glyphs’.

khaledhosny · 2022-02-01T14:14:11Z

Remaining failures:

FAILED tests/glyphdata_test.py::test_prod_names[sh_r-deva-uni0936094D0930094D] - AssertionError: assert 'uni0936094D094D0930' == 'uni0936094D0930094D'
FAILED tests/glyphdata_test.py::test_prod_names[idotaccent.sc-i.sc.loclTRK] - AssertionError: assert 'i.loclTRK.sc' == 'i.sc.loclTRK'
FAILED tests/glyphdata_test.py::test_prod_names[t_r-deva-uni0924094D0930094D] - AssertionError: assert 'uni0924094D094D0930' == 'uni0924094D0930094D'

I think we should just accept the idotaccent.sc one as an insignificant difference, but I have no idea what is causing the difference in the other two. We are splitting the ligatures correctly and getting the correct part names from GlyphData, and I have no idea where Glyphs is pulling the difference from (AFAICT the relevant entries are the same in our copy of the data and Glyphs’).
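
For illustration only, here is a minimal sketch of how per-part uni names might be concatenated into one production name. `join_uni_names` is a hypothetical helper, and the two part names are assumed values picked so that the result matches the left-hand side of the sh_r-deva failure above; they are not checked against GlyphData.xml.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def _is_uni_name(name):
    """True for names like uni0936 or uni0936094D (groups of 4 hex digits)."""
    digits = name[3:]
    return (
        name.startswith("uni")
        and len(digits) > 0
        and len(digits) % 4 == 0
        and all(ch in HEX_DIGITS for ch in digits)
    )


def join_uni_names(part_names):
    """Concatenate part names; hypothetical helper, not glyphsLib API."""
    if all(_is_uni_name(name) for name in part_names):
        # All parts are uni names: merge the hex digits in part order.
        return "uni" + "".join(name[3:] for name in part_names)
    # Otherwise fall back to joining the parts with underscores.
    return "_".join(part_names)


# Assumed part names, for illustration only.
print(join_uni_names(["uni0936094D", "uni094D0930"]))
```

Plain concatenation in part order gives uni0936094D094D0930. The name Glyphs produces has the last two code points in the opposite order, which simple concatenation of these parts cannot yield.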

anthrotype · 2022-04-06T14:18:10Z

what's the status of this PR?

schriftgestalt · 2022-04-06T14:24:41Z

I’m reworking the glyph data handling quite a bit right now so that is probably not needed any more.

anthrotype · 2022-05-04T09:33:23Z

I’m reworking the glyph data handling quite a bit right now so that is probably not needed any more.

how's that going?

khaledhosny · 2023-07-23T16:19:42Z

I updated the expectations of the three remaining test failures. These should be investigated, but the failures are unrelated to splitting ligatures: we are doing this correctly, but Glyphs is doing something more here.

If there are no objections, I’m going to merge this.

khaledhosny · 2023-07-23T16:22:21Z

(As said previously, the regression failures are this PR working and handling more ligatures than before).

anthrotype · 2023-07-24T08:48:21Z

maybe instead of making the test pass you could mark them as xfail?

anthrotype · 2023-07-24T08:50:42Z

tests/glyphdata_test.py

+    "sa_iiMatra-tamil": "uni0BB80BC0",
+    "sa_uMatra-tamil": "uni0BB80BC1",
+    "sa_uuMatra-tamil": "uni0BB80BC2",
+    "sh_r-deva": "uni0936094D094D0930",  # uni0936094D0930094D


I'm not familiar with Devanagari conjuncts, but it looks like there's some reordering that is (or is not) happening here?
/cc @schriftgestalt

- uni0936094D094D0930
+ uni0936094D0930094D

khaledhosny · 2023-08-17

I think this is addressed by the mini Devanagari shaper in #818, so I’m not going to worry about it here.

anthrotype · 2023-07-24T08:58:15Z

+1 to merge this as it improves the current handling of ligatures' production names; as for any outstanding issues left, we could track them in their own issue and tackle them later

schriftgestalt · 2023-07-24T11:55:08Z

That should be handled by my pull request that was never accepted.

khaledhosny · 2023-07-24T22:23:26Z

That should be handled by my pull request that was never accepted.

Which one, #818?

If you want to finish it in some reasonable time frame, I’m more than happy to close this one instead, otherwise I’m currently blocked on this issue and need a fix ASAP.

anthrotype · 2023-07-25T09:30:05Z

That should be handled by my pull request that was never accepted.

hey Georg, the reason the other PR wasn't accepted is because it doesn't seem to be in a mergeable state currently; it's got a "WIP" in the title, there's no high level description of what it intends to do, it's got a long/noisy commit log which ideally could be squashed, a bunch of commented out code, leftover debug() statements, something called _applySimpleIndicShaping which is anything but "simple"...
Khaled's PR is smaller and more targeted, so I'd prefer we merge this one first.

schriftgestalt · 2023-07-25T12:20:26Z

It was mergeable when I opened the pull request.

schriftgestalt · 2023-07-25T12:26:08Z

My PR fixes and improves several areas (names, indo, writing direction …). Adding more changes now will make it even more complicated to merge it.
And that code is needed also for round tripping Glyphs 2 and 3 files.

khaledhosny · 2023-07-29T14:23:34Z

So what is the resolution here? #818 is unmergeable as it stands and I’m still blocked.

schriftgestalt · 2023-07-29T14:35:44Z

I plan to work on glyphsLib next week, finishing the Glyphs3 branch and rebasing the info branch.

khaledhosny · 2023-08-16T21:58:56Z

I hate to keep saying this, but this PR has been open for over a year and a half, it has been 3 weeks since I rebased it, and I’m still blocked. I suggest merging this, and I’ll try to help with rebasing #818 and solving the conflicts afterwards.

Since “d” has an entry in GlyphData, we were not adding the script suffix to it. Now we check if both “d” exists and “d-deva” doesn’t before deciding not to add the suffix.

This way alternative names are always considered. This was supposed to fix names like “sh_r-deva”, but though we now produce uni names, they are still different from Glyphs’.

In glyph names like “po-khmer.below.ro” we were discarding all suffixes while searching for the base glyph name, but (surprisingly) “po-khmer.below” has an entry in GlyphData and should be the base glyph name instead of “po-khmer” (yay for consistent naming scheme). Fix this by first looking up the name with its suffixes, one after another.
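
One possible reading of this lookup order, sketched with hypothetical names (`find_base_name` and the `known` set are stand-ins, not the glyphsLib implementation): try the full name first, then drop one trailing dot suffix at a time, so the longest name with an entry wins.

```python
def find_base_name(name, known_names):
    """Return (base_name, remaining_suffix), preferring the longest known name.

    Illustrative sketch only; known_names stands in for GlyphData.
    """
    pieces = name.split(".")
    # Try the full name first, then drop one trailing suffix at a time.
    for end in range(len(pieces), 0, -1):
        candidate = ".".join(pieces[:end])
        if candidate in known_names:
            return candidate, ".".join(pieces[end:])
    # Nothing matched: fall back to treating everything after the first dot
    # as the suffix, which is the older behaviour described above.
    return pieces[0], ".".join(pieces[1:])


known = {"po-khmer", "po-khmer.below"}
print(find_base_name("po-khmer.below.ro", known))
print(find_base_name("po-khmer.alt", known))
```

With this stand-in data, “po-khmer.below.ro” resolves to the base “po-khmer.below” with “ro” left over, while “po-khmer.alt” still falls back to “po-khmer”.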

Handle names like “moMa_underscore-thai”, which should be split into “moMa-thai” and plain “underscore”, not “underscore-thai”. I updated the test expectation because Glyphs is just being silly, building production names inconsistently.
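
The suffix decision described for “d_ba-deva” and “moMa_underscore-thai” can be sketched as a single check. This is a simplified illustration under assumed data: `needs_script_suffix` is a hypothetical helper, and the `known` set only contains the entries needed for these two examples.

```python
def needs_script_suffix(part, suffix, known_names):
    """Decide whether a bare ligature part should get the script suffix.

    The suffix is skipped only when the bare name has an entry and the
    suffixed name does not. Illustrative sketch, not the glyphsLib code.
    """
    if part in known_names and (part + suffix) not in known_names:
        return False
    return True


# Assumed entries, just enough for the two examples.
known = {"d", "d-deva", "ba-deva", "underscore", "moMa-thai"}

# "d" and "d-deva" both exist, so the suffix is still added.
print(needs_script_suffix("d", "-deva", known))
# "underscore" exists but "underscore-thai" does not, so it stays bare.
print(needs_script_suffix("underscore", "-thai", known))
```

Keeping the check as a small predicate makes the asymmetry explicit: an entry for the bare name alone is not enough to skip the suffix.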

These should be fixed, but I think we are missing some Glyphs magic here unrelated to the splitting of the ligature parts.

anthrotype · 2023-08-20T19:22:14Z

Feel free to merge this as soon as you make the CI pass, thanks

Something seems to be wrong with it. Windows CI job is failing with:

py: exit 1 (6.58 seconds) D:\a\glyphsLib\glyphsLib> python -I -m pip install -r requirements.txt -r requirements-dev.txt pid=3368
ERROR: THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
py: FAIL code 1 (11.77 seconds)
evaluation failed :( (12.05 seconds)
lxml==4.9.1 from https://files.pythonhosted.org/packages/5b/2a/b29ca0616397e6d5608255cd0f635a6786892fec898eb65fe8aa4347e9c0/lxml-4.9.1-cp37-cp37m-win_amd64.whl (from -r requirements-dev.txt (line 49)):
Expected sha256 eea5d6443b093e1545ad0210e6cf27f920482bfcf5c77cdc8596aec73523bb7e
Got 406f265a44905c7de1e6f3d598d451a56652abf587acab28a4906896da56d434

khaledhosny · 2023-08-20T20:13:30Z

BTW, I tried to rebase #818 on current master, and there are significant conflicts, so I don’t think merging this is going to make it much worse.

khaledhosny force-pushed the issue-770 branch 2 times, most recently from df01a91 to 49fb300 Compare February 1, 2022 14:22

khaledhosny mentioned this pull request Feb 22, 2022

Support Glyphs3 RTL kerning #778

Closed

khaledhosny force-pushed the issue-770 branch from 49fb300 to 5dced5c Compare July 23, 2023 15:49

khaledhosny requested a review from anthrotype July 23, 2023 16:19

anthrotype reviewed Jul 24, 2023

khaledhosny and others added 3 commits August 20, 2023 21:58

Add more testcases

21df623

[glyphdata] Handle production names for glyphs like d_ba-deva

42ed171

Since “d” has an entry in GlyphData, we were not adding the script suffix to it. Now we check if both “d” exists and “d-deva” doesn’t before deciding not to add the suffix.

khaledhosny added 4 commits August 20, 2023 21:58

[glyphdata] Always use _lookup_attributes()

658f11e

This way alternative names are always considered. This was supposed to fix names like “sh_r-deva”, but though we now produce uni names, they are still different from Glyphs’.

[glyphdata] Change a few more test expectations

0618c67

These should be fixed, but I think we are missing some Glyphs magic here unrelated to the splitting of the ligature parts.

khaledhosny force-pushed the issue-770 branch from 78f641f to 0618c67 Compare August 20, 2023 18:58

khaledhosny force-pushed the issue-770 branch 2 times, most recently from d5e00f6 to e9789db Compare August 20, 2023 20:03

khaledhosny force-pushed the issue-770 branch from e9789db to 7a3271b Compare August 20, 2023 20:06

khaledhosny merged commit 273496b into main Aug 20, 2023
9 of 10 checks passed

khaledhosny deleted the issue-770 branch August 20, 2023 20:14
