Skipping joiners during GSUB #33

Open
adrianwong opened this issue Oct 2, 2020 · 5 comments


@adrianwong
Member

If a ZWJ is placed between two characters with the intent that the two should ligate, but a font has not provided a lookup that uses the ZWJ, should the ZWJ be skipped in order to allow the ligation?

Currently, Allsorts doesn't skip the ZWJ. Here is an example using Noto Serif:

[Image: Screenshot from 2020-10-02 15-06-18]
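
To make the question concrete, here is a minimal Python sketch of matching a ligature's component sequence while stepping over a ZWJ. It is not Allsorts or HarfBuzz code: the `match_ligature` function, its `skip` parameter, and the use of characters in place of glyph IDs are all assumptions made for illustration.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

def match_ligature(glyphs, start, components, skip=frozenset({ZWJ})):
    """Return the index just past the match, or None if there is no match.

    `glyphs` is a list of characters standing in for glyph IDs (an assumption).
    Glyphs listed in `skip` are stepped over when they sit between components.
    """
    i = start
    for n, component in enumerate(components):
        # Only skip between components, never before the first one.
        while n > 0 and i < len(glyphs) and glyphs[i] in skip:
            i += 1
        if i >= len(glyphs) or glyphs[i] != component:
            return None
        i += 1
    return i

text = ["f", ZWJ, "i"]
print(match_ligature(text, 0, ["f", "i"]))                    # 3: ZWJ skipped, f + i match
print(match_ligature(text, 0, ["f", "i"], skip=frozenset()))  # None: ZWJ blocks the match
```

Passing an empty `skip` set mirrors the behaviour described above, where the ZWJ is treated as an ordinary glyph and blocks the ligature.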

Likewise, should joiners (both ZWJ and ZWNJ) be skipped in backtrack/lookahead sequences?

Allsorts doesn't do this either. Using Noto Serif again, this is a chaining contextual lookup where "i" is the input, and the tilde is the lookahead:

[Image: Screenshot from 2020-10-02 16-17-06]
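
The lookahead case can be sketched the same way. This is a hypothetical helper, not code from either shaper: `lookahead_matches` and the `JOINERS` set are names invented here, and characters again stand in for glyph IDs.

```python
JOINERS = frozenset({"\u200c", "\u200d"})  # ZWNJ, ZWJ

def lookahead_matches(glyphs, pos, lookahead, skip=JOINERS):
    """Check whether `lookahead` matches the glyphs after index `pos`.

    Glyphs in `skip` are ignored entirely while scanning forward.
    """
    remaining = (g for g in glyphs[pos + 1:] if g not in skip)
    for expected in lookahead:
        if next(remaining, None) != expected:
            return False
    return True

TILDE = "\u0303"  # COMBINING TILDE
text = ["i", "\u200c", TILDE]
print(lookahead_matches(text, 0, [TILDE]))                    # True: ZWNJ ignored
print(lookahead_matches(text, 0, [TILDE], skip=frozenset()))  # False: ZWNJ is in the way
```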

HarfBuzz skips the joiners in both cases. Relevant links here and here. Should Allsorts follow suit?

@behdad

behdad commented Oct 2, 2020

Obviously I think what we are doing in HarfBuzz is the best way to interpret Unicode with OpenType. I've made all those decisions with @jfkthame.

@adrianwong
Member Author

Thanks for your input, @behdad. Skipping ZWJs seems like the sensible thing to do, as one would expect that a ZWJ should not block ligation.

However, can you elaborate on why skipping ZWNJs in backtrack/lookahead sequences is "the right thing to do" and "that backtrack/lookahead should match the actual glyphs; presence or lack thereof joiners should not affect what matches"? I'm referencing your comments in the links I shared, and am curious as to how you (or both of you) arrived at that conclusion.

Perhaps this behaviour should be documented in the OpenType shaping docs, if it isn't documented anywhere else (cc @n8willis).

@mikeday
Contributor

mikeday commented Oct 2, 2020

I'm also curious, @behdad: would you happen to have a good example of how joiners might be used in practice that could interfere with the contextual lookup process?

The example of f + zwj + i seems very reasonable, but placing a zwnj between base and mark is more of a contrived example to demonstrate the shaping behaviour. Is there a situation where something like this could actually happen in Arabic text?

@behdad

behdad commented Oct 2, 2020

Here. A purely demonstrative example:

Imagine you have a rule that ligates C,D but only if preceded by the glyphs A,B. That is, A,B,[C,D] is the sequence, with the A,B part being the backtrack.

Now if input is A,ZWNJ,B,C,D, we STILL want the rule to match. That's why we ignore ZWNJ in backtrack/lookahead.
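
A rough Python sketch of that example, under stated assumptions: `backtrack_matches` and `apply_rule` are hypothetical names, strings stand in for glyph IDs, `"C_D"` is a placeholder for the ligature glyph, and the backtrack is checked against the input list rather than an output buffer. It is meant to illustrate the idea, not to reproduce HarfBuzz's implementation.

```python
ZWNJ = "\u200c"
ZWJ = "\u200d"
JOINERS = frozenset({ZWNJ, ZWJ})

def backtrack_matches(glyphs, pos, backtrack, skip=JOINERS):
    """Match `backtrack` (given in logical order) against the glyphs before `pos`.

    Walks backwards from `pos`, stepping over any glyph in `skip`.
    """
    i = pos - 1
    for expected in reversed(backtrack):
        while i >= 0 and glyphs[i] in skip:
            i -= 1
        if i < 0 or glyphs[i] != expected:
            return False
        i -= 1
    return True

def apply_rule(glyphs, skip=JOINERS):
    """Replace C,D with "C_D" wherever the pair is preceded by A,B."""
    out = []
    i = 0
    while i < len(glyphs):
        if glyphs[i:i + 2] == ["C", "D"] and backtrack_matches(glyphs, i, ["A", "B"], skip):
            out.append("C_D")
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out

text = ["A", ZWNJ, "B", "C", "D"]
print(apply_rule(text) == ["A", ZWNJ, "B", "C_D"])  # True: rule still matches
print(apply_rule(text, skip=frozenset()) == text)   # True: without skipping, nothing changes
```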

@mikeday
Contributor

mikeday commented Oct 3, 2020

Thanks, that seems reasonable. Perhaps we can scan some fonts for suitable ligatures and find real examples of that case.
