Add the idDelta arithmetic formula to the spec #980

Closed
Pomax opened this issue Oct 11, 2022 · 7 comments

Pomax commented Oct 11, 2022

Based on HinTak/Font-Validator#72 it is clear that the text concerning idDelta arithmetic in the Format 2 section may be ambiguous. To remove that ambiguity, it would be good to explicitly add the arithmetic formula to the section. I suggest updating the text from:

Finally, if the value obtained from the subarray is not 0 (which indicates the missing glyph), you should add idDelta to it in order to get the glyphIndex. The value idDelta permits the same subarray to be used for several different subheaders. The idDelta arithmetic is modulo 65536.

to

Finally, if the value obtained from the subarray is not 0 (which indicates the missing glyph), you should add idDelta to it in order to get the glyphIndex. The value idDelta permits the same subarray to be used for several different subheaders. The idDelta arithmetic is modulo 65536. In C, the following function can be used to calculate the glyph ID:

uint16
CalcGlyphID(uint16 code, int16 idDelta)
{
    /* 0x10000 == 65536, the modulus named in the text above */
    return (code + idDelta) % 0x10000;
}

(using phrasing similar to https://learn.microsoft.com/en-us/typography/opentype/spec/otff#calculating-checksums)

And then with a similar update to the subsequent Format 4 section to show a C function after

The value c is the character code in question, and i is the segment index in which c appears. If the value obtained from the indexing operation is not 0 (which indicates missingGlyph), idDelta[i] is added to it to get the glyph index. The idDelta arithmetic is modulo 65536.

If the idRangeOffset is 0, the idDelta value is added directly to the character code offset (i.e. idDelta[i] + c) to get the corresponding glyph index. Again, the idDelta arithmetic is modulo 65536.
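For illustration, the two rules quoted above could be sketched in C roughly as follows. This is only a sketch, not proposed spec text: the helper names `glyph_from_delta` and `glyph_from_value` are hypothetical, and the code assumes the `<stdint.h>` fixed-width types stand in for the spec's uint16 and int16. It performs the addition on unsigned values so that the wrap-around is defined by the language.

```c
#include <stdint.h>

/* Hypothetical helper: idRangeOffset == 0 case, i.e. idDelta[i] + c.
 * Converting int16_t to uint16_t is defined as reduction modulo 65536,
 * so a negative delta becomes its modulo-65536 equivalent first. */
static uint16_t glyph_from_delta(uint16_t c, int16_t id_delta)
{
    uint32_t sum = (uint32_t)c + (uint16_t)id_delta;
    return (uint16_t)(sum & 0xFFFFu); /* keep the low 16 bits: modulo 65536 */
}

/* Hypothetical helper: value already read from the glyph ID array
 * (Format 4 with idRangeOffset != 0, or a Format 2 subarray). */
static uint16_t glyph_from_value(uint16_t value, int16_t id_delta)
{
    if (value == 0)
        return 0; /* missing glyph: idDelta is not applied */
    return glyph_from_delta(value, id_delta);
}
```

With these assumptions, a character code of 0x41 and an idDelta of -29 would map to glyph 36, and a value of 10 with an idDelta of -20 would wrap around to glyph 65526.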


@PeterCon
Collaborator

The proposed snippet could be helpful to add... if it were completely unambiguous to readers what was implied.

It turns out this would have been easier if the type for idDelta had been kept as unsigned, as it was in the original TrueType spec (with a bit of explanation for the examples that use negative values): in the C/C++ language specifications, "overflow" behaviour is well defined for unsigned types (the result wraps), but not for signed types.

The suggested snippet does actually work as intended, but the reasons why it works may not be obvious: IIUC, both operands are "promoted" to an int, which is wide enough for either value (details here); then the modulo ensures the result fits in a uint16. I need to figure out the best way to explain that in a way that won't require someone to digest the (pretty dense) language specifications (or to ask a compiler developer to explain, which is what I did).

@PeterCon
Collaborator

... then the modulo ensures the result fits in a uint16.

It occurred to me that's not so obvious if idDelta is < 0. And it turns out that's not an accurate assumption. Referring to Kernighan & Ritchie 2nd edn,

The direction of truncation for / and the sign of the result for % are machine-dependent for negative operands, as is the action taken on overflow or underflow.

And, IIUC from the C++ language spec, the result of the modulo operation could be negative. (Both operands are "promoted" to int, then remainder is taken from division of the promoted operands.)

All that to say, I think it will take more than just adding that C snippet to make the intent unambiguously clear.
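A small C sketch of the point about negative remainders, under stated assumptions: `int` is wider than 16 bits, and the compiler follows C99 or later, where division truncates toward zero and the remainder therefore takes the sign of the dividend (the K&R quote describes the older, machine-dependent rule). The function names are hypothetical.

```c
#include <stdint.h>

/* The remainder exactly as `%` computes it after both operands are
 * promoted to int. Under C99 and later it is negative whenever
 * code + id_delta is negative. */
static int raw_remainder(uint16_t code, int16_t id_delta)
{
    return (code + id_delta) % 0x10000;
}

/* Converting the (possibly negative) remainder to uint16_t reduces it
 * modulo 65536; it is this conversion, not the `%`, that makes the
 * result a valid glyph ID. */
static uint16_t wrapped_remainder(uint16_t code, int16_t id_delta)
{
    return (uint16_t)raw_remainder(code, id_delta);
}
```

For a code of 10 and an idDelta of -20, `raw_remainder` yields -10 while `wrapped_remainder` yields 65526, which is why the snippet's correctness depends on the conversion at the return rather than on the modulo itself.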

@behdad

behdad commented Dec 10, 2022

Similar issue exists with GSUB SingleSubstFormat1 that currently has:

int16 | deltaGlyphID

@PeterConstable

The cmap and GSUB specs could be revised to use uint16, and then add wording to explain the example with negative deltas. But it might be easier simply to clarify the "addition is modulo 65536" statement by adding, "if the result after adding the delta is negative, add 65536 to yield a non-negative glyph ID".
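As a rough sketch of that wording, the rule could be written in C without relying on `%` or on implicit conversions at all. The name `apply_delta` is hypothetical, and the branch for sums above 65535 is an assumption added here for completeness: the quoted sentence only covers the negative case, but "modulo 65536" implies the upper wrap as well.

```c
#include <stdint.h>

/* Hypothetical helper: add a signed 16-bit delta (cmap idDelta or GSUB
 * deltaGlyphID) to a 16-bit value, wrapping modulo 65536. */
static uint16_t apply_delta(uint16_t value, int16_t delta)
{
    /* int32_t comfortably holds the full range -32768 .. 98302. */
    int32_t sum = (int32_t)value + (int32_t)delta;

    if (sum < 0)
        sum += 65536;      /* the case described in the proposed wording */
    else if (sum > 65535)
        sum -= 65536;      /* assumed: wrap at the upper end too */

    return (uint16_t)sum;  /* now guaranteed to be within 0 .. 65535 */
}
```

Under these assumptions, a value of 10 with a delta of -20 gives 65526, and a value of 65535 with a delta of 1 gives 0.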

@PeterCon
Collaborator

Wrt SingleSubstFormat1.deltaGlyphID, that was always int16, since TTO. The remark in the spec about using modulo 65536 arithmetic was introduced in OT183, and I haven't found what gave rise to that.

@PeterCon
Collaborator

I didn't add the C snippet suggested as that doesn't eliminate ambiguities without digesting arcane details of C language specification, but I did add a clarifying statement as suggested two days ago. I made that change for cmap formats 2 and 4 and also for GSUB SingleSubstFormat1.

See OT 1.9.1 alpha for draft revisions addressing this issue.

@PeterCon
Collaborator

This is fixed in OT 1.9.1. See the OT 1.9.1 beta for draft revisions addressing this issue.
