Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cache key collision for fonts with bad TTF metadata #1384

Closed

Conversation

jakubdatawrapper
Copy link
Contributor

@jakubdatawrapper jakubdatawrapper commented Aug 5, 2022

Steps to reproduce

See the attached test file.

In general:

  1. Have two fonts with bad fullname TTF metadata. For example these two both have fullname "-":

    $ fc-query NeueHaasGrotesk-Regular.otf
    Pattern has 26 elts (size 32)
    	family: "NeueHaasGrotesk-Regular"(s)
    	familylang: "en"(s)
    	style: "Regular"(s)
    	stylelang: "en"(s)
        	fullname: "-"(s)
        	fullnamelang: "en"(s)
        	slant: 0(i)(s)
        	weight: 80(f)(s)
    	...
    $ fc-query NeueHaasGrotesk-Medium.otf 
    Pattern has 26 elts (size 32)
    	family: "NeueHaasGrotesk-Medium"(s)
    	familylang: "en"(s)
    	style: "Regular"(s)
    	stylelang: "en"(s)
    	fullname: "-"(s)
    	fullnamelang: "en"(s)
    	slant: 0(i)(s)
    	weight: 100(f)(s)
    
  2. Use the fonts in pdfkit:

    doc.registerFont('My Regular Font', fs.readFileSync('./NeueHaasGrotesk-Regular.otf'));
    doc.registerFont('My Medium Font', fs.readFileSync('./NeueHaasGrotesk-Medium.otf'));
    
    doc.font('My Regular Font');
    doc.text('Regular');
    
    doc.font('My Medium Font');
    doc.text('Medium');
    
    doc.font('My Regular Font');
    doc.text('Regular');
  3. See that all the text is rendered with the first font:

    2022-08-05_11-28-17_1090x820

  4. Expected:

    2022-08-05_11-28-40_1085x818

Problem

The problem is that doc.font() uses the TTF fullname as the key in _fontFamilies. And because both fonts have the same fullname, a cache key conflict happens and doc.font() will always load the first font from the _fontFamilies cache, even when the second font is requested.

// check for existing font familes with the same name already in the PDF
// useful if the font was passed as a buffer
if ((font = this._fontFamilies[this._font.name])) {
this._font = font;
return this;
}

Solution

The solution is to not use TTF fullname as the cache key, when there is another name available -- such as when the font has been registered.

So this solution to the bad fullname problem works only when the fonts are registered in advance. If we wanted to make it work without pre-registering, then we would need to somehow detect what a bad fullname is (e.g. check that fullname !== '-'), but that feels fragile. So I think requiring font registration in this case is acceptable.

Refactoring

I've removed the following repeated setting of _fontFamiliies to set it only once. I couldn't think of a reason why it would need to be done twice with different cache keys:

// save the font for reuse later
if (cacheKey) {
this._fontFamilies[cacheKey] = this._font;
}
if (this._font.name) {
this._fontFamilies[this._font.name] = this._font;
}

Then since there is only one cache key now, I've renamed it to name.

Unit tests

I've added some very simple unit tests. They fail like this before the change:

 FAIL  tests/unit/document.spec.js
  ● PDFDocument › FontsMixin › font › saves a registered font to _fontFamilies and reuses the same font object

    expect(received).toEqual(expected) // deep equality

    - Expected  - 0
    + Received  + 1

      Array [
        "Helvetica",
        "My Roboto",
    +   "Roboto-Regular",
      ]

      83 |         const fontObj2 = doc._fontFamilies['My Roboto'];
      84 | 
    > 85 |         expect(Object.keys(doc._fontFamilies)).toEqual([
         |                                                ^
      86 |           'Helvetica',
      87 |           'My Roboto'
      88 |         ]);

Checklist

  • Unit Tests
  • Documentation
  • Update CHANGELOG.md
  • Ready to be merged

@jakubdatawrapper
Copy link
Contributor Author

jakubdatawrapper commented Aug 5, 2022

The test file:

const PDFDocument = require('pdfkit');
const fs = require('fs');
const fetch = require('node-fetch');

const urlRegular =
    'https://use.typekit.net/af/2807c7/00000000000000007735bb48/30/a?primer=7cdcb44be4a7db8877ffa5c0007b8dd865b3bbc383831fe2ea177f62257a9191&fvd=n4&v=3';
const urlMedium =
    'https://use.typekit.net/af/153042/00000000000000007735bb62/30/a?subset_id=2&fvd=n6&v=3';

const doc = new PDFDocument({ autoFirstPage: false });

doc.pipe(fs.createWriteStream('test_fullname_conflict.pdf'));

doc.addPage({ size: [400, 300] });
doc.fontSize(24);

Promise.all([
    fetch(urlRegular).then(res => res.arrayBuffer()),
    fetch(urlMedium).then(res => res.arrayBuffer())
]).then(([fontRegular, fontMedium]) => {
    doc.registerFont('My Regular Font', fontRegular);
    doc.registerFont('My Medium Font', fontMedium);

    doc.font('My Regular Font');
    doc.text('Regular');

    doc.font('My Medium Font');
    doc.text('Medium');

    doc.font('My Regular Font');
    doc.text('Regular');

    doc.end();
});

@jakubdatawrapper
Copy link
Contributor Author

@blikblum Would you (or another maintainer) have time to look into this PR? Do you need more information from me or should I make some changes to the PR?

@jakubdatawrapper jakubdatawrapper force-pushed the fix/font-reg-name-overwrite branch from 78a266e to 4ea2b41 Compare February 15, 2023 18:14
liborm85 added a commit that referenced this pull request Dec 27, 2024
@liborm85
Copy link
Collaborator

@jakubdatawrapper Font cache collision fixed by commit 42172c5.

Now, if the font name matches, the information/metadata attributes are also checked (there will probably always be some difference between for different fonts).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants