Fix endnotes issue #561

kalaspuffar · 2023-04-27T07:47:56Z

This PR tries to solve the issue of the endnote role not being allowed, which is described in more detail in #556.

I made two changes to the rules.

The endnote role is no longer used to locate backlinks when verifying that all backlinks are correct; the endnote epub:type is used instead.
dc:title metadata elements are concatenated with string-join using the separator " - ", so each part of the title is separated by a space, a dash, and a space. The result is then checked against the title of each content document.

As you can see, there are many changes, but the main one is that I created a test to verify the validity of our test case, and it turned up many issues:

the title of the content documents did not match the package title
ids were not unique
some page breaks used title instead of aria-label
referenced notes were missing
dc:identifier in the content documents did not match the dc:identifier in the package file
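
As an illustration of the second item, a check for repeated ids could look something like the Python sketch below. This is not the test added in this PR; the function name and sample document are hypothetical, and counting ids across all supplied documents (rather than per document) is an assumption.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def duplicate_ids(documents):
    """Return the sorted id values that occur more than once across the given XHTML strings."""
    counts = Counter()
    for xhtml in documents:
        root = ET.fromstring(xhtml)
        # Count every id attribute found on any element of the document.
        counts.update(
            el.get("id") for el in root.iter() if el.get("id") is not None
        )
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical content document with one repeated id.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p id="n1"/><p id="n1"/><p id="n2"/>'
    "</body></html>"
)

print(duplicate_ids([DOC]))
```

Passing a single-element list restricts the check to one document, if per-document uniqueness is what is wanted.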

I'm not sure that reviewing every test change is worthwhile, but you may have input on the minor rule changes I made.

Best regards
Daniel

josteinaj · 2023-05-02T17:45:52Z

Thanks @kalaspuffar!

I'm only wondering about the dc:title element in the OPF.

From the specification:

Reading Systems MUST recognize the first title element in document order as the main title of the EPUB Publication (i.e., the primary one to present to users). This specification does not define how to process additional title elements.

https://www.w3.org/publishing/epub3/epub-packages.html#sec-opf-dctitle

They use this example:

    <dc:title>THE LORD OF THE RINGS</dc:title>
    <dc:title>Part One: The Fellowship of the Ring</dc:title>

I'm wondering if we should keep them separated also in HTML? For instance like this:

    <title>THE LORD OF THE RINGS</title>
    <meta name="dc:title">Part One: The Fellowship of the Ring</meta>

So: the first OPF <dc:title> maps to HTML <title>, and all following OPF <dc:title> elements map to HTML <meta name="dc:title">.

What do you think @martinpub @AndersEkl @kalaspuffar?

In the future we could also consider using the name of the chapter (or whatever is the first structural item in the document) as <title>, and then just mapping all OPF <dc:title> elements to HTML <meta name="dc:title">, because there are some accessibility concerns:

A common navigation technique for users of assistive technology is to read the page title and infer the content the page contains. This is because navigating into a page to determine its content can be a time-consuming and potentially confusing process. Titles should be unique to every page of a website, ideally surfacing the primary purpose of the page first, followed by the name of the website. Following this pattern will help ensure that the primary purpose of the page is announced by a screen reader first. This provides a far better experience than having to listen to the name of a website before the unique page title, for every page a user navigates to in the same website.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/title#accessibility_concerns

kalaspuffar · 2023-05-03T06:07:30Z

Hi @josteinaj

Yes, I think this could be an improvement for the future. Looking at the current specification:

2.5.2 Title
The <title> element of every xhtml content document must match the dc:title metadata of the package file.

The information is sparse, so the implementation in this PR should at least follow that directive. Creating an issue, either in this repository or for future development in the format repository, could be good. But without a change to the specification, I think the implementation should follow the specification as currently written.

It could be interpreted a couple of ways, though. Either we do as this PR, or we just join them with a single space, or we just pick the first dc:title element to match with the other documents.
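
To make the three readings concrete, here is a minimal Python sketch (not the Schematron used by the validator) that computes the expected <title> under each one. The function names, the strategy labels, and the sample package are illustrative assumptions; the two titles are taken from the EPUB specification example quoted earlier in this thread.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# Hypothetical package metadata built from the two-title example in the EPUB specification.
PACKAGE = f"""<package xmlns="{OPF}" xmlns:dc="{DC}">
  <metadata>
    <dc:title>THE LORD OF THE RINGS</dc:title>
    <dc:title>Part One: The Fellowship of the Ring</dc:title>
  </metadata>
</package>"""


def titles(package_xml):
    """Return the text of every dc:title without a refines attribute, in document order."""
    root = ET.fromstring(package_xml)
    return [
        (t.text or "").strip()
        for t in root.iter(f"{{{DC}}}title")
        if t.get("refines") is None
    ]


def expected_title(package_xml, strategy):
    """Compute the title a content document would have to match under one reading."""
    parts = titles(package_xml)
    if strategy == "dash":    # join with " - ", as this PR originally did
        return " - ".join(parts)
    if strategy == "space":   # join with a single space
        return " ".join(parts)
    if strategy == "first":   # use only the first dc:title
        return parts[0] if parts else ""
    raise ValueError(f"unknown strategy: {strategy}")


for name in ("dash", "space", "first"):
    print(name, "->", expected_title(PACKAGE, name))
```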

Best regards
Daniel

josteinaj · 2023-05-03T06:32:00Z

In the EPUB specification with the multi-dc:title example, they join with a comma. In other cases it might make more sense with a colon or a hyphen. What we choose should be specified to the producers, but it is not described in the nordic specification. So I think it would make the most sense to just use the first dc:title, instead of joining them. But I think @martinpub needs to decide.

karladamt · 2023-05-05T13:50:06Z

Martin has recently changed position here at MTM, so I don't know if he will do this anymore. Until it is settled here who will continue this work, MTM, through me, decides that we follow Jostein's suggestion and use only the first dc:title.

kalaspuffar · 2023-05-08T06:00:16Z

Hi @josteinaj

I've now changed the code to only validate the first (main) title against the content documents.
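
The idea can be sketched in Python as follows. This is only an illustration: the actual rule is the Schematron in nordic2020-1.opf-and-html.sch, the helper names and sample documents are hypothetical, and comparing the strings exactly after trimming whitespace is an assumption.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"
XHTML = "http://www.w3.org/1999/xhtml"


def main_title(package_xml):
    """Return the text of the first dc:title in document order, or None if there is none."""
    first = ET.fromstring(package_xml).find(f".//{{{DC}}}title")
    return None if first is None else (first.text or "").strip()


def html_title(xhtml):
    """Return the text of the first <title> element of a content document, or None."""
    title = ET.fromstring(xhtml).find(f".//{{{XHTML}}}title")
    return None if title is None else (title.text or "").strip()


def mismatched_documents(package_xml, documents):
    """Return the names of content documents whose <title> differs from the main title."""
    expected = main_title(package_xml)
    return [name for name, doc in documents.items() if html_title(doc) != expected]


# Hypothetical publication: one matching and one non-matching content document.
PACKAGE = f"""<package xmlns="{OPF}" xmlns:dc="{DC}">
  <metadata>
    <dc:title>THE LORD OF THE RINGS</dc:title>
    <dc:title>Part One: The Fellowship of the Ring</dc:title>
  </metadata>
</package>"""

DOCS = {
    "ch1.xhtml": f'<html xmlns="{XHTML}"><head><title>THE LORD OF THE RINGS</title></head><body/></html>',
    "ch2.xhtml": f'<html xmlns="{XHTML}"><head><title>Chapter 2</title></head><body/></html>',
}

print(mismatched_documents(PACKAGE, DOCS))
```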

Best regards
Daniel

martinpub · 2023-05-08T06:34:47Z

Martin has recently changed position here at MTM, so I don't know if he will do this anymore. Until it is settled here who will continue this work, MTM, through me, decides that we follow Jostein's suggestion and use only the first dc:title.

Hi, just wanted to say I'm still monitoring this repo, just haven't had time yet to reply. I hope to be involved in further guidelines work, but perhaps not as much when it comes to implementing validation rules. However, feel free to ask/invoke me in discussion if you feel like it. I will be happy to share my thoughts.

Regarding the specific issue at hand, I think the current solution, suggested by @josteinaj and implemented by @kalaspuffar, is a reasonable one. We should note @josteinaj's remarks for guidelines revision work.

josteinaj · 2023-05-08T12:01:47Z

src/main/resources/xml/schema/2020-1/nordic2020-1.opf-and-html.sch

@@ -170,7 +170,7 @@
        <p>The HTML title element must be the same as the OPF publication dc:title</p>
        <rule context="html:title">
            <let name="context" value="concat('(&lt;', name(), string-join(for $a in (@*) return concat(' ', $a/name(), '=&quot;', $a, '&quot;'), ''), '&gt;)')"/>
-            <let name="fulltitle" value="string-join(/*/opf:package/opf:metadata/dc:title[not(@refines)]/text(), ' - ')"/>
+            <let name="fulltitle" value="//opf:package/opf:metadata/dc:title[1]/text()"/>


I think you should still keep the not(@refines). It would be rare for the first dc:title to have a refines attribute, but I think we should still check for it. So: dc:title[not(@refines) and position()=1].
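
For readers less used to XPath predicates, the Python sketch below models how three variants behave when the first dc:title does carry a refines attribute. The sample metadata is contrived for illustration and the function names are hypothetical; the third variant, dc:title[not(@refines)][1], is not proposed in this thread and is included only for contrast.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# Contrived metadata (an assumption for illustration): the first dc:title carries @refines.
PACKAGE = f"""<package xmlns="{OPF}" xmlns:dc="{DC}">
  <metadata>
    <dc:title refines="#x">Subtitle</dc:title>
    <dc:title>Main title</dc:title>
  </metadata>
</package>"""


def dc_titles(package_xml):
    """Return all dc:title children of opf:metadata, in document order."""
    root = ET.fromstring(package_xml)
    return root.findall(f"{{{OPF}}}metadata/{{{DC}}}title")


def text_of(element):
    return None if element is None else (element.text or "").strip()


def first_any(titles):
    # Models dc:title[1]: the first title, whether or not it has @refines.
    return text_of(titles[0] if titles else None)


def first_if_unrefined(titles):
    # Models dc:title[not(@refines) and position()=1]:
    # the first title, but only if it has no @refines; otherwise nothing.
    first = titles[0] if titles else None
    if first is not None and first.get("refines") is None:
        return text_of(first)
    return None


def first_unrefined(titles):
    # Models dc:title[not(@refines)][1]: the first title among those without @refines.
    unrefined = [t for t in titles if t.get("refines") is None]
    return text_of(unrefined[0] if unrefined else None)


found = dc_titles(PACKAGE)
print(first_any(found), first_if_unrefined(found), first_unrefined(found))
```

With the suggested predicate, a refined first dc:title yields no title at all rather than falling through to the next one. Which behaviour is preferable is a design choice that this thread does not settle.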

kalaspuffar · 2023-05-08

Hi @josteinaj

Seems reasonable. I agree that it's unlikely, but it's worth checking for completeness.

Changed the rule to the check you suggested.

Best regards
Daniel

kalaspuffar added 2 commits April 26, 2023 21:13

Initial commit.

b043251

Fixing testcase to conform to the 2020-1 guideline for package to con…

5bc4044

…tent verification.

kalaspuffar requested review from josteinaj and martinpub April 27, 2023 07:47

kalaspuffar added 2 commits May 8, 2023 07:52

Only validate the first main title against other content documents.

4b9ab39

Small indent fix.

5020657

kalaspuffar removed the request for review from martinpub May 8, 2023 05:59

josteinaj requested changes May 8, 2023

kalaspuffar added 2 commits May 8, 2023 14:35

Changed to check for refined as well.

27aff4f

Get package file.

927ab80

josteinaj approved these changes May 9, 2023

josteinaj merged commit 41042cc into nlbdev:master May 9, 2023
