Fix endnotes issue #561

Merged
josteinaj merged 6 commits into nlbdev:master
May 9, 2023

kalaspuffar
Collaborator

Hi @josteinaj and @martinpub

This PR tries to solve the issue of the endnote role not being allowed. This is described in more detail in #556.

I made two changes to the rules.

  • Endnote roles are no longer used to find backlinks when verifying that all backlinks are correct. The endnote epub:type is used instead.
  • dc:title metadata elements are concatenated with string-join using the separator " - ", so each part of the title is separated by a space, a dash, and a space. The result is then checked against the title of each content document.
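
As a rough illustration of the first change, selecting endnotes by their epub:type rather than by role could look like the sketch below. This is not the Schematron rule from the PR; the sample markup, the `endnote_ids` helper, and the use of `role="doc-endnote"` in the sample are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

EPUB_NS = "http://www.idpf.org/2007/ops"

# Hypothetical sample: the second note carries no role attribute at all.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <ol>
      <li id="n1" epub:type="endnote" role="doc-endnote">First note</li>
      <li id="n2" epub:type="endnote">Second note</li>
      <li id="x1">Not a note</li>
    </ol>
  </body>
</html>"""


def endnote_ids(xhtml_text):
    """Return ids of elements whose epub:type token list contains 'endnote'."""
    root = ET.fromstring(xhtml_text)
    attr = "{%s}type" % EPUB_NS
    return [
        el.get("id")
        for el in root.iter()
        # epub:type is a whitespace-separated token list, so split before testing.
        if "endnote" in el.get(attr, "").split()
    ]


print(endnote_ids(SAMPLE))
```

Because the selection depends only on epub:type, a note is found whether or not it also has a role attribute.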

As you see, there are many changes, but the main thing here is that I created a test to verify the validity of our test case and found many issues.

  • content document titles did not match the package title
  • ids were not unique
  • some page breaks used title instead of aria-label
  • referenced notes were missing
  • dc:identifier in the content documents did not match the package file's dc:identifier
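
One of these checks, id uniqueness, is easy to sketch outside the validator. The snippet below is only an illustration under the assumption that uniqueness is checked per content document; the `duplicate_ids` helper and the sample markup are hypothetical and not taken from the PR.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical content document with one repeated id.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p id="p1">One</p>
    <p id="p2">Two</p>
    <p id="p1">Duplicate of the first id</p>
  </body>
</html>"""


def duplicate_ids(xhtml_text):
    """Return the sorted id values that occur more than once in one document."""
    root = ET.fromstring(xhtml_text)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


print(duplicate_ids(SAMPLE))
```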

I'm not sure reviewing all the test changes is worthwhile, but you might have input on the minor rule changes I made.

Best regards
Daniel

@josteinaj
Member

Thanks @kalaspuffar!

I'm only wondering about the dc:title element in the OPF.

From the specification:

Reading Systems MUST recognize the first title element in document order as the main title of the EPUB Publication (i.e., the primary one to present to users). This specification does not define how to process additional title elements.

https://www.w3.org/publishing/epub3/epub-packages.html#sec-opf-dctitle

They use this example:

    <dc:title>THE LORD OF THE RINGS</dc:title>
    <dc:title>Part One: The Fellowship of the Ring</dc:title>

I'm wondering if we should keep them separated also in HTML? For instance like this:

    <title>THE LORD OF THE RINGS</title>
    <meta name="dc:title">Part One: The Fellowship of the Ring</meta>

So: the first OPF <dc:title> maps to HTML <title>, and all the following OPF <dc:title> elements map to HTML <meta name="dc:title">.

What do you think @martinpub @AndersEkl @kalaspuffar?

In the future we could also consider putting the name of the chapter (or whatever is the first structural item in the document) as <title>, and then just mapping all OPF <dc:title> elements to HTML <meta name="dc:title">, because there are some accessibility concerns:

A common navigation technique for users of assistive technology is to read the page title and infer the content the page contains. This is because navigating into a page to determine its content can be a time-consuming and potentially confusing process. Titles should be unique to every page of a website, ideally surfacing the primary purpose of the page first, followed by the name of the website. Following this pattern will help ensure that the primary purpose of the page is announced by a screen reader first. This provides a far better experience than having to listen to the name of a website before the unique page title, for every page a user navigates to in the same website.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/title#accessibility_concerns

@kalaspuffar
Collaborator Author

Hi @josteinaj

Yes, I think this could be an improvement for the future. Looking at the current specification:

2.5.2 Title
The <title> element of every xhtml content document must match the dc:title metadata of the package file.

The information is sparse, so the implementation in this PR should at least follow that directive. Creating an issue in this repository, or in the format repository for future development, could be good. But without a change to the specification, I think the implementation should follow the specification as currently written.

It could be interpreted in a couple of ways, though: either we do as this PR does, or we join them with a single space, or we pick only the first dc:title element to match against the content documents.
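
To make the three readings concrete, here is a minimal Python sketch that computes the expected <title> under each one, using the two-title example quoted earlier in this conversation. It is not the Schematron rule itself; the `expected_title` helper and the strategy names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Package file reduced to the two dc:title elements from the EPUB example.
OPF = """<package xmlns="http://www.idpf.org/2007/opf">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>THE LORD OF THE RINGS</dc:title>
    <dc:title>Part One: The Fellowship of the Ring</dc:title>
  </metadata>
</package>"""


def expected_title(opf_text, strategy):
    """Return the title a content document would have to match."""
    root = ET.fromstring(opf_text)
    # Skip dc:title elements carrying a refines attribute, as the rule does.
    parts = [
        t.text or ""
        for t in root.findall("opf:metadata/dc:title", NS)
        if t.get("refines") is None
    ]
    if strategy == "join-dash":    # what this PR first implemented
        return " - ".join(parts)
    if strategy == "join-space":   # join with a single space
        return " ".join(parts)
    if strategy == "first":        # only the first dc:title
        return parts[0] if parts else ""
    raise ValueError("unknown strategy: " + strategy)


for name in ("join-dash", "join-space", "first"):
    print(name, "->", expected_title(OPF, name))
```

For a package with a single dc:title, all three strategies give the same result, so the choice only matters for multi-title publications.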

Best regards
Daniel

@josteinaj
Member

In the EPUB specification with the multi-dc:title example, they join with a comma. In other cases it might make more sense with a colon or a hyphen. What we choose should be specified to the producers, but it is not described in the nordic specification. So I think it would make the most sense to just use the first dc:title, instead of joining them. But I think @martinpub needs to decide.

@karladamt

Martin has recently changed position here at MTM, so I don't know if he will do this anymore. Until it is settled here who will continue this work, MTM, through me, decides that we follow Jostein's suggestion and only use the first dc:title.

@kalaspuffar kalaspuffar removed the request for review from martinpub May 8, 2023 05:59
@kalaspuffar
Collaborator Author

Hi @josteinaj

I've now changed the code to only validate the first (main) title against the content documents.

Best regards
Daniel

@martinpub
Collaborator

Martin has recently changed position here at MTM, so I don't know if he will do this anymore. Until it is settled here who will continue this work, MTM, through me, decides that we follow Jostein's suggestion and only use the first dc:title.

Hi, just wanted to say I'm still monitoring this repo, just haven't had time yet to reply. I hope to be involved in further guidelines work, but perhaps not as much when it comes to implementing validation rules. However, feel free to ask/invoke me in discussion if you feel like it. I will be happy to share my thoughts.

Regarding the specific issue at hand, I think the current solution, suggested by @josteinaj and implemented by @kalaspuffar, is a reasonable one. We should note @josteinaj's remarks for guidelines revision work.

@@ -170,7 +170,7 @@
<p>The HTML title element must be the same as the OPF publication dc:title</p>
<rule context="html:title">
<let name="context" value="concat('(&lt;', name(), string-join(for $a in (@*) return concat(' ', $a/name(), '=&quot;', $a, '&quot;'), ''), '&gt;)')"/>
-<let name="fulltitle" value="string-join(/*/opf:package/opf:metadata/dc:title[not(@refines)]/text(), ' - ')"/>
+<let name="fulltitle" value="//opf:package/opf:metadata/dc:title[1]/text()"/>
Member

I think you still should have the not(@refines). It would be rare that the first dc:title has a refines-attribute, but I think we should still check for it. So dc:title[not(@refines) and position()=1].
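
For reference, the suggested predicate can be emulated in plain Python to show what it selects. The sketch below also compares it with an alternative form, dc:title[not(@refines)][1], which is not proposed anywhere in this PR and is included only for contrast. The sample package file and both helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical edge case: the first dc:title carries a refines attribute.
OPF = """<package xmlns="http://www.idpf.org/2007/opf">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title refines="#other">Refining title</dc:title>
    <dc:title>Main title</dc:title>
  </metadata>
</package>"""


def all_titles(opf_text):
    root = ET.fromstring(opf_text)
    return root.findall("opf:metadata/dc:title", NS)


def first_if_unrefined(opf_text):
    """Emulates dc:title[not(@refines) and position()=1]."""
    return [
        t.text
        for pos, t in enumerate(all_titles(opf_text), start=1)
        if t.get("refines") is None and pos == 1
    ]


def first_unrefined(opf_text):
    """Emulates dc:title[not(@refines)][1] (alternative, for comparison only)."""
    unrefined = [t.text for t in all_titles(opf_text) if t.get("refines") is None]
    return unrefined[:1]


print(first_if_unrefined(OPF))
print(first_unrefined(OPF))
```

With the suggested predicate, a first dc:title that has a refines attribute yields an empty selection, whereas the alternative form would fall through to the next dc:title without one. When the first dc:title has no refines attribute, both forms select the same element.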

Collaborator Author

Hi @josteinaj

Seems reasonable; I agree it's unlikely, but it's worth checking for completeness.

Changed the rule to the check you suggested.

Best regards
Daniel

@josteinaj josteinaj merged commit 41042cc into nlbdev:master May 9, 2023