Add HTML title, lets you add <i> and <b> and a couple other tags to titles. #166

Closed
rosiel wants to merge 3 commits


@rosiel rosiel commented Aug 13, 2024

This PR adds the HTML title module, which allows a short list of HTML tags in node titles.

Allowed Tags

<br><sup><sub><i><b>
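
As a rough illustration of what an allowlist like this means in practice, here is a minimal Python sketch. It is not the module's code (the module is a Drupal module, and its internals are not shown in this PR); `filter_title` and the regular expression are hypothetical names made up for the example, and a regex filter like this is only a demonstration, not a substitute for a real HTML sanitizer.

```python
import re

# Allowlist taken from the PR description; everything else here is illustrative.
ALLOWED_TAGS = {"br", "sup", "sub", "i", "b"}

# Matches an opening or closing tag; captures the optional slash and the tag name.
TAG_RE = re.compile(r"<(/?)\s*([a-zA-Z][a-zA-Z0-9]*)[^>]*>")


def filter_title(title):
    """Keep allowlisted tags (without attributes), drop all other tags, keep the text."""
    def keep_or_drop(match):
        slash, name = match.group(1), match.group(2).lower()
        # Re-emit allowed tags in bare form so attributes never survive.
        return f"<{slash}{name}>" if name in ALLOWED_TAGS else ""
    return TAG_RE.sub(keep_or_drop, title)


print(filter_title('<i class="x">Homo sapiens</i> in <em>context</em>, H<sub>2</sub>O'))
# prints: <i>Homo sapiens</i> in context, H<sub>2</sub>O
```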

Why <i> not <em>?

From the HTML reference:

The i element represents a span of text offset from its surrounding content without conveying any extra emphasis or importance, and for which the conventional typographic presentation is italic text; for example, a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, or a ship name.

My main use case for this improvement is taxonomic names, a frequent occurrence in theses.

Why <b> not <strong>?

Also from the HTML reference:

The b element represents a span of text offset from its surrounding content without conveying any extra emphasis or importance, and for which the conventional typographic presentation is bold text; for example, keywords in a document abstract, or product names in a review.

Reviews of books are commonly held in repositories.

Why <sup> and <sub>?

Math and chemistry markup. However, we're not going as far as adding MathJax to titles (that'll be a recipe, since it's a little more niche).

Why <br>?

I'm not sure; it came enabled. I think it's the most problematic one, because the content editor has to remember to put a space on one side of the tag. Otherwise, when the tag gets stripped (as it is in several places, such as the tab title), you'll have words abutted against each other. It also does a poor job of signalling a subtitle, since the font treatment is still h1. I'd be happy to take it out.
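
To make the abutted-words problem concrete, here is a small Python sketch. It assumes a naive tag stripper; the actual stripping in Drupal may behave differently, and `strip_tags` / `strip_tags_keep_break` are hypothetical helpers written only for this example. The second function shows one possible mitigation (turning `<br>` into a space before stripping), not something this PR implements.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")
# A <br> (or <br/>) together with any whitespace around it.
BR_RE = re.compile(r"\s*<br\s*/?>\s*", re.IGNORECASE)


def strip_tags(title):
    """Naive stripping: remove every tag and leave the text untouched."""
    return TAG_RE.sub("", title)


def strip_tags_keep_break(title):
    """Replace each <br> with a single space first, then strip the remaining tags."""
    return strip_tags(BR_RE.sub(" ", title)).strip()


title = "Whales<br>A natural history"
print(strip_tags(title))             # prints: WhalesA natural history
print(strip_tags_keep_break(title))  # prints: Whales A natural history
```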

How it works across the repository


Node page

Title displays in h1 with all its marked up glory, rendered nicely.


Views

Content views: The title displays with its HTML rendered nicely, unless you strip it out with "Strip HTML tags".

Search API views: The markup is stored in Solr. With the default "plain text" field render formatter, the raw markup is visible to the user and "Strip HTML tags" does not work. With the alternate "HTML title text" field render formatter, the markup renders nicely in the output, encodes into XML properly (see OAI-PMH) and "Strip HTML tags" works.

JSON-LD

The markup appears as raw, unescaped tags in the JSON-LD. I think this is okay, as the JSON-LD spec only mentions escaping HTML entities in a section specifically about embedding JSON-LD into a <script> tag in HTML. Outside the HTML context, unescaped markup should be fine.
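
A quick Python sketch of the distinction, using a hypothetical one-field record rather than the repository's real JSON-LD output: tags are legal inside a JSON string as-is, and escaping `<` only becomes relevant if the JSON is placed inside a `<script>` element in an HTML page.

```python
import json

# Hypothetical record; the field name is for illustration only.
record = {"name": "The genome of <i>Zea mays</i>"}

# Standalone JSON: json.dumps leaves "<" and ">" untouched, and that is valid JSON.
standalone = json.dumps(record)
print(standalone)  # prints: {"name": "The genome of <i>Zea mays</i>"}

# For embedding in an HTML <script> element, "<" can be written as the
# JSON escape \u003c so the HTML parser never sees a tag-like sequence.
embedded = standalone.replace("<", "\\u003c")

# Both forms decode to exactly the same data.
assert json.loads(standalone) == record
assert json.loads(embedded) == record
```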


OAI-PMH

DC: The HTML is present in the XML. In a browser you see <i> and in a text editor you see &lt;i&gt;. This is correct for XML. I can't find information on whether it's OK for Dublin Core. PKP does not allow title italics; Omeka does. I assume it's fine. However, I think we would need to add a "strip tags" feature in code if we wanted to remove tags here.

MODS: With "plain text" as the field formatter, the HTML is doubly encoded into the XML. In a browser you see &lt; and in the XML in a text editor you see &amp;lt;. No good!


Leaving the markup in the OAI-PMH output assumes harvesters can accept HTML in titles.

Otherwise, we can set the field formatter to "HTML title text" AND enable "Strip HTML tags" to have an HTML-less OAI experience.
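
The difference between the single encoding seen in DC and the double encoding seen in MODS can be reproduced with Python's standard library. This is only a sketch of the XML escaping behaviour; the `<title>` element and the sample title are made up for the example and do not reflect the actual OAI-PMH templates.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

title = "The genome of <i>Zea mays</i>"

once = escape(title)   # what a text editor shows when the title is encoded once
twice = escape(once)   # what it shows when already-escaped text is escaped again

print(once)   # prints: The genome of &lt;i&gt;Zea mays&lt;/i&gt;
print(twice)  # prints: The genome of &amp;lt;i&amp;gt;Zea mays&amp;lt;/i&amp;gt;

# An XML parser undoes exactly one level of escaping.
single = ET.fromstring(f"<title>{once}</title>").text
double = ET.fromstring(f"<title>{twice}</title>").text

assert single == title  # consumer recovers the original markup
assert double == once   # consumer is left with literal &lt;i&gt; text
```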

Sorting

You'll probably have a bad time if you're trying to sort on title and some titles start with markup, since the leading tag would count as part of the value being sorted.
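
Here is a small Python sketch of why leading markup interferes with sorting, and of one possible workaround (sorting on a tag-stripped copy of the title). It assumes a plain character-by-character comparison; how Drupal or Solr actually sort titles is not covered in this PR, and `sort_key` is a hypothetical helper.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")


def sort_key(title):
    """Sort on the visible text only: tags removed, case folded."""
    return TAG_RE.sub("", title).strip().casefold()


titles = [
    "Apple breeding",
    "<i>Zea mays</i> genetics",
    "Maize diseases",
]

# Raw comparison: "<" sorts before every letter, so the marked-up title jumps to the front.
print(sorted(titles))
# prints: ['<i>Zea mays</i> genetics', 'Apple breeding', 'Maize diseases']

# Comparing on stripped text puts it where a reader would expect.
print(sorted(titles, key=sort_key))
# prints: ['Apple breeding', 'Maize diseases', '<i>Zea mays</i> genetics']
```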


rosiel commented Aug 14, 2024

I made this an omnibus PR. Now this also:


rosiel commented Aug 26, 2024

We've split this out into #168 and #169. I'll make a separate PR for the HTML title field rather than try to tease this PR apart.

@rosiel rosiel closed this Aug 26, 2024