Try out alternate approach to article text output #329

Closed
wants to merge 2 commits into from

Conversation

rlskoeser
Contributor

@rlskoeser rlskoeser commented Sep 5, 2023

ref #316

Preliminary work to explore an alternate approach for generating the plain-text version of articles.

I've added .txt templates for our figure and pull quote shortcodes, and updated the Underwood article to use them. As you can see, we no longer have to duplicate the plain-text version of the figure in the article source; we can generate it from the template.

If we go with this approach, we'd have to rewrite or add regexes to run against html instead of markdown (unless we can find a better approach than that). From my experimentation, I think this approach might also solve the character bug in #303.

Please review the new text templates, the corresponding changes to the Underwood article, and the text version of that article on the Render PR site and provide feedback on whether you think this direction is worth pursuing.

@render

render bot commented Sep 5, 2023

@rlskoeser rlskoeser marked this pull request as draft September 5, 2023 20:48
@rlskoeser rlskoeser requested a review from gwijthoff September 5, 2023 20:59
Contributor

@gwijthoff gwijthoff left a comment

Brilliant solution to replacing those janky regexes! I love having the pull quotes in the .txt output. And it would be great if this approach solves the #303 character bug. When you say "rewrite or add regexes to run against html instead of markdown," do you mean find a way to translate the <i>html</i> to *markdown* output in the .txt file? We could strip out all html markup, but I feel like it's important to maintain the markdown in the .txt since it's both human readable and machine executable, and keeps intact all of the authorial intent in the piece.

@rlskoeser
Contributor Author

@gwijthoff yes, was thinking we'd need to write new regexes to convert the html to whatever we want the text to look like — similar to what we do now, converting markdown footnotes into [NOTE 2]. I wish there was a better option than regexes, but at least maybe text versions of shortcodes will improve things some.
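
For illustration only, here is a minimal Python sketch of what regexes run against HTML might look like. It is not the project's code: the function name `html_to_text`, the patterns, and the footnote markup (a `<sup>` wrapping a link to `#fn:N`) are all assumptions, and the HTML the site actually renders may differ.

```python
import re
from html import unescape

# Hypothetical patterns; adjust to the HTML the site actually produces.
# Footnote reference, assumed to look like:
#   <sup id="fnref:2"><a href="#fn:2" class="footnote-ref">2</a></sup>
FOOTNOTE_REF = re.compile(
    r'<sup[^>]*>\s*<a[^>]*href="#fn:(\d+)"[^>]*>.*?</a>\s*</sup>', re.DOTALL
)
STRONG = re.compile(r"<(b|strong)>(.*?)</\1>", re.DOTALL)
EMPHASIS = re.compile(r"<(i|em)>(.*?)</\1>", re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>")


def html_to_text(html: str) -> str:
    """Convert a fragment of HTML to text, keeping markdown-style emphasis."""
    text = FOOTNOTE_REF.sub(r"[NOTE \1]", html)  # footnote refs -> [NOTE n]
    text = STRONG.sub(r"**\2**", text)           # bold -> **markdown**
    text = EMPHASIS.sub(r"*\2*", text)           # italics -> *markdown*
    text = ANY_TAG.sub("", text)                 # drop any remaining tags
    return unescape(text)                        # decode entities like &amp;
```

This keeps emphasis as markdown rather than stripping it, which matches the preference expressed above. Regexes remain fragile against nested or attribute-laden markup, so a real HTML parser may be a sturdier option if the patterns grow.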

@gwijthoff
Contributor

@rlskoeser I agree that this would be a marginal improvement over our current TXT output and is worth pursuing. Not a must-have for Issue 4 in case your capacity is tight next week, but it would definitely be great.

@gwijthoff gwijthoff added this to the Issue 4 milestone Sep 12, 2023
@rlskoeser rlskoeser removed this from the Issue 4 milestone Sep 18, 2023
@rlskoeser
Contributor Author

closing this PR for now since we don't have capacity to investigate further; will keep the branch and the issue in the icebox to track the initial work

@rlskoeser rlskoeser closed this Sep 18, 2023