Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add markdown printer #265

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

yasunariw
Copy link

@yasunariw yasunariw commented Mar 10, 2022

See #231.

This PR adds:

  1. A printer that transforms the AST back into Markdown text (Omd.to_markdown)
  2. A CLI flag --mode to use Markdown instead of HTML as the output mode.

The printer that existed in the 1.3.1 version was removed, so this PR is a rewrite for the 2.0 AST. The add_string_escape_chars function is reused from the previous version by @pw374 .

So far, its behavior has been tested against manually constructed examples that cover all parts of the syntax. Please let me know if there is a robust way to add tests to the current setup. Thanks!

@yasunariw yasunariw force-pushed the ahrefs/yasu/render-markdown branch from e60a510 to 5adf485 Compare March 10, 2022 14:15
@patricoferris
Copy link
Contributor

Cool feature @yasunariw !

I worked on something similar and you can see my findings here: #223 (comment). I think the conversion from string -> Omd.doc losses information so it is quite tricky to implement a to_markdown function. For example, there are some corner cases such as:

utop # Omd.of_string "_*hello*_" |> Omd.to_markdown;;
- : string = "**hello**\n"

Note this is after pinning e60a510a54da4c3ed63da34fe6a35f6287570989. Here, it is valid markdown to nest emphasis using the two different characters, but after using the Omd.to_markdown function it has been transformed into strong text.

What might be useful is the (very hacked together) testsuite I was running here: https://github.com/patricoferris/omd/tree/omd-print/tests. The idea was to parse the markdown normally and get the HTML, then use the to_markdown function and parse that output to HTML and compare the two HTML outputs. It is probably not full-proof but it was how I started finding these edge cases.

@yasunariw
Copy link
Author

Thanks for all the pointers @patricoferris ! I'll see what else I catch using your test method and report back.

@shonfeder
Copy link
Collaborator

shonfeder commented May 7, 2022

Thanks for the work on this important feature, @yasunariw and @patricoferris.

Please let me know if there is a robust way to add tests to the current setup. Thanks!

The most robust way I can think of is via some property based testing. If we had generator that constructed valid arbitrary Omd.doc values, then we could check that forall x : Omd.doc, (x |> Omd.to_markdown |> Omd.of_string) = x. This should catch the kind of edge case that @patricoferris has pointed out, iiuc.

My inclination is to suggest that we should expect any mergable solution to this feature to depend on the solution implemented for #223.

Comment on lines +289 to +290
let n = String.length c in
if n > 0 && c.[String.length c - 1] <> '\n' then begin
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let n = String.length c in
if n > 0 && c.[String.length c - 1] <> '\n' then begin
let n' = String.length c in
if n' > 0 && c.[n'- 1] <> '\n' then begin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise the nchar n buf bellow adds too much backticks when closing the code block.

samoht added a commit to samoht/toc that referenced this pull request Nov 11, 2022
Need to remove the patches once it is merged and released
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants