Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faulty parsing of \ in link #139

Open
garazdawi opened this issue Nov 3, 2023 · 6 comments
Open

Faulty parsing of \ in link #139

garazdawi opened this issue Nov 3, 2023 · 6 comments
Assignees
Labels
bug Something isn't working under investigation

Comments

@garazdawi
Copy link
Contributor

It would seem like \ is interpreted incorrect/strange when in a link target. For example

> EarmarkParser.as_ast ~S"[foo](\foo)"
{:ok, [{"p", [], [{"a", [{"href", "foo"}], ["foo"], %{}}, ")"], %{}}], []}

I would expect the \ to be part of the href (possibly uri encoded) and the ) should not be visible anywhere. babelmark seems to agree with this.

I noticed that sending two \\ more does what I expect, but the ) is still part of the result when it should not be:

> EarmarkParser.as_ast ~S"[foo](\\foo)"
{:ok, [{"p", [], [{"a", [{"href", "\\foo"}], ["foo"], %{}}, ")"], %{}}], []

You may ask, why would anyone want to put a \ in a link? When improving support for Erlang in ExDoc I need to be able to create links to the c module. Naturally an mfa link to that would looks like this c:ls/0, which ExDoc interprets as a callback. So I was about to add the possibility to escape the c:, so that it is interpreted as a mfa and the natural escape character for me is [shell ls() function](`\c:ls/0`), when I stumbled on this bug.

@RobertDober
Copy link
Owner

RobertDober commented Nov 3, 2023

Thank you for this report

  1. Earmark has always adapted the policy that \ escapes the next character regardless of if iy needs to be escaped. EarmarkParser has therefore continued this policy.

    As this cannot be changed without braking backwards compatibility this will not be changed soon.

  2. Concerning the remaining ")" I will investigate, however I am not capable of working right now, so sorry if this is a problem for you which I infer from your rationale

    And of course \should be alllowed in links.

@garazdawi
Copy link
Contributor Author

Earmark has always adapted the policy that \ escapes the next character regardless of if iy needs to be escaped. EarmarkParser has therefore continued this policy.

As this cannot be changed without braking backwards compatibility this will not be changed soon.

Is there any documentation describing this? As it is different than all other Markdown implementation I've come across (including Gruber markdown) I think it would be good to have it documented. I don't mind adding it if it does not exist.

As an aside, we have the same behaviour in Erlang but (IIRC) recently started warning for such places where the escape is not necessary and eventually plan to remove it altogether. It will take a couple of years until we can change the behaviour, but I think Erlang will become a tiny tiny bit better for it.

Concerning the remaining ")" I will investigate, however I am not capable of working right now, so sorry if this is a problem for you which I infer from your rationale

And of course \ should be alllowed in links.

The most common usecase for escaping c: is when it is an autolink in ExDoc (that is a bare `\c:ls()`) and in there \ is handled as it should be. Having it is a link target is mostly for consistency as ExDoc allows the same syntax for autolinks and link targets.

If this becomes an real problem I can spend some time and see if I can figure out where the bug is.

@RobertDober
Copy link
Owner

Is there any documentation describing this? As it is different than all other Markdown implementation I've come across (including Gruber markdown) I think it would be good to have it documented. I don't mind adding it if it does not exist.

That would be great indeed

It is of course possible to deprecate \f as not necessary and warning that it will be interpreted differently in 1.5.
I am a little bit surprised that this is the desired behaviour does not seem unix like, and makes things more complicated, but I will follow the consenus happily.

@RobertDober
Copy link
Owner

I'll definitely leave this open and will tackle it as soon (but it will not be soon :( ) as possible.
PR for the doc welcome if possible. ❤️

@garazdawi
Copy link
Contributor Author

I was in the process of writing some docs describing this when I noticed that it seems like it is only within a link target that \ escapes anything.

> EarmarkParser.as_ast(~S"\c")
{:ok, [{"p", [], ["\\c"], %{}}], []}

From your description above I would have assumed it to be escaped everywhere? From my small experiments it would seem that Earmark treats \ as it should in most cases except in link targets (and possibly other places that I have not yet found).

@RobertDober
Copy link
Owner

Sorry for the late answer (triggered by another issue filed by you, ty) I have not noticed that yet.

My appologies for the incorrect answer, it is not my style to cite facts from memory (a common fellony) but I did.

On the bright side the not escaping case you showed above is the correct behavior 😊 .
Also your original idea of what should be seems correct.

I'll try not to forget this issue this time

@RobertDober RobertDober self-assigned this Apr 24, 2024
@RobertDober RobertDober added bug Something isn't working under investigation labels Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working under investigation
Projects
None yet
Development

No branches or pull requests

2 participants