upgrade markdown package reference #2364

jezell · 2024-10-04T23:40:45Z

This upgrades the markdown package to 7.2.1, also fixes an issue where encodeHtml is not passed to the markdown parser.

matthew-carroll · 2024-10-05T00:13:13Z

super_editor_markdown/lib/src/image_syntax.dart

@@ -38,7 +38,8 @@ class SuperEditorImageSyntax extends md.LinkSyntax {
    if (parser.pos + 1 >= parser.source.length) {
      // The `]` is at the end of the document, but this may still be a valid
      // shortcut reference link.
-      return _tryCreateReferenceLink(parser, text, getChildren: getChildren);
+      final link = _tryCreateReferenceLink(parser, text, getChildren: getChildren);


Looks like all uses of _tryCreateReferenceLink() gets converted to a list. Should we just change that method to return Iterable<md.Node>? and then continue to return it directly?

Good catch. That would be better.

matthew-carroll · 2024-10-05T00:14:01Z

super_editor_markdown/lib/src/markdown_to_document_parsing.dart

@@ -23,9 +23,12 @@ MutableDocument deserializeMarkdownToDocument(
  List<ElementToNodeConverter> customElementToNodeConverters = const [],
  bool encodeHtml = false,
 }) {
-  final markdownLines = const LineSplitter().convert(markdown);
+  final markdownLines = const LineSplitter().convert(markdown).map((l) {


What happened to require this difference? I'm not sure what the existing code did, but it's not immediately obvious what the difference is in impact between the original code and this new code.

Markdown lib changed from taking List<String> to taking List<Line>.

Can you add an explicit type parameter to your map function so that the difference is unambiguous? Part of the confusion here is that reading the code doesn't tell us what type "l" is.

matthew-carroll · 2024-10-05T00:15:18Z

super_editor_markdown/lib/src/markdown_to_document_parsing.dart

  }
+  static final _tags = [ md.DelimiterTag("u", 1) ];


Please declare all static members above all instance members (we probably need to add a lint for that at some point).

And can you also add a comment above _tags that describe what it is, and how it's used?

This is required by md.DelimiterSyntax, in theory the constructor takes an optional tags member, but a bug in
the Markdown lib replaces that with a const [], and then tries to sort it. Will add a comment to that effect.

matthew-carroll · 2024-10-05T00:16:20Z

super_editor_markdown/lib/src/markdown_to_document_parsing.dart

@@ -548,7 +555,7 @@ class _ParagraphWithAlignmentSyntax extends _EmptyLinePreservingParagraphSyntax

  @override
  bool canParse(md.BlockParser parser) {
-    if (!_alignmentNotationPattern.hasMatch(parser.current)) {
+    if (!_alignmentNotationPattern.hasMatch(parser.current.content)) {


Was this a pre-existing error on our part? Should we have been accessing parser.current.content this whole time?

Markdown lib changed from String to Line which has a content member, so this just redirects to where the string is now.

matthew-carroll · 2024-10-05T00:17:15Z

FYI @angelosilvestre

matthew-carroll · 2024-10-05T00:17:49Z

@jezell it looks like we've got a bunch of failing tests. I'm guessing most/all of them is because we need corresponding changes in other packages.

…ences to match new markdown lib Iterable<md.Node>, add missing override for createNode

jezell · 2024-10-05T04:56:23Z

Looking at the tests, I'm not sure about one of them, "unordered list followed by empty list item" seems to not actually convert in commonmark, so not sure if it's a bad test. Here too in this comment I've put the markdown and it doesn't end up as a single list item and then nothing else. Maybe the new markdown parser is more correct here, but I get three different results with this one from three different parsers, and also different results from github which makes it a hr

https://markdowntohtml.com/
https://daringfireball.net/projects/markdown/dingus
https://spec.commonmark.org/dingus/

list item 1

jezell · 2024-10-05T05:08:47Z

The strikethrough test also looks bad, strikethrough is a double ~ not one, but the test has a single. Github markdown seems to support single, but that doesn't appear to be standard or documented (https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax):

https://www.markdownguide.org/extended-syntax/#:~:text=You%20can%20strikethrough%20words%20by,before%20and%20after%20the%20words.

Seems like maybe the dart markdown package got more strict. Definitely a change in behavior, but most guides seem to confirm it's supposed to be double not single tilde.

jezell · 2024-10-05T05:54:25Z

super_editor_markdown/test/super_editor_markdown_test.dart

@@ -1433,10 +1437,22 @@ with multiple lines
        expect(document.getNodeAt(25)!, isA<ParagraphNode>());
      });

-      test('paragraph with strikethrough', () {
+      test('paragraph with single ~ should not be strikethrough', () {


This is a change in behavior, but it seems to be more correct. The underlying dart markdown parser seems to be more strict now, but it does seem to align better with the generally accepted pattern for strikethrough.

Should the markdown parser accept the decision of the underlying markdown lib to change the behavior here?

I think the new name of this test is ambiguous. When I read it, I thought it meant a single standalone "~~" without a closing "~~". But based on the implementation, I take it that you're trying to describe the difference between "~" and "~~"?

If you're saying that the new package version doesn't parse "~~something~~" as strikethrough, then I think we need to adjust it to do so. Given we've had that behavior for a long time, I don't think we should break it for existing users. That said, users should have an opportunity to turn it off, if they choose.

Adding support for double "~~" should be fine because its a new feature, and if that really is the norm, then I don't foresee complaints.

@matthew-carroll added single line strikethrough parsing node, a bit hacky to get it to work since underlying markdown parser doesn't like having two sort of similar nodes, but does work as long as the order in the list is correct. Added comments to explain.

jezell · 2024-10-05T05:58:59Z

super_editor_markdown/test/super_editor_markdown_test.dart

-      test('unordered list followd by empty list item', () {
-        const markdown = """- list item 1
-    - """;
+      test('unordered list followed by empty list item', () {


There is a change in behavior here in this specific case introduced by the underlying parser changing, but was the intent of the test really to verify that the hanging - was deleted from the line in a weird case that markdown parsers don't seem to agree on? CommonMark spec seems to say that empty list items should be allowed, not stripped. This updates the test to verify that empty list items should be kept like the spec says, rather than test for something that seems to be up interpreted differently depending on the parser being used.

It looks like this is where we introduced this test: #1822

jezell · 2024-10-05T05:59:46Z

super_editor_markdown/test/super_editor_markdown_test.dart

+        expect(styledText.getAllAttributionsAt(7).contains(strikethroughAttribution), false);
+      });
+
+      test('paragraph with double strikethrough', () {


Added a test for this case since it was missing

matthew-carroll · 2024-10-06T19:55:39Z

super_editor_markdown/test/super_editor_markdown_test.dart

@@ -1433,10 +1437,22 @@ with multiple lines
        expect(document.getNodeAt(25)!, isA<ParagraphNode>());
      });

-      test('paragraph with strikethrough', () {
+      test('paragraph with single ~ should not be strikethrough', () {


I think the new name of this test is ambiguous. When I read it, I thought it meant a single standalone "~~" without a closing "~~". But based on the implementation, I take it that you're trying to describe the difference between "~" and "~~"?

If you're saying that the new package version doesn't parse "~~something~~" as strikethrough, then I think we need to adjust it to do so. Given we've had that behavior for a long time, I don't think we should break it for existing users. That said, users should have an opportunity to turn it off, if they choose.

Adding support for double "~~" should be fine because its a new feature, and if that really is the norm, then I don't foresee complaints.

matthew-carroll · 2024-10-06T19:58:45Z

super_editor_markdown/lib/src/markdown_to_document_parsing.dart

+/// This [DelimiterSyntax] produces `Element`s with a `u` tag.
+class UnderlineSyntax extends md.DelimiterSyntax {
+
+  /// This is required by md.DelimiterSyntax, in theory the constructor takes an optional tags member, but a bug in


This comment still doesn't tell us what this thing is. What does "u" mean? What's a DelimiterTag? Why is it required? What happens if it's not passed to the super constructor? You somewhat answered that last part, but didn't mention the consequences.

Assume the next person who's responsible for this has very little understanding of the inside of the markdown package.

Was trying to explain that it's not actually supposed to be required from the class definition:

https://pub.dev/documentation/markdown/latest/markdown/DelimiterSyntax-class.html

The constructor takes a nullable. However, the problem is there is a bug in the underlying dart library if you don't pass it. Due to these two lines, one sets it to const [] if not passed, then the next tries to sort. So we have to pass something or it blows up. That's the only reason it's there.

https://github.com/dart-lang/markdown/blob/d53feae0760a4f0aae5ffdfb12d8e6acccf14b40/lib/src/inline_syntaxes/delimiter_syntax.dart#L67

https://github.com/dart-lang/markdown/blob/d53feae0760a4f0aae5ffdfb12d8e6acccf14b40/lib/src/inline_syntaxes/delimiter_syntax.dart#L319

I understand that point about the bug in the package - but those questions I raised above are still real questions that a reader will have. Can you please address those in the code comment?

Added this explanation to the code, hope it clarifies the reasoning this is required better.

matthew-carroll · 2024-10-06T19:59:45Z

super_editor_markdown/lib/src/markdown_to_document_parsing.dart

@@ -23,9 +23,12 @@ MutableDocument deserializeMarkdownToDocument(
  List<ElementToNodeConverter> customElementToNodeConverters = const [],
  bool encodeHtml = false,
 }) {
-  final markdownLines = const LineSplitter().convert(markdown);
+  final markdownLines = const LineSplitter().convert(markdown).map((l) {


Can you add an explicit type parameter to your map function so that the difference is unambiguous? Part of the confusion here is that reading the code doesn't tell us what type "l" is.

matthew-carroll · 2024-10-06T22:35:22Z

Looks like we have 40 failing tests in super_editor - most/all of them look like auto-conversion of text, which is based on Markdown parsing.

…nging lines after list

jezell · 2024-10-07T01:53:30Z

Looks like we have 40 failing tests in super_editor - most/all of them look like auto-conversion of text, which is based on Markdown parsing.

@matthew-carroll looks like what was happening is the new dart markdown parser drops of extra newlines if there are hanging newlines at the end of the file after a list. There aren't any tests for this in the markdown package, but the tests in the super_editor package relies on it for setting the caret in an empty node after the lists. I'm adding a test for this case and some extra logic in the parser. It's a little dirty, but seems like the desire here is to insulate people from the markdown lib changing. At least the test will catch if they change the behavior back.

matthew-carroll · 2024-10-07T03:26:36Z

super_editor_markdown/test/super_editor_markdown_test.dart

-        const markdown = """- list item 1
-    - """;
+      test('unordered list followed by empty list item', () {
+        const markdown = """


Does this leading newline need to be here? If a Markdown blob begins with a list item, it won't have a leading newline.

Doesn't need to be there, will remove the leading.

matthew-carroll · 2024-10-07T03:27:24Z

super_editor_markdown/test/super_editor_markdown_test.dart

        final doc = deserializeMarkdownToDocument('~This is~ a paragraph.');
        final styledText = (doc.getNodeAt(0)! as ParagraphNode).text;

+        // Ensure text within the range isn't attributed.


Shouldn't this test be reverted, given that you added the single-character parser?

Good catch. Test was reverted but comment wasn't.

matthew-carroll · 2024-10-07T03:28:36Z

super_editor_markdown/test/super_editor_markdown_test.dart

@@ -1541,6 +1557,23 @@ Paragraph3""";
        expect((doc.getNodeAt(2)! as ParagraphNode).text.text, 'Paragraph3');
      });

+       test('empty paragraph after paragraphs', () {


Can you phrase the test name in terms of what you're verifying? I'm actually not sure at first glance which detail you're locking down here. Is it that 2 trailing newlines are reduced to 1? Is it that 2+ trailing newlines are reduced to 1?

updating to 'every 2 newlines after a list are a paragraph'. The logic is a bit weird but it seems to be what the old dart markdown parser was doing with new lines after lists. The super_editor tests that deal with inserting newlines after lists fail with null reference exceptions due to missing nodes if this doesn't happen. I first tried adding code to preserve all hanging newlines regardless of what came between, but other tests in the markdown package want a no newlines after other types of content. Seems like the old parser wasn't really consistent with hanging newlines and that leaked into the tests. So this retains old behavior, but also codifies old inconsistencies.

matthew-carroll · 2024-10-07T03:28:55Z

@angelosilvestre can you also take a look at this PR and see if you find anything that I've missed?

angelosilvestre

LGTM with a single comment.

angelosilvestre · 2024-10-13T22:22:30Z

super_editor_markdown/lib/src/markdown_to_document_parsing.dart

+  // Add 1 hanging line for every 2 blank lines at the end, need this to preserve behavior pre markdown 7.2.1
+  final hangingEmptyLines = markdownLines.reversed.takeWhile((md.Line l) => l.isBlankLine);
+  if(hangingEmptyLines.isNotEmpty && documentNodes.lastOrNull is ListItemNode) {
+    for(var i = 0; i < hangingEmptyLines.length >> 1; i++) {


Won't this be a bit more readable if we use hangingEmptyLines.length ~/ 2 instead of hangingEmptyLines.length >> 1?

I didn't catch that for loop, but yeah, @jezell let's not use bit shifting unless we absolutely have to. If you wanna divide by two, let's use the standard operators for that.

@jezell can you push an update with this change?

upgrade markdown package reference

6050484

matthew-carroll reviewed Oct 5, 2024

View reviewed changes

document _tags in UnderlineSyntax, cleanup _tryCreateInlineLink refer…

9084683

…ences to match new markdown lib Iterable<md.Node>, add missing override for createNode

timu-jesse-ezell added 2 commits October 4, 2024 22:09

fix parser writing node instead of node content

fddd6dc

improve markdown tests to align with common behaviors

95ea5b2

jezell commented Oct 5, 2024

View reviewed changes

matthew-carroll reviewed Oct 6, 2024

View reviewed changes

matthew-carroll mentioned this pull request Oct 6, 2024

Upgrade markdown package to v7.2.0 #1907

Closed

add backwards compat logic, add test to ensure backwards compat on ha…

7578b41

…nging lines after list

improve documentation on _tags

a5ae6c2

matthew-carroll reviewed Oct 7, 2024

View reviewed changes

improve newline tests

a1d18d8

angelosilvestre reviewed Oct 13, 2024

View reviewed changes

upgrade markdown package reference #2364

Are you sure you want to change the base?

upgrade markdown package reference #2364

Conversation

jezell commented Oct 4, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jezell Oct 5, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matthew-carroll commented Oct 5, 2024

matthew-carroll commented Oct 5, 2024

jezell commented Oct 5, 2024 • edited Loading

list item 1

jezell commented Oct 5, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matthew-carroll commented Oct 6, 2024

jezell commented Oct 7, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matthew-carroll commented Oct 7, 2024

angelosilvestre left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jezell Oct 5, 2024 •

edited

Loading

jezell commented Oct 5, 2024 •

edited

Loading

jezell commented Oct 7, 2024 •

edited

Loading