Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✅ Add more tests for the behaviour of rich_markup_mode #964

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

svlandeg
Copy link
Member

@svlandeg svlandeg commented Sep 2, 2024

There's a few open PRs around markdown formatting and parsing that I want to review, but first I would like to establish what exactly is the desired outcome for different docstrings and different settings of rich_markup_mode.

This PR is supposed to capture the ideal output for each test, using a pytest.mark.xfail marker for those cases that currently fail on master. Any potential fixes will then be contributed in subsequent PRs.

Comment on lines +46 to +54
@pytest.mark.parametrize(
"mode,lines",
[
("markdown", ["First line", "Line 1 Line 2 Line 3", ""]),
("rich", ["First line", "Line 1", "", "Line 2", "", "Line 3", ""]),
("none", ["First line", "Line 1", "Line 2", "Line 3", ""]),
],
)
def test_markup_mode_newline_pr815(mode: str, lines: List[str]):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the question is mostly about the newlines: should there be one after "First line" or not?

Further, should the markdown parser output "Line 1 Line 2 Line 3" as one string, or split them into multiple strings, similar to the current behaviour for the option none?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep! In all 3 modes the lines should be:

  • First line
  • new line
  • Line 1
  • new line
  • Line 2
  • new line
  • Line 3
  • new line
["First line", "", "Line 1", "", "Line 2", "", "Line 3", ""]

I just tested with pure Rich:

from rich.console import Console
from rich.markdown import Markdown

console = Console()

content = """
First line

Line 1

Line 2

Line 3
"""

md = Markdown(content)


print("--- Markdown ---")

console.print(md)

print("--- Rich markup ---")

console.print(content, markup=True)

print("--- plain ---")

print(content)

And the output is almost always the same:

$ python demo.py
--- Markdown ---
First line                                                                                                                                                                                                               

Line 1                                                                                                                                                                                                                   

Line 2                                                                                                                                                                                                                   

Line 3                                                                                                                                                                                                                   
--- Rich markup ---

First line

Line 1

Line 2

Line 3

--- plain ---

First line

Line 1

Line 2

Line 3

The main difference at least in this script is that for Markdown Rich doesn't add an extra line on top and below. But the main question we have right now are lines in between, and in all cases those are printed consistently.

So yep, that's a bug here in Typer.

),
],
)
def test_markup_mode_newline_issue447(mode: str, lines: List[str]):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the splitting of the strings is different for all 3 options, with rich keep each input line as such, markdown merging everything that is not header, and none keeping all double newlines as separators.

Copy link
Member

@tiangolo tiangolo Nov 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, I think this starts at the design in Click, I understand that it automatically unwraps single line breaks (single new line chars surrounded by regular text) probably just because there's the convention of keeping code lines shorter than some size, e.g. 88 characters. And that if a docstring goes beyond that, a new line with text continuing right below, would be just formatting, but it all would belong in the same paragraph.

And Markdown has the same behavior in some renderers (e.g. the preview in VS Code), I imagine it's also to have in mind those conventions of keeping files with lines under 88 characters. ...although it doesn't have the same behavior here in the GitHub comment preview 😅.

The current code (that was a copied from from click-rich) is probably trying to replicate that behavior while trying to keep the specified format.


To keep consistency with plain Click, and those conventions, I think it makes sense to automatically unwrap single line breaks into a single paragraph (line), and separate those paragraphs with double new lines (so, an empty string in between those lines), so there's a space between them.

So, for rich mode, we would need to manually unwrap the single line breaks that are currently preserved.

For markdown mode, we should keep unwrapping single line breaks, but we should split double line breaks as different paragraphs, and when rendering, separate those with new lines.

For None mode (which is just Click's default), I think it would make sense to keep unwrapping single line breaks, but to split paragraphs by double line breaks (so, an extra empty string in between).

I think that would make the 3 modes more or less consistent.

Note: all this idea of unwrapping is challenged by the test and comments below, about lists.

Comment on lines +152 to +156
pytest.param(
"markdown",
["First line", "", "• 1", "• 2", "• 3", ""],
marks=pytest.mark.xfail(),
),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test currently fails on master, with the output being

['First line', '', '• 1 - 2 - 3', '']

-> this happens mostly because the newlines are not parsed correctly, after which the bullet points aren't recognized correctly. PR #815 attempts to fix this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum, this gets very tricky. 🤔

Because I see how I would intuitively expect the lines to be split, visually it's just a list.

But then, that would go against the automatic text unwrapping that Click would do and that I was mentioning above. 🤔

One option is to expect it all to be unwrapped, so that if a user wants a list they would separate the items by new lines.

The other option is to challenge the first idea of automatically unwrapping new lines into a single paragraphs, and instead render things as-is, even if they have internal splits.

A third option I wouldn't want is adding a special case for lists, but that would open many edge cases and many other possible edge cases to handle, so I think I wouldn't like this idea, maybe better to decide on one format or the other.

Comment on lines +191 to +195
pytest.param(
"markdown",
["First line", "", "• 1", "• 2", "• a", "• b", "• 3", ""],
marks=pytest.mark.xfail(),
),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test currently fails on master, with the output being

['First line', '', '• 1 - 2   - a   - b - 3', '']

-> this happens mostly because the newlines are not parsed correctly, after which the bullet points aren't recognized correctly. PR #815 attempts to fix this.

Copy link
Member

@tiangolo tiangolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea to have more tests and a more established and expected behavior!

I added some comments in your comments, thinking about the best way to proceed with unwrapping lines or not.

Now I was just reading the docstrings PEP: https://peps.python.org/pep-0257/, it mentions that the first line should be a "summary", followed by a blank line, and that it should fit on a single line.

So I guess that for the first line it's up to us.

The docstrings PEP doesn't say anything about wrapping long lines or not, or how to consider it.

But PEP 8 (https://peps.python.org/pep-0008/#maximum-line-length) does, and explicitly mentions docstrings:

[...] it is okay to increase the line length limit up to 99 characters, provided that comments and docstrings are still wrapped at 72 characters.

So I guess it's kind of expected that people would wrap docstring content with internal new lines without signifying that it means new paragraphs.

So, although I don't like it too much, I'm inclined to not support lists separated by a single line and require them to have blank lines in between.

@svlandeg svlandeg assigned svlandeg and unassigned tiangolo Nov 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal repo / tests Involving the CI / test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants