-
-
Notifications
You must be signed in to change notification settings - Fork 682
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✅ Add more tests for the behaviour of rich_markup_mode
#964
base: master
Are you sure you want to change the base?
Conversation
@pytest.mark.parametrize( | ||
"mode,lines", | ||
[ | ||
("markdown", ["First line", "Line 1 Line 2 Line 3", ""]), | ||
("rich", ["First line", "Line 1", "", "Line 2", "", "Line 3", ""]), | ||
("none", ["First line", "Line 1", "Line 2", "Line 3", ""]), | ||
], | ||
) | ||
def test_markup_mode_newline_pr815(mode: str, lines: List[str]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, the question is mostly about the newlines: should there be one after "First line" or not?
Further, should the markdown
parser output "Line 1 Line 2 Line 3" as one string, or split them into multiple strings, similar to the current behaviour for the option none
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep! In all 3 modes the lines should be:
First line
- new line
Line 1
- new line
Line 2
- new line
Line 3
- new line
["First line", "", "Line 1", "", "Line 2", "", "Line 3", ""]
I just tested with pure Rich:
from rich.console import Console
from rich.markdown import Markdown
console = Console()
content = """
First line
Line 1
Line 2
Line 3
"""
md = Markdown(content)
print("--- Markdown ---")
console.print(md)
print("--- Rich markup ---")
console.print(content, markup=True)
print("--- plain ---")
print(content)
And the output is almost always the same:
$ python demo.py
--- Markdown ---
First line
Line 1
Line 2
Line 3
--- Rich markup ---
First line
Line 1
Line 2
Line 3
--- plain ---
First line
Line 1
Line 2
Line 3
The main difference at least in this script is that for Markdown Rich doesn't add an extra line on top and below. But the main question we have right now are lines in between, and in all cases those are printed consistently.
So yep, that's a bug here in Typer.
), | ||
], | ||
) | ||
def test_markup_mode_newline_issue447(mode: str, lines: List[str]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, the splitting of the strings is different for all 3 options, with rich
keep each input line as such, markdown
merging everything that is not header, and none
keeping all double newlines as separators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, I think this starts at the design in Click, I understand that it automatically unwraps single line breaks (single new line chars surrounded by regular text) probably just because there's the convention of keeping code lines shorter than some size, e.g. 88 characters. And that if a docstring goes beyond that, a new line with text continuing right below, would be just formatting, but it all would belong in the same paragraph.
And Markdown has the same behavior in some renderers (e.g. the preview in VS Code), I imagine it's also to have in mind those conventions of keeping files with lines under 88 characters. ...although it doesn't have the same behavior here in the GitHub comment preview 😅.
The current code (that was a copied from from click-rich) is probably trying to replicate that behavior while trying to keep the specified format.
To keep consistency with plain Click, and those conventions, I think it makes sense to automatically unwrap single line breaks into a single paragraph (line), and separate those paragraphs with double new lines (so, an empty string in between those lines), so there's a space between them.
So, for rich
mode, we would need to manually unwrap the single line breaks that are currently preserved.
For markdown
mode, we should keep unwrapping single line breaks, but we should split double line breaks as different paragraphs, and when rendering, separate those with new lines.
For None
mode (which is just Click's default), I think it would make sense to keep unwrapping single line breaks, but to split paragraphs by double line breaks (so, an extra empty string in between).
I think that would make the 3 modes more or less consistent.
Note: all this idea of unwrapping is challenged by the test and comments below, about lists.
pytest.param( | ||
"markdown", | ||
["First line", "", "• 1", "• 2", "• 3", ""], | ||
marks=pytest.mark.xfail(), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test currently fails on master
, with the output being
['First line', '', '• 1 - 2 - 3', '']
-> this happens mostly because the newlines are not parsed correctly, after which the bullet points aren't recognized correctly. PR #815 attempts to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, this gets very tricky. 🤔
Because I see how I would intuitively expect the lines to be split, visually it's just a list.
But then, that would go against the automatic text unwrapping that Click would do and that I was mentioning above. 🤔
One option is to expect it all to be unwrapped, so that if a user wants a list they would separate the items by new lines.
The other option is to challenge the first idea of automatically unwrapping new lines into a single paragraphs, and instead render things as-is, even if they have internal splits.
A third option I wouldn't want is adding a special case for lists, but that would open many edge cases and many other possible edge cases to handle, so I think I wouldn't like this idea, maybe better to decide on one format or the other.
pytest.param( | ||
"markdown", | ||
["First line", "", "• 1", "• 2", "• a", "• b", "• 3", ""], | ||
marks=pytest.mark.xfail(), | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test currently fails on master
, with the output being
['First line', '', '• 1 - 2 - a - b - 3', '']
-> this happens mostly because the newlines are not parsed correctly, after which the bullet points aren't recognized correctly. PR #815 attempts to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea to have more tests and a more established and expected behavior!
I added some comments in your comments, thinking about the best way to proceed with unwrapping lines or not.
Now I was just reading the docstrings PEP: https://peps.python.org/pep-0257/, it mentions that the first line should be a "summary", followed by a blank line, and that it should fit on a single line.
So I guess that for the first line it's up to us.
The docstrings PEP doesn't say anything about wrapping long lines or not, or how to consider it.
But PEP 8 (https://peps.python.org/pep-0008/#maximum-line-length) does, and explicitly mentions docstrings:
[...] it is okay to increase the line length limit up to 99 characters, provided that comments and docstrings are still wrapped at 72 characters.
So I guess it's kind of expected that people would wrap docstring content with internal new lines without signifying that it means new paragraphs.
So, although I don't like it too much, I'm inclined to not support lists separated by a single line and require them to have blank lines in between.
There's a few open PRs around markdown formatting and parsing that I want to review, but first I would like to establish what exactly is the desired outcome for different docstrings and different settings of
rich_markup_mode
.This PR is supposed to capture the ideal output for each test, using a
pytest.mark.xfail
marker for those cases that currently fail onmaster
. Any potential fixes will then be contributed in subsequent PRs.