Add a new grammar renderer #1787
Just fixing some small consistency and spacing mistakes.
This rule was misnamed, colliding with the existing CfgAttrAttribute.
This renames IsolatedCR to CR. I felt the "Isolated" qualifier wasn't exactly necessary since we have rewritten things so that it is clear that there is an input transformation which resolves this (`input.crlf`). We also never really defined what it meant. I also felt like there was room for confusion. For example, an input containing `CR CR LF LF` would still contain a `CR LF` pair after normalization, and the `CR` there is not isolated.
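As a rough illustration of the `CR CR LF LF` case, here is a minimal sketch that assumes `input.crlf` replaces each CR LF pair with a single LF in one pass, not repeatedly. The function name `normalize_crlf` is hypothetical and is not code from this PR.

```rust
/// Hypothetical helper: replace each CR LF pair with a single LF.
/// Assumption: the replacement happens once, left to right, and the
/// output is not scanned again for newly adjacent CR LF pairs.
fn normalize_crlf(input: &str) -> String {
    // `str::replace` matches non-overlapping occurrences in the input
    // and never re-examines the text it has already produced.
    input.replace("\r\n", "\n")
}
```

Under that assumption, `normalize_crlf("\r\r\n\n")` yields `"\r\n\n"`: the first CR survives and ends up directly in front of an LF, so calling it "isolated" would be misleading.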
This removes all backslash-escaped characters. This helps to avoid confusion between a literal backslash followed by a character and the interpreted escaped character.
I don't exactly know why this was placed there, but we operate under the assumption that all lexical characters immediately follow one another.
This introduces a new terminal kind that I'm calling a "prose" which describes what the terminal is. This is inspired by the IETF format which uses angle brackets to describe terminals in English.
The grammar almost always uses lowercase, so let's standardize on that.
This helps to standardize how suffixes are written. Normally they do not use parentheses, and visually I don't think they are entirely necessary.
These two nonterminals were using the wrong name for the productions for BlockExpression and LiteralExpression.
This changes the keyword listings so that they are just lists instead of lexer rules. We never used the named rules, and I don't foresee us ever doing that. Also, the `IDENTIFIER_OR_KEYWORD` rule meant that we never needed to explicitly identify these keywords as lexer tokens. This helps avoid problems when building the grammar graph for missing connections.
Per our style, edition differences are supposed to be separated out into an edition block.
These were defined in prose below, but defining them here allows us to easily refer and link to them.
This is intended to help define what a "token" is via the grammar (and to fill a missing hole in our token definition). I waffled on how to define delimiters, whether they should be separate somehow. In practice I think it should be fine to clump them all together. This mainly only matters for TokenTree which already excludes the delimiters.
This adds a grammar rule that collects all the reserved token forms into a single production rule so that we can define what a "token" is by referring to this.
This defines a Token in the grammar so that we can easily refer to it (and to make it easier to see what all the tokens are).
We no longer represent characters via escape sequences. These can be confused with the literal two bytes of backslash followed by a character. See the "common productions" list for how these are now referred to.
AFAICS there are cases where the grammar diverges from its graphical representation with respect to repeated elements. In the two examples below, the diagram only allows for at least two consecutive
From the live demo I can see that
What about doing it like this?
This uses one less path and can be concatenated with a previous
The main problem is that
With respect to It's possible to implement
I see, so that's already supported and just a matter of generating the proper diagram downstream.
This adds an extension to mdbook-spec that will parse code-blocks in a BNF-style grammar into a rendered format, in both markdown or as railroad diagrams.
This adds the hooks to toggle the visibility of the railroad grammar. The status is stored in localstorage to keep it sticky.
This fixes it so that rule links work correctly if there is more than one space in a reference definition.
Not sure how this got missed.
Oops, thanks! I messed up all the repeats. I'm a bit uncertain on how to handle the non-greedy ones, but I think a label works ok for now. Same with the
@lukaslueg: I'm curious if you have thoughts about what would make a good graphical representation for this. As I'm sure you saw, we use this pattern a lot, and I had found myself wondering about whether there might be a good visual way to encode this.
Some comments and suggestions, which you hopefully find constructive
Yes, makes sense.
See here for a quick and dirty example of how an
That's quite nice, yes. That way, the "except for" nodes aren't in the main railroad, which is what makes the other ways awkward.
I didn't want to try to add an unused grammar rule here, so just point to the notation chapter which has an example.
This updates some of the description for the new grammar renderer.
This just ends up with lots of duplicates in the search results which I don't find particularly helpful.
Railroad renders these in reverse order, but our grammar isn't written that way.
These are the suggestions from lukaslueg.
Thanks @lukaslueg! I went ahead and added your suggestion for the Except clauses. To fix the repeat expressions, I had to essentially reverse the Sequence blocks inside a repeat. I couldn't find any other way to get around the HDir invert. As for some of the other suggestions here, such as rendering nodes inside comments, those sound like good things to try in the future. That in particular might be challenging because the content is written in markdown.
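The idea of reversing Sequence blocks inside a repeat can be sketched on a toy tree. This is only an illustration under assumptions: the `Node` enum and `flip_repeats` function below are hypothetical and do not use the railroad crate's types or the code in `render_railroad.rs`. It assumes the renderer draws the contents of a repeat right to left, so the children are pre-reversed to read in grammar order.

```rust
/// Hypothetical, simplified diagram tree (not the railroad crate's API).
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Terminal(String),
    Sequence(Vec<Node>),
    Repeat(Box<Node>),
}

/// Reverse the Sequence directly inside each Repeat so that a renderer
/// which draws that path right to left shows the items in grammar order.
/// Sequences outside a Repeat are left in their written order.
fn flip_repeats(node: Node) -> Node {
    match node {
        Node::Terminal(text) => Node::Terminal(text),
        Node::Sequence(items) => {
            Node::Sequence(items.into_iter().map(flip_repeats).collect())
        }
        Node::Repeat(inner) => {
            // Process nested nodes first, then flip this level only.
            let inner = match flip_repeats(*inner) {
                Node::Sequence(mut items) => {
                    items.reverse();
                    Node::Sequence(items)
                }
                other => other,
            };
            Node::Repeat(Box::new(inner))
        }
    }
}
```

With this sketch, a repeat over the sequence `A B` becomes a repeat over `B A`, while a top-level sequence `A B` is unchanged. How nested repeats should be treated is left open here.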
This introduces a new grammar renderer. Instead of trying to write the grammar in markdown/html hybrid, this introduces a new syntax that is parsed by the mdbook-spec plugin. This grammar is then converted into markdown/html hybrid, and also to railroad diagrams.
There are a lot of changes here (and some can be split into separate PRs if desired). A general overview of what to see here:
- `docs/grammar.md` file for a complete description.
- `mdbook-spec/src/grammar/parser.rs` into an internal representation.
- `mdbook-spec/src/grammar/render_markdown.rs`, and railroad diagrams in `mdbook-spec/src/grammar/render_railroad.rs`.
- `mdbook-spec/src/grammar.rs`. There are several pieces here:
- `[FunctionParameters]`. Link definitions are automatically added to every page.

I'd like to thank @lukaslueg for creating the railroad library which made this possible.
Closes #221
Closes #398
Closes #596
Closes #1513
Closes #1677