Add indentation-aware TokenBuilder and Lexer #1578
This is useful for languages where indentation is used as a block delimiter, such as Python
For consistency with the rest of the codebase
Whoops, looks like I messed up a bit with the imports there. That's what I get for using github.dev without cloning locally, I guess 😅. I'll fix it in a sec.
@msujew Should hopefully be fine now 😄
Thanks for the contribution @aabounegm!
Do you mind creating some test cases for the new lexer?
Co-authored-by: Mark Sujew <[email protected]>
Sure, no problem, but that might take me some time (not sure if I will be able to manage it today).
@aabounegm All good, take your time. I appreciate it!
Thanks for the contribution! From a code organization standpoint, I'd prefer the two new service classes to be in a new file, parser/indentation-aware.ts.
It was causing issues with the "super" call in the constructor in tests
They provide no value, but may throw off the parser
Thanks for your comments. I have now added unit tests and moved the new classes to a new file as suggested.
Thanks @aabounegm, looks pretty good to me 👍
Thank you for your fixes!
👍
This is useful for languages where indentation is used as a block delimiter, such as Python.
I will try to add a recipe for it in the documentation (and maybe some unit tests) later on.
The code was heavily inspired by the Chevrotain Python indentation example, but modified to be more extensible and, hopefully, a better fit for Langium.
Usage example (just a snippet, not full grammar):
The main drawback is that the whole language now becomes indentation-sensitive, even in areas that would normally not be whitespace-sensitive, such as the following Python example:
Multi-mode lexing can be leveraged to overcome this, and if I find a generic way to integrate it with the code from this PR, I will follow it up with another one that does so.
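To make the drawback concrete, here is a minimal, self-contained sketch of stack-based indentation tokenization. It is not the code from this PR and does not use Langium or Chevrotain; the function name `tokenizeIndentation`, the string-based token format, and the Python-like input are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch: emits "INDENT"/"DEDENT" markers around lines,
// using a stack of indentation widths. Not Langium's actual implementation.
function tokenizeIndentation(source: string): string[] {
    const tokens: string[] = [];
    // The stack always starts with column 0 (top-level code).
    const indentStack: number[] = [0];

    for (const line of source.split('\n')) {
        // Blank lines carry no block structure, so skip them.
        if (line.trim().length === 0) {
            continue;
        }
        // Index of the first non-whitespace character = indentation width.
        const width = line.search(/\S/);

        if (width > indentStack[indentStack.length - 1]) {
            // Deeper than the current block: open a new one.
            indentStack.push(width);
            tokens.push('INDENT');
        } else {
            // Shallower: close blocks until the widths match.
            // (A real lexer would also report a dedent that matches
            // no enclosing level; this sketch silently accepts it.)
            while (width < indentStack[indentStack.length - 1]) {
                indentStack.pop();
                tokens.push('DEDENT');
            }
        }
        tokens.push(`LINE ${line.trim()}`);
    }

    // Close any blocks still open at end of input.
    while (indentStack.length > 1) {
        indentStack.pop();
        tokens.push('DEDENT');
    }
    return tokens;
}

// Hypothetical Python-like input: a list literal spread over several lines.
// In Python, indentation inside brackets is not significant.
const bracketed = ['x = [', '    1,', '    2,', ']'].join('\n');
console.log(tokenizeIndentation(bracketed));
```

With this naive approach, the elements of the list literal end up wrapped in an `INDENT`/`DEDENT` pair even though the brackets already delimit them, which is the kind of unwanted sensitivity described above. A lexer that switches modes (or otherwise ignores indentation) between brackets would avoid emitting those tokens.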
Closes #1016.
Related to #608, #663, #782, and #1085.