Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Groovy Parser correctly handle nested parenthesis #4718
Make Groovy Parser correctly handle nested parenthesis #4718
Changes from all commits
b04247c
4191337
8963f1e
c352e7c
09aa9c3
0de94eb
2b638a8
d7afb0c
8376b4e
da61c8d
d292e34
4704580
6951ed4
e93659f
3d7d9ec
d488126
e0f2efa
a1b1a39
fde44e6
05b07e9
560a2a8
710a28d
b8add9b
2184912
cd6e229
0a61029
e20e95c
368a9af
19502b3
9b5a973
a1c30b2
3e95a86
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't reviewed in detail but at first glance it doesn't seem to be aware of comments so any parentheses in comments might throw it off
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm somewhat surprised to see
source
andcursor
as separate arguments here when they are merely used to do this substring. Would it make sense to change the method signature here toThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had that before, but reverted it back to current state. I think it more clear if the GroovyParserParentheseDiscoverer is a completely separate entity. So no former logic is needed to determine the wrapping. Just pass everything you have (thus the node, the source code of the entire file and the current place you are parsing right now) and let the function discover it all.
Having the cursor argument also prevents from misinterpretation to function, because you need to provide the starting point (that is the cursor) of parsing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this can be simplified. For Python I had the same basic problem to solve, but I didn't have any "parentheses level" from the AST or similar. Additionally, in Python the parentheses for tuples and generator expressions are sometimes optional, which doesn't make things simpler.
Anyway, rather than counting the "parentheses levels" up front, I added something like a
maybeParenthesized(node, delegate)
method which does something like this:whitespace()
to parse whitespace (and comments) at cursor into aSpace
(
delegate
and we are donemaybeParenthesized()
recursivelywhitespace()
and check if we are at a)
J.Parentheses
For a language like Java or Python, where we can expect the source files to be reasonably large, I would definitely do something like this, as the substringing and regexp matching won't be free in terms of performance. But for Groovy that would probably not be a big concern, as scripts tend to be small and few.
I will let you think a bit about this and don't want to force you into a different solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, here is the Python code in question. It isn't pretty, so in case you decide to give this approach a try, I would only look at the Python code if you get stuck, as you will probably find a cleaner solution for your parser: https://github.com/openrewrite/rewrite-python/blob/b42101961bbc9daf21d54fe803dd9eaec59c3184/rewrite/rewrite/python/_parser_visitor.py#L1950-L2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second Knut's comments and like the approach he describes. I worry that these regexes will be slow and error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one tricky situation I can think of are Java-style type casts, as the first token there is also a
(
. So if the AST node is a Java-style type cast (assuming they can be distinguished from Groovy-style type casts in the AST), the logic could for example pop that last element from the stack and reset the cursor before calling the corresponding visit method (or do it in the method itself). There are also other alternative solutions.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, your implementation seems like a better idea @knutwannheden. I would rather first merge this and then do a second round with improvements (or better said, go for the new/better implementation). Currently we don't support nested parenthesis, which just breaks scripts that use them. Having this sub-ideal implementation makes them at least runnable, though with a performance loss.
Then I can start working on the better implementation
--
Edit: Talked with @knutwannheden about this, we concluded it's better to not go forward yet:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't quite as easy as I made it out to be and I now remember the struggles I had with Python here. The difficulty arises because we don't know at what AST level the parentheses are. Here that parentheses level may come in handy.