How to match variable length indents? #97

Open
anentropic opened this issue Jul 13, 2019 · 3 comments

anentropic commented Jul 13, 2019

I am trying to perform a select like:

.select(r"funcdef< any* ':' suite< '\n' firstnode='    ' any* > >")

This indent, which I have labelled firstnode, will have the mypy Python 2 type annotation comment attached to it as .prefix if the funcdef has one.

My problem is that for more deeply nested functions the indent string I need to match may be 8 or 12 spaces.

It seems like it'd be really nice to be able to use an abstract INDENT token in the pattern rather than the specific string of space chars.

I am assuming there is nothing in the pattern grammar which will help here? I assume that if I write something like '    '+ (four spaces, repeated; I'm not even sure that is valid) it won't match '        ' (eight spaces)?

anentropic commented Jul 13, 2019

I have been using the http://svn.python.org/projects/sandbox/trunk/2to3/scripts/find_pattern.py script to help with generating patterns based on existing source code, as suggested in http://python3porting.com/fixers.html

It's quite easy to modify this script to give output like:
funcdef< any* COLON suite< NEWLINE INDENT any* > >

i.e. basically replacing each string literal token with the constant name looked up from its node.type int value, using the constants defined here:
https://github.com/python/cpython/blob/master/Lib/lib2to3/pgen2/token.py

But this is not recognised in a bowler select pattern, I guess because of a limitation of lib2to3. (Because of name=NAME in some of the bowler examples, I had initially thought that these constants might be accepted.)
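For illustration only, the kind of lookup described above might look roughly like the sketch below. It is not the actual find_pattern.py code: `leaf_label` and `STRUCTURAL` are hypothetical names, and the standard-library `token` module is used as a stand-in for `lib2to3.pgen2.token`, which exposes a similar `tok_name` mapping (the numeric values may differ, so real lib2to3 leaves would need the lib2to3 module instead). As noted, the resulting names are not accepted by the pattern compiler.

```python
import token  # stdlib stand-in; with real lib2to3 leaves, use lib2to3.pgen2.token

# Token names to print symbolically instead of as quoted literals (an assumption).
STRUCTURAL = {"NEWLINE", "INDENT", "DEDENT", "COLON"}


def leaf_label(leaf_type, value):
    """Return the token name for structural tokens, else the quoted literal."""
    name = token.tok_name.get(leaf_type)
    if name in STRUCTURAL:
        return name
    return repr(value)


# Example: the three leaves that open an indented function body.
leaves = [(token.COLON, ":"), (token.NEWLINE, "\n"), (token.INDENT, "    ")]
print(" ".join(leaf_label(t, v) for t, v in leaves))  # COLON NEWLINE INDENT
```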

anentropic commented Jul 13, 2019

My option at the moment seems to be to manually enumerate all the indents up to some max size, e.g.

funcdef< any* ':' suite< '\n' firstnode=('    '|'        ') any* > >

This then allows me to select top-level and nested functions with the same selector.
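Rather than typing the alternatives by hand, the enumerated pattern could be generated. A minimal sketch, assuming indents are always a multiple of a fixed width (4 spaces here); `indent_alternatives` and `funcdef_pattern` are hypothetical helper names, not part of bowler or lib2to3:

```python
def indent_alternatives(max_depth, width=4):
    """Build a fragment like ('    '|'        ') covering 1..max_depth indent levels."""
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    literals = ["'{}'".format(" " * (width * depth)) for depth in range(1, max_depth + 1)]
    return "(" + "|".join(literals) + ")"


def funcdef_pattern(max_depth, width=4):
    """Return the funcdef selector with the indent alternatives filled in."""
    # '\\n' keeps the backslash-n escape in the pattern text, as the raw string does.
    return "funcdef< any* ':' suite< '\\n' firstnode={} any* > >".format(
        indent_alternatives(max_depth, width)
    )


print(funcdef_pattern(2))
```

The result would then be passed to .select(...) in place of the hand-written string.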

(With the caveat that in the modify callbacks capture['firstnode'] now returns a single-element list instead of just the element.)
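One way to keep the callbacks indifferent to that difference is to unwrap the capture before using it. A minimal sketch, assuming the capture value is either the node itself or a list containing exactly one node; `single_capture` is a hypothetical helper name:

```python
def single_capture(capture, name):
    """Return capture[name], unwrapping it if it is a single-element list."""
    value = capture[name]
    if isinstance(value, list):
        if len(value) != 1:
            raise ValueError(
                "expected exactly one node for {!r}, got {}".format(name, len(value))
            )
        return value[0]
    return value
```

Inside a modify callback this would be used as firstnode = single_capture(capture, 'firstnode'), so the same callback works with both the plain and the enumerated selector.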

thatch commented Jul 15, 2019

Yeah, I hear you. The 2to3 pattern grammar only supports a few names; see https://github.com/python/cpython/blob/master/Lib/lib2to3/patcomp.py#L178 (TOKEN is the leaf equivalent of any).
