-
-
Notifications
You must be signed in to change notification settings - Fork 509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Use JS regular expression syntax, not path-to-regexp
, in RegexRouter
#3170
base: main
Are you sure you want to change the base?
Conversation
47c4b91
to
869cbd6
Compare
869cbd6
to
3320600
Compare
/** | ||
* Add a route to the router, using the express style path syntax | ||
*/ | ||
addPath(path: string, handler: PathRouteHandler): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether we should be keeping pathToRegexp and this method to use for the osc api. Because that api is using typical slash separated components, so wouldn't be fighting the library?
The reason I started using pathToRegexp
was to make the regexes simple and resilient with minimal effort (the library will have much better redos and similar protections than we will think to write)
I don't know though, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My taste -- I emphasize this is a taste as this is your turf 🙂 -- is to prefer built-in language function over libraries, when they do similar things. Regular expression syntax is in the JS author's toolkit. A new regular expression language is not. (And theoretically could have its own corner cases.)
I also prefer there be fewer ways to do it, not more. Two distinct pattern mechanisms seems worse than one -- perhaps than either one.
Regular expression pathological cases happen when you have repetition with possibility of matching multiple ways. None of these cases have that. ReDOS simply isn't a worry here.
Honestly, tho, sequentially testing a list of regular expressions isn't the right way for either of these cases compared to actual parsing. For OSC I'd address.split('/')
and iterate through the components. For TCP I'd incrementally tokenize the string. I'd use some switch
es to invoke the intended operation -- and return informative errors for contextual errors.
But I'd never start writing that patch if I didn't know it'd be accepted. 🙂 Regular expressions aren't right for this, but for a fixed list of a couple dozen syntaxes they're bearable, in a patch with the least code most likely to be accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new regular expression language is not. (And theoretically could have its own corner cases.)
Honestly, tho, sequentially testing a list of regular expressions isn't the right way for either of these cases compared to actual parsing.
True, but I found pathToRegexp
because I was curious how express (or koa?) did its route matching. its a pretty widely used and known syntax because of both of those.
Which I think is where I started on this; wanting to setup the osc api in a way that is familiar to adding routes to express.
I think that this regex implementation is much more easily readable than tokenizing would be. Thats the nice thing with this pathToRegexp implementation is that with just a little knowledge on the syntax of the paths, it is easy to skim read each rule being checked and find what you are looking for.
I think this is slightly lost in this PR because it hides some things (format of location), but it makes it more readable in other ways as it highlights which bits of the string are matching something while keeping the exact syntax out of it.
So I am tempted by this, just not convinced by tokenizing
I sniped myself. 🙂
I initially used literal regular expressions, in part because it'd make things like case-insensitivity explicit. (Maybe TypeScript will generate precise types as regards named capture groups sometime.) But regular expression patterns aren't composable, so I quickly switched to template literals. (This also meant I could change
Element
's definition when I learned that existing tests weren't nearly as restrictive in the enforced character sets for particular names as I'd assumed.)TextRequiringPrecedingSpace
is defined as legibly as I could make it, albeit somewhat verbosely. (I don't get the sense lookbehind assertions are in common enough use to be self-explanatory if I just inlined them.)