refactor: Use JS regular expression syntax, not `path-to-regexp`, in `RegexRouter` #3170

jswalden · 2024-12-08T08:57:11Z

I sniped myself. 🙂

I initially used literal regular expressions, in part because it'd make things like case-insensitivity explicit. (Maybe TypeScript will generate precise types as regards named capture groups sometime.) But regular expression patterns aren't composable, so I quickly switched to template literals. (This also meant I could change Element's definition when I learned that existing tests weren't nearly as restrictive in the enforced character sets for particular names as I'd assumed.)

TextRequiringPrecedingSpace is defined as legibly as I could make it, albeit somewhat verbosely. (I don't get the sense lookbehind assertions are in common enough use to be self-explanatory if I just inlined them.)

…`RegexRouter`.

Julusian · 2024-12-15T13:05:16Z

companion/lib/Service/RegexRouter.ts

-	/**
-	 * Add a route to the router, using the express style path syntax
-	 */
-	addPath(path: string, handler: PathRouteHandler): void {


I'm wondering whether we should be keeping pathToRegexp and this method to use for the osc api. Because that api is using typical slash separated components, so wouldn't be fighting the library?

The reason I started using pathToRegexp was to make the regexes simple and resilient with minimal effort (the library will have much better redos and similar protections than we will think to write)

I don't know though, what do you think?

My taste -- I emphasize this is a taste as this is your turf 🙂 -- is to prefer built-in language function over libraries, when they do similar things. Regular expression syntax is in the JS author's toolkit. A new regular expression language is not. (And theoretically could have its own corner cases.)

I also prefer there be fewer ways to do it, not more. Two distinct pattern mechanisms seems worse than one -- perhaps than either one.

Regular expression pathological cases happen when you have repetition with possibility of matching multiple ways. None of these cases have that. ReDOS simply isn't a worry here.

Honestly, tho, sequentially testing a list of regular expressions isn't the right way for either of these cases compared to actual parsing. For OSC I'd address.split('/') and iterate through the components. For TCP I'd incrementally tokenize the string. I'd use some switches to invoke the intended operation -- and return informative errors for contextual errors.

But I'd never start writing that patch if I didn't know it'd be accepted. 🙂 Regular expressions aren't right for this, but for a fixed list of a couple dozen syntaxes they're bearable, in a patch with the least code most likely to be accepted.

A new regular expression language is not. (And theoretically could have its own corner cases.)
Honestly, tho, sequentially testing a list of regular expressions isn't the right way for either of these cases compared to actual parsing.

True, but I found pathToRegexp because I was curious how express (or koa?) did its route matching. its a pretty widely used and known syntax because of both of those.
Which I think is where I started on this; wanting to setup the osc api in a way that is familiar to adding routes to express.

I think that this regex implementation is much more easily readable than tokenizing would be. Thats the nice thing with this pathToRegexp implementation is that with just a little knowledge on the syntax of the paths, it is easy to skim read each rule being checked and find what you are looking for.

I think this is slightly lost in this PR because it hides some things (format of location), but it makes it more readable in other ways as it highlights which bits of the string are matching something while keeping the exact syntax out of it.

So I am tempted by this, just not convinced by tokenizing

jswalden force-pushed the regexprouter branch from 47c4b91 to 869cbd6 Compare December 8, 2024 09:00

jswalden mentioned this pull request Dec 8, 2024

[BUG] yarn test since the switch to vitest no longer ignores any folders/symlinks added to module-local-dev/ #3171

Closed

2 tasks

refactor: Use JS regular expression syntax, not path-to-regexp, in …

3320600

…`RegexRouter`.

jswalden force-pushed the regexprouter branch from 869cbd6 to 3320600 Compare December 12, 2024 04:08

Julusian reviewed Dec 15, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: Use JS regular expression syntax, not `path-to-regexp`, in `RegexRouter` #3170

refactor: Use JS regular expression syntax, not `path-to-regexp`, in `RegexRouter` #3170

jswalden commented Dec 8, 2024

Julusian Dec 15, 2024 •

edited

Loading

jswalden Dec 16, 2024

Julusian Dec 16, 2024

refactor: Use JS regular expression syntax, not path-to-regexp, in RegexRouter #3170

Are you sure you want to change the base?

refactor: Use JS regular expression syntax, not path-to-regexp, in RegexRouter #3170

Conversation

jswalden commented Dec 8, 2024

Julusian Dec 15, 2024 • edited Loading

Choose a reason for hiding this comment

jswalden Dec 16, 2024

Choose a reason for hiding this comment

Julusian Dec 16, 2024

Choose a reason for hiding this comment

refactor: Use JS regular expression syntax, not `path-to-regexp`, in `RegexRouter` #3170

refactor: Use JS regular expression syntax, not `path-to-regexp`, in `RegexRouter` #3170

Julusian Dec 15, 2024 •

edited

Loading