Potential features based on oniguruma-to-es
#18
I've been keeping an eye on
I could add a feature to show what the onig regexes would look like in JS
do the error messages give the position of the error?
are there any plans for JS to onig?
have you tried parsing the grammars in this repo?
would there be support for other versions of
do you currently support all characters in
Thanks, glad to hear it. 😊
I think that would be very cool. For this, it might help to be aware of the
With default options:
toDetails('.++')
/* →
{ pattern: '(?:(?=($E$[^\\n]+))\\1)',
flags: 'v',
options: {
useEmulationGroups: true,
},
}
*/
toRegExp('.++')
/* →
new EmulatedRegExp('(?:(?=($E$[^\\n]+))\\1)', 'v', {
useEmulationGroups: true,
})
*/
With
toDetails('.++', {avoidSubclass: true})
/* →
{ pattern: '(?:(?=([^\\n]+))\\1)',
flags: 'v',
}
*/
toRegExp('.++', {avoidSubclass: true})
/* →
/(?:(?=([^\n]+))\1)/v
*/
// Alternatively, even when not using `avoidSubclass` you can do...
toRegExp('.++').toString()
/* →
'/(?:(?=([^\\n]+))\\1)/v'
...or read the regexp's `.source` and `.flags`
*/
The latter values don't include the
Note that if you pass values from
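To see why the generated pattern for `.++` has that shape, here is a minimal sketch using only native JS regexes (no library calls). It assumes the `avoidSubclass` form of the output shown above, drops flag `v` so it also runs on older engines, and appends a literal `b` purely for illustration. A lookahead in JS is never backtracked into once it has matched, so capturing inside it and then consuming the capture with `\1` behaves like a possessive quantifier.

```javascript
// Greedy: `[^\n]+` first takes "aab", then backtracks one character so `b` can match.
const greedy = /[^\n]+b/;

// Emulated possessive: the lookahead captures "aab" and `\1` consumes all of it.
// Nothing can be given back, so the trailing `b` never matches.
const possessive = /(?:(?=([^\n]+))\1)b/;

const input = 'aab';
const greedyResult = greedy.test(input);
const possessiveResult = possessive.test(input);

console.log(greedyResult);     // true
console.log(possessiveResult); // false
```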
Reporting error positions
No. That would be nice and I'd welcome contributions that enable this, but it would be difficult because in some cases (subroutines are an example) the generated results are fairly scrambled compared to the input, and errors can come from the tokenizer, parser, transformer, or code generator. But if you only wanted to know whether it's a valid Oniguruma regex (minus features that That said, maybe the errors are still useful without a position? In general,
JS RegExp → Oniguruma
Not currently. JS RegExp to Oniguruma would be a cool feature, but it has more limited use cases that I personally don't have. However, I would welcome it if you wanted to collaborate on this. Compared to going from an Oniguruma AST to a JS RegExp, going from a JS RegExp AST to an Oniguruma pattern would be dramatically simpler. So most of the complex work would be in building a JS RegExp AST. But then, there are of course existing JS RegExp AST builders. The best / most up to date one is probably eslint-community/regexpp. If you used that, going from JS RegExp to Oniguruma wouldn't need to be a huge project like
Aside: Eventually I'd love to create a lightweight AST builder for Regex+ syntax. Regex+ syntax is a strict superset of JS RegExp syntax with flag
Support for absence and conditionals
No, but I generally know the currently-missing features. They're documented in
Absent repeaters and absent expressions can be emulated, and I plan to support them in future versions. See the tracking issue here: slevithan/oniguruma-to-es#13 😊
Some conditionals can be emulated. E.g. it would be pretty straightforward to change a basic case like
Aside: Oniguruma edge cases make the
Emulating older versions of Oniguruma
Supporting older versions is not currently planned but is possible. I'd welcome contributions that add this in a maintainable way.
Invalid JS identifiers as group names
Supporting group/subroutine/backreference names that are invalid JS identifiers would require:
I'm not currently planning to support this since I consider it low priority (and I'd encourage TM grammar authors who use invalid JS identifiers as group names to update their regexes), but I'd welcome contributions that add support for this.
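For anyone wanting to check whether a given group name is affected, here is a rough probe that relies only on the native `RegExp` constructor. The helper name `jsAcceptsGroupName` is made up for this sketch and is not part of `oniguruma-to-es`. It builds a tiny pattern with the candidate name and reports whether JS accepts it as exactly one named group.

```javascript
// Hypothetical helper (not a library API): returns true if `name` can be used
// as a named capture group in a native JS regex.
function jsAcceptsGroupName(name) {
  let re;
  try {
    re = new RegExp('(?<' + name + '>)');
  } catch {
    // JS throws a SyntaxError for names that are not valid identifiers
    return false;
  }
  // Guard against input that closes the group early and smuggles in extra syntax
  const keys = Object.keys(re.exec('').groups);
  return keys.length === 1 && keys[0] === name;
}

console.log(jsAcceptsGroupName('year')); // true
console.log(jsAcceptsGroupName('a-b'));  // false
console.log(jsAcceptsGroupName('1st'));  // false
```

A grammar tool could run a check like this over the group names in a regex to flag the ones that would need renaming.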
Aside: It's obvious you have extremely in-depth and hard-won knowledge of Oniguruma's nuances and complexity. Even if you don't end up using
Edit: Thanks for all the fantastic and detailed issues you've filed!! They've now all been addressed, with fixes published in v1.0.0.
Context: oniguruma-to-es is an advanced Oniguruma to JavaScript transpiler that's written in JS. It was first released recently, and has quickly improved. It's used by Shiki's JS engine and supports more than 97% of TM grammars provided with Shiki (it's handling more than 99.9% of regexes in these grammars, but one unsupported or invalid regex removes support for the grammar). Some details are here about supporting the few remaining grammars, if you're interested.
Do you think there might be opportunities to enhance TmLanguage-Syntax-Highlighter using oniguruma-to-es? For example:
oniguruma-to-es for invalid Oniguruma patterns could potentially be helpful when writing/debugging grammars.
Happy to answer any questions. But feel free to close this without comment if you don't think it's a good fit.
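As a rough illustration of the "one unsupported or invalid regex removes support for the grammar" point, here is a sketch of a checker that runs every pattern through a converter and collects the failures. The names `checkPatterns` and `standIn` are invented for this example. The converter is passed in as a parameter so the snippet runs without any dependency; the assumption is that you would pass `toRegExp` from `oniguruma-to-es` there, which throws when a pattern can't be converted.

```javascript
// Hypothetical helper: `convert` is any function that throws on a pattern it
// can't handle (assumed here to be something like oniguruma-to-es's `toRegExp`).
function checkPatterns(patterns, convert) {
  const failures = [];
  for (const pattern of patterns) {
    try {
      convert(pattern);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push({pattern, message});
    }
  }
  // A single failing pattern is enough to mark the whole set as unsupported
  return {supported: failures.length === 0, failures};
}

// Stand-in converter for demonstration only. It uses the native RegExp
// constructor, which is NOT an Oniguruma parser.
const standIn = (pattern) => new RegExp(pattern);

const report = checkPatterns(['a+', '(unclosed'], standIn);
console.log(report.supported);           // false
console.log(report.failures.length);     // 1
console.log(report.failures[0].pattern); // (unclosed
```

Even without error positions, the collected messages could be shown next to the offending pattern while writing or debugging a grammar.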