use more efficient regex engine wherever possible #1182
Is |
I have a question about tpre: would it be hard to use it against non-null-terminated strings? It is sometimes more efficient to work with substrings than to deep-copy their contents, so you end up with string slices that can't be null-terminated because they live inside another string that is still in use. Would it be possible to adapt a regex engine to work with this? |
are you talking about the strings that you want to match? in that case it's easy to change it |
Yeah, the strings to match might be encoded as |
I just added that functionality and made it ANSI C |
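For illustration, matching against a non-null-terminated slice usually means passing a pointer plus an explicit length instead of relying on a terminator. The sketch below shows that calling convention with a plain literal search. The `str_slice` type and `slice_contains` function are hypothetical names for this example and are not tpre's API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slice type: a view into a larger buffer, with no terminator. */
typedef struct {
    const char *ptr;
    size_t len;
} str_slice;

/* Returns 1 if the literal `lit` occurs anywhere inside the slice, else 0.
   Only bytes in [ptr, ptr + len) are ever read. */
static int slice_contains(str_slice s, const char *lit)
{
    size_t n = strlen(lit);
    size_t i;

    assert(s.ptr != NULL);
    if (n > s.len) {
        return 0;
    }
    for (i = 0; i + n <= s.len; i++) {
        if (memcmp(s.ptr + i, lit, n) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With `whole = "foo=bar;baz=qux"`, a slice `{ whole + 4, 3 }` views the bytes `bar` without copying them, and a search for `bar;` fails because the `;` lies outside the slice.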
also you are not reachable on discord. When can / should I start integrating my compiler backend? |
Oh, sorry about this disconnect. I think I need to enable a notification option or something, I just didn't notice that there was a new message today. I'll look into options so I don't miss messages. For the compiler backend:
I'll add a priority ticket to quote this TODO |
how does foreach work? |
I am going to rewrite the C backend in LSTS after finishing cleaning up Core. This new backend should only use stable API interfaces, so that will help keep me honest about what interfaces need to be stabilized. Currently there is information going back and forth between Core and the C Backend. That needs to be kept separate or reentrant. The goal with this rewrite is
|
I need to rewrite for-each so that it doesn't depend on Opinions or ideas on how to write |
you could do something like:
but that is a bit stupid because of mutability |
My main opinion on iterators is that they shouldn't pin to an implementation. No explicit Vector. No explicit List etc. The core point of specialization is that it allows you to write algorithms using only the core parts explicitly and the rest can be inferred.
x is a left-hand side, y is iterable, and z is some expression. The core design question that is still up in the air is "what does it mean for an object to be iterable?" Does it need to be specifically a vector or a list? Why can't the loop accept both? My first thought was head / tail functions, but that means two separate methods. |
any objections? |
This interface is kinda cool because the .next function can be stateless. Random number generator? Sure you can iterate over that. |
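The proposed interface itself is not shown in this thread, so the following is only one possible reading of a "stateless" next function, sketched in C rather than LSTS: the step function is pure, takes the current state as an argument, and returns the item together with the next state. All names here (`step`, `next_fn`, `countdown_next`, `lcg_next`, `take_sum`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Result of one iteration step; has_item is 0 once the sequence is exhausted. */
typedef struct {
    int has_item;
    unsigned long item;
    unsigned long next_state;
} step;

/* A step function is pure: same state in, same step out. */
typedef step (*next_fn)(unsigned long state);

/* Finite sequence: counts down from `state` to 1. */
static step countdown_next(unsigned long state)
{
    step r;
    r.has_item = (state != 0);
    r.item = state;
    r.next_state = (state != 0) ? state - 1 : 0;
    return r;
}

/* Infinite sequence: a linear congruential random number generator. */
static step lcg_next(unsigned long state)
{
    step r;
    r.has_item = 1;
    r.next_state = (state * 1103515245UL + 12345UL) & 0x7fffffffUL;
    r.item = r.next_state;
    return r;
}

/* Generic consumer: sums at most max_items items from any sequence.
   It never needs to know whether the source is a list, a vector, or an RNG. */
static unsigned long take_sum(next_fn next, unsigned long state, int max_items)
{
    unsigned long sum = 0;
    int i;

    assert(next != NULL);
    for (i = 0; i < max_items; i++) {
        step r = next(state);
        if (!r.has_item) {
            break;
        }
        sum += r.item;
        state = r.next_state;
    }
    return sum;
}
```

Because the loop owns the state and the step function never mutates anything, the same consumer works for a finite countdown and for an endless generator, which is the property the comment above points at.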
I fixed all known bugs in the regex engine and added some tests |
wait do we need named capture groups? |
No, not right now. |
Reasons:
Proposal:
Use tpre to compile the regex to a byte array at compile time, and at runtime you only need 10 lines of code to execute the regex match.
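To make the idea of "regex as a byte array plus a tiny runtime loop" concrete, here is a minimal sketch of a backtracking bytecode matcher. The opcode set, the encoding, and the names `run`, `bc_match`, and `AB_STAR_C` are assumptions made up for this example. This is not tpre's actual bytecode format or executor, and a real engine would avoid the worst-case exponential backtracking used here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; jump targets are absolute byte offsets. */
enum { OP_CHAR, OP_ANY, OP_SPLIT, OP_JMP, OP_MATCH };

/* Precompiled program for the anchored pattern ab*c. */
static const unsigned char AB_STAR_C[] = {
    OP_CHAR, 'a',      /*  0: match 'a'                    */
    OP_SPLIT, 5, 9,    /*  2: try offset 5, else offset 9  */
    OP_CHAR, 'b',      /*  5: match 'b'                    */
    OP_JMP, 2,         /*  7: loop back to the split       */
    OP_CHAR, 'c',      /*  9: match 'c'                    */
    OP_MATCH           /* 11: succeed at end of input      */
};

/* Executes the program from `pc` against s[pos..len). */
static int run(const unsigned char *prog, size_t pc,
               const char *s, size_t len, size_t pos)
{
    for (;;) {
        switch (prog[pc]) {
        case OP_CHAR:
            if (pos >= len || (unsigned char)s[pos] != prog[pc + 1]) return 0;
            pc += 2; pos++;
            break;
        case OP_ANY:
            if (pos >= len) return 0;
            pc += 1; pos++;
            break;
        case OP_JMP:
            pc = prog[pc + 1];
            break;
        case OP_SPLIT:
            /* Try the first branch; on failure continue with the second. */
            if (run(prog, prog[pc + 1], s, len, pos)) return 1;
            pc = prog[pc + 2];
            break;
        case OP_MATCH:
            return pos == len;
        default:
            return 0;
        }
    }
}

/* Whole-input match of a pointer + length slice (no terminator needed). */
static int bc_match(const unsigned char *prog, const char *s, size_t len)
{
    assert(prog != NULL && s != NULL);
    return run(prog, 0, s, len, 0);
}
```

Since the matcher takes an explicit length, `bc_match(AB_STAR_C, "abbcXYZ", 4)` matches the first four bytes only and returns 1, while passing the full length 7 returns 0.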
When tpre cannot compile a regex pattern (which only happens for very rare regex patterns), it should fall back to the stdlib's regex.h
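A rough sketch of what such a fallback could look like, using the POSIX `regcomp` / `regexec` / `regfree` functions from regex.h. The fast path is modelled as a hypothetical callback (`fast_match_fn`) that returns -1 when it does not support a pattern; that convention and the names `match_with_fallback`, `posix_match`, and `always_unsupported` are assumptions for this example, not tpre's interface.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical fast-path hook: 1 = match, 0 = no match, -1 = unsupported. */
typedef int (*fast_match_fn)(const char *pattern, const char *text);

/* Slow path: POSIX extended regex. Returns 1, 0, or -1 on compile error. */
static int posix_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return -1;
    }
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* Uses the fast engine when it supports the pattern, regex.h otherwise. */
static int match_with_fallback(fast_match_fn fast,
                               const char *pattern, const char *text)
{
    int r;

    assert(fast != NULL);
    r = fast(pattern, text);
    if (r >= 0) {
        return r;
    }
    return posix_match(pattern, text);
}

/* Stand-in fast path for the demo: claims every pattern is unsupported. */
static int always_unsupported(const char *pattern, const char *text)
{
    (void)pattern;
    (void)text;
    return -1;
}
```

Note that POSIX `regexec` expects a null-terminated string, so on the fallback path a non-null-terminated slice would have to be copied first.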
tpre is at least twice as fast as un-jitted PCRE2