Opportunistic zero copy parsing for escaped_transform using `Cow` #543

VorpalBlade · 2024-06-15T11:56:13Z

Please complete the following tasks

I have searched the discussions
I have searched the open and rejected issues

winnow version

0.6.13

Describe your use case

Something like this works:

/// Unquoted string value
fn unquoted_string_with_escapes(i: &mut &str) -> PResult<String> {
    escaped_transform(take_till(1.., [' ', '\t', '\n', '\r', '\\']), '\\', escapes).parse_next(i)
}

fn escapes<'input>(i: &mut &'input str) -> PResult<&'input str> {
    alt((
        "n".value("\n"),
        "r".value("\r"),
        "t".value("\t"),
        " ".value(" "),
        "\"".value("\""),
        "\\".value("\\"),
    ))
    .parse_next(i)
}

But this doesn't compile:

fn unquoted_string_with_escapes<'input>(i: &mut &'input str) -> PResult<Cow<'input, str>> {
    escaped_transform(take_till(1.., [' ', '\t', '\n', '\r', '\\']), '\\', escapes).parse_next(i)
}

fn escapes<'input>(i: &mut &'input str) -> PResult<&'input str> {
    alt((
        "n".value("\n"),
        "r".value("\r"),
        "t".value("\t"),
        " ".value(" "),
        "\"".value("\""),
        "\\".value("\\"),
    ))
    .parse_next(i)
}

Describe the solution you'd like

It would be nice if escaped_transform could have zero copy parsing when the string happens to contain no escapes (which is the common case for my use case).

Alternatives, if applicable

I could try to write my own parser from scratch instead of using escaped_transform and make it support this. I would rather not and also it would be nice to have something that is reusable and available for others too.

Additional Context

No response

epage · 2024-06-20T01:27:07Z

This would require having more knowledge of the underlying types than is currently supported. At minimum its a breaking change. I suspect this would require a new trait which I'm hesitant to add for this one specific parser.

My own biases also lean towards not addressing this. When I'm prototyping, I use whatever gets the job done. If I'm writing a parser for a grammar, I implement everything myself so there is a clear connection between the grammar and my code (example). This makes me see escaped_transform as more of a tool for prototyping than one for performant production code.

VorpalBlade · 2024-06-26T20:48:45Z

So I had a bit of a think about your response. And I don't think I agree. To me it seems that functionality should be as fast as it can possible be. You shouldn't have to reinvent the wheel to be fast in Rust.

Unfortunately that would (if it needs to know the underlying type more precisely) currently require specialisation as I understand it. That would indeed be a blocker for making escaped_transform do this. Otherwise it could perhaps be a separate function escaped_transform_str?

epage · 2024-06-27T00:15:20Z

Yes, in general winnow aims for speed. However, it isn't our highest priority. For example, we likely could be made even faster with GATs, like chumsky and nom v8 use, but they run counter to more important principles, like being a toolbox instead of a framework and having the built-in parsers be written in a way similar to how users would.

We have a general implementation of escaped_transform. It isn't the fastest for all cases but it supports a large variety of cases. That leaves us with

Narrow supported cases
Have a second implementation
Do nothing
Remove it completely

For me, it feels too specialized to have a second implementation (and, in general, I've been trying to avoid that).

I also feel that the current version has the right ergonomics for most users, so I would prefer not to narrow supported cases.

I'm assuming people won't want it removed.

That leaves "do nothing".

VorpalBlade added the C-enhancement Category: Raise on the bar on expectations label Jun 15, 2024

VorpalBlade changed the title ~~Opportunistic zero copy parsing for escaped_transform~~ Opportunistic zero copy parsing for escaped_transform using Cow Jun 15, 2024

epage added A-combinator Area: combinators M-breaking-change Meta: Implementing or merging this will introduce a breaking change. labels Jun 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Opportunistic zero copy parsing for escaped_transform using `Cow` #543

Opportunistic zero copy parsing for escaped_transform using `Cow` #543

VorpalBlade commented Jun 15, 2024

epage commented Jun 20, 2024

VorpalBlade commented Jun 26, 2024

epage commented Jun 27, 2024

Opportunistic zero copy parsing for escaped_transform using Cow #543

Opportunistic zero copy parsing for escaped_transform using Cow #543

Comments

VorpalBlade commented Jun 15, 2024

Please complete the following tasks

winnow version

Describe your use case

Describe the solution you'd like

Alternatives, if applicable

Additional Context

epage commented Jun 20, 2024

VorpalBlade commented Jun 26, 2024

epage commented Jun 27, 2024

Opportunistic zero copy parsing for escaped_transform using `Cow` #543

Opportunistic zero copy parsing for escaped_transform using `Cow` #543