Joop Leo's optimization? #24

orlp · 2018-12-22T12:53:30Z

It is stated that YAEP doesn't use Joop Leo's optimization, and that right recursion is fast anyways. And because of the lookahead I do believe this to be true for some grammars, but not all LR(k) grammars. For example, the following LR(2) grammar:

S -> A 'a' 'b'
A -> 'a' A
A ->

Generates increasingly large set cores when parsing "aaa...aaab", leading to (at least) quadratic behavior.

Perhaps an optimization similar to Joop Leo's could be made, making YAEP linear time for all LR(k) grammars?

The text was updated successfully, but these errors were encountered:

vnmakarov · 2018-12-25T02:54:47Z

Thank you for your feedback.

Yes, in brief, I consider implementation of Joop Leo's optimization
or something similar.

In fact, I believe Joop Leo's work is more important that one
described in Practical Earley Parser article. Saying that I don't
know when I actually will start to work on it and when it could be
ready. It is not a high priority project for me.

I tried to implement an Earley Parser which could be useful for
programming languages implementation. Therefore I spend more efforts
on implementing simple direct translation and a good error recovery. I
also don't believe that LR(k) parsers (where k > 1) are useful for such
tasks. Usually, if a programming language grammar is not LR(1), it is
also not LR(k) k > 2 (e.g. C or C++).

But still it would be nice to parse LR(k > 1)-language by Earley
parser in linear time.

Although I wanted to implement my Earley parser as a swiss knife for
parser implementations, I found that its usage requires a lot of
expertise for non-trivial parsers. It is easy to write an ambiguous
grammar (e.g. during modification of an original grammar) when you
have not intention to do this.

By the way, I have also yacc-like parser MSTA
(https://github.com/dino-lang/dino/tree/master/MSTA) which can
generates effective LR(k), k > 1 parsers. This compiler-compiler does
a lot optimizations, like extracting LALR and regular parts of grammar
and implements parsing of them more effectively. Deterministic parser
generators are more safe for the problem I mentioned above. So I'd
prefer to use MSTA than my Earley parser for a typical programming
language implementation.

But even more I prefer to write manual parsers with backtracking if
necessary because I like to use original grammars from the programming
language definition. Other approaches require a lot of grammar
modifications (e.g. original Javascript grammar is actually a grammar
generator or can be considered as context sensitive one).

andres-erbsen mentioned this issue Mar 10, 2019

fail fast when ambiguity is detected? #25

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Joop Leo's optimization? #24

Joop Leo's optimization? #24

orlp commented Dec 22, 2018

vnmakarov commented Dec 25, 2018

Joop Leo's optimization? #24

Joop Leo's optimization? #24

Comments

orlp commented Dec 22, 2018

vnmakarov commented Dec 25, 2018