Left-recursion allowed in circumstances that will always lead to a stack overflow. #54

otac0n · 2014-09-21T17:36:39Z

expr -memoize
  = #STATE{ state["any"] = 1; } expr "a"
  / "b"

Since this changes the state of the cursor just before recursing into expr, and since memoization is used as an implementation detail of left-recursion, this alters the state of the left-recursion algorithm.

That's bad.

Basically, any left recursive call that can be preceded by a state change is a no-no.

Need to add a detector and a compiler error for this.

The text was updated successfully, but these errors were encountered:

otac0n · 2014-10-12T08:12:46Z

I'm not sure if this is strictly computable, but my hunch is that it is.

Here's my half-baked algorithm for figuring this out:

Build a control flow graph of the entire grammar.
Find all nodes relating to #STATE expressions in the graph.
Find all succeeding nodes from those state nodes.
Mark these nodes as being reached by the aforementioned #STATE expression.
Repeat the following process until no reachable nodes can be added:
1. Find any zero width node that is reached by a #STATE expression
2. Find the succeeding nodes from those nodes
3. Mark these nodes as being reached by the #STATE expressions
For each left-recursive rule
1. Find all paths* through the control flow graph that reach a name node referring to the rule.
2. Filter these paths to just those that contain only zero-width expressions
3. Filter these paths to just those that end with a node that is reachable by any #STATE expression
4. The remaining paths are guaranteed to stack overflow (unless the programmer adds code expressions to prevent it).

The definition of *all paths above is dubious, since there could easily be infinitely many paths through the graph. It may be sufficient (since we are talking about zero-width expressions) to use all strongly connected zero-width paths that touch a rule starting node and touch a name expression for that same rule.

In any case, this should only ever be a compiler warning, since code expressions can technically avoid it, but the workbench needs to be hardened against stack overflows.

otac0n · 2014-10-12T08:14:56Z

Oh yeah, and here is possibly the most mind-bending error case:

expr -memoize
  = expr expr "foo"
  / #STATE{ state["bar"] = 1; }

otac0n added the bug label Sep 21, 2014

otac0n self-assigned this Sep 21, 2014

otac0n added this to the 3.1.1 milestone Sep 21, 2014

otac0n modified the milestones: 3.1.1, 3.1.2 Mar 1, 2015

otac0n removed this from the 3.1.2 milestone Aug 27, 2015

otac0n removed their assignment Sep 4, 2015

otac0n added this to the Unscheduled Non-breaking Change milestone Jul 25, 2016

StephenCleary mentioned this issue Apr 24, 2022

State StephenCleary/PegLeg#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Left-recursion allowed in circumstances that will always lead to a stack overflow. #54

Left-recursion allowed in circumstances that will always lead to a stack overflow. #54

otac0n commented Sep 21, 2014

otac0n commented Oct 12, 2014 •

edited

Loading

otac0n commented Oct 12, 2014 •

edited

Loading

Left-recursion allowed in circumstances that will always lead to a stack overflow. #54

Left-recursion allowed in circumstances that will always lead to a stack overflow. #54

Comments

otac0n commented Sep 21, 2014

otac0n commented Oct 12, 2014 • edited Loading

otac0n commented Oct 12, 2014 • edited Loading

otac0n commented Oct 12, 2014 •

edited

Loading

otac0n commented Oct 12, 2014 •

edited

Loading