Masala-Parser (like Parsec) is a top-down parser and doesn't like Left Recursion. Furthemore, building naively your combinator parser with recursion would make a stack overflow before parsing
const {Streams, F, N, C, X} = require('@masala/parser');
function A(){
return C.char('A').then(B());
}
function B(){
return C.char('B').or(A()); // <----- will call A() then B() then A()
}
console.log('=== Building Parser ====');
const parser = A();
console.log('=== NEVER THERE ====');
let parsing = parser.parse(Streams.ofString('AAAAAB'));
Masala-Parser Comes with a F.lazy( P )
function that will end the loop while building. It builds the combinator, but P will
be called only while parsing the stream.
function A(){
return C.char('A').then(B());
}
function B(){
return C.char('B').or(F.lazy(A)); <--- A() is not called yet !
}
console.log('=== Building Parser ====');
const parser = A();
console.log('=== GETTING THERE ====');
let parsing = parser.parse(Streams.ofString('AAAAAB')); // Accepted :)
Operation are logically left-recursive: Let say an expression is 100
or (20*5)
, which is also ((4*5)*5)
,
or (((2*2)*5)*5)
....
There is a general algorithm that removes Left recursion.
It can be written it his PEG form:
E -> T E'
E' -> + TE' | eps
T -> F T'
T' -> * FT' | eps
F -> NUMBER | ID | ( E )
Where:
- E is an Expression
- F is a terminal expression (Final), or a start for recursion
- E' and T' are optional operation made on F
- eps is an empty character
Note that there are two distinct level of precedence, as +
and *
doesn't have the same level of priority.
Using only one level of operator, we simplify to:
T -> F T'
T' -> operator FT' | eps
F -> NUMBER | ( T )
Which can be translated in pseudo masala:
expr -> terminal then subExpr
subExpr -> (operator then terminal then subExpr ).opt()
terminal -> NUMBER or ( F.lazy(expr) )
It appears that it's simplified with Masala, as you won't have to create a subExpression per operator. Just call
const operator = ()=> C.charIn('+-*/');