Grammar for lookahead matching backwards when parsing a play #129
from arpeggio.cleanpeg import ParserPEG
grammar = r"""
grammar = line_end* scene+ EOF
scene = scene_start action+ scene_end
scene_start = "SCENE" number+ "BEGIN" line_end
scene_end= "END SCENE" line_end*
action = ( background/disappears/appears/dialogue ) line_end+
background = "BG:" caps_sentence
appears = caps_sentence &"APPEARS:"
disappears = caps_sentence &"DISAPPEARS:"
dialogue = caps_sentence ":" text+
caps_sentence = caps_word+
caps_word = caps/number/punctuation
text = punctuation/lower/caps/number
punctuation = r"[!?.,'’…^]"
number = r"[0-9]"
caps = r"[A-Z]"
lower = r"[a-z]"
line_end = "\n"
"""
if __name__ == "__main__":
    kwargs = {"ws": "\t\r ", "debug": True}
    input_text = """
SCENE 01 BEGIN
BG: SOME BACKGROUND
CHARACTER APPEARS:
CHARACTER: Dramatic dialogue!
CHARACTER DISAPPEARS:
END SCENE
""".strip()
    ParserPEG(grammar, "grammar", **kwargs).parse(input_text)

I am trying to parse a play using Arpeggio and can't seem to get the lookahead working satisfactorily. Currently I get the error ... which occurs because the ...
Replies: 1 comment 1 reply
Syntactic predicates (& and !) are non-consuming. You need to consume those APPEARS and DISAPPEARS keywords, that is, match them as ordinary string literals in appears and disappears instead of putting & in front of them.

But that brings another problem: caps_sentence will eat those keywords. So to prevent it from eating APPEARS and DISAPPEARS we use negative lookahead. The full working example is (notice the use of !keywords in caps):