Add log helper for parser debugging #104

Open
sirthias opened this issue Nov 10, 2014 · 8 comments

@sirthias
Owner

run(println("marker")) works, but we can do better.

Related to #105.

@lihaoyi

lihaoyi commented Nov 17, 2014

I just spent 9 hours on an airplane debugging parboiled2 parsers (trying to parse Scala source). Here's what I came up with:

run(println("Hello Before!")) ~ (...) ~ run(println("Hello After!"))

capture(...) ~> ((x: String) => println("Hello! " + x))

Short forms for these two would be very nice. It would be great if we could define them ourselves (meta-rules?) because the way I debug things varies depending on what's going wrong.

FWIW, formatTraces and the other formatXXX methods are pretty useless from what I've found. Most errors I'm getting involve on the order of 100 to 600 rules being applied. For example, here's the last one I had:

668 rules mismatched at error location:
...several dozen pages of traces...

Being able to log as well as scala-parser-combinators would be really nice. That means being able to do log(...) and getting:

  • Printouts when the rule is tried, and where e.g. the remainder of the line, or next 20 characters, something to give a hint of where the parser is
  • Printouts when the rule fails, with some kind of indication why it failed
  • Printouts when it succeeds, with the text that's captured.
  • Being able to indent the printouts, so you can see which rules are happening "inside" another rule visually, would be really nice! This isn't available by default with scala-parser-combinators, but is trivial to add because of the way you can define higher-order-parsers.
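To make the wish list above concrete, here is a minimal sketch of such a higher-order logging wrapper. It deliberately uses a toy parser type instead of parboiled2 or scala-parser-combinators, so every name in it (P, literal, seq, or, log, example) is hypothetical and not part of either library. It only illustrates the idea: report when a rule is tried, whether it matched or failed, and indent by nesting depth.

```scala
object LoggedParsers {
  // Toy parser type (an assumption for this sketch): given the input and an
  // offset, return Some((value, newOffset)) on success or None on failure.
  type P[A] = (String, Int) => Option[(A, Int)]

  private var depth = 0          // current nesting level of logged parsers
  val out = new StringBuilder    // log sink; a real logger could print instead

  def literal(s: String): P[String] =
    (in, i) => if (in.startsWith(s, i)) Some((s, i + s.length)) else None

  def seq[A, B](a: P[A], b: P[B]): P[(A, B)] =
    (in, i) => a(in, i).flatMap { case (x, j) => b(in, j).map { case (y, k) => ((x, y), k) } }

  def or[A](a: P[A], b: P[A]): P[A] =
    (in, i) => a(in, i).orElse(b(in, i))

  // Higher-order wrapper: logs "tried", then "matched" or "failed", indented.
  def log[A](name: String)(p: P[A]): P[A] = (in, i) => {
    val pad = "  " * depth
    out.append(s"${pad}? $name at $i: '${in.slice(i, i + 20)}'\n")
    depth += 1
    val res = try p(in, i) finally depth -= 1
    res match {
      case Some((_, j)) => out.append(s"${pad}+ $name matched '${in.substring(i, j)}'\n")
      case None         => out.append(s"${pad}- $name failed at $i\n")
    }
    res
  }

  // Example grammar: ("hello" | "hi") followed by " bob", with logging at each level.
  val example: P[(String, String)] =
    log("greeting")(seq(log("hello")(or(literal("hello"), literal("hi"))), log("name")(literal(" bob"))))
}
```

Calling LoggedParsers.example("hi bob", 0) and then reading LoggedParsers.out gives one "?" line per rule tried and one "+" or "-" line per outcome, with inner rules indented two spaces under the rule that invoked them.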

There are a lot of possible configurations you'd want for your logger, because the amount of verbosity that you have to cut through varies greatly depending on what you're doing. I don't think a one-size-fits-all solution would work.

And of course using cuts would be nice to kill the useless backtrack-and-retrying and narrow the number of rules being mismatched at that location =P

@sirthias
Owner Author

Ok, thanks!
A few of the things you want you can already do today.

Printouts when the rule is tried, and where e.g. the remainder of the line, or next 20 characters, something to give a hint of where the parser is

// Print a log message (with line/column and a caret) before trying the next rule.
// Assumes `import scala.annotation.tailrec` and parboiled2's `Position` are in scope.
@tailrec
private def position(i: Int = 0, line: Int = 1, col: Int = 1): Position =
  if (i >= cursor) Position(i, line, col)
  else if (i >= input.length || input.charAt(i) != '\n') position(i + 1, line, col + 1)
  else position(i + 1, line + 1, 1)

def log(marker: String) = rule {
  run {
    val Position(_, line, col) = position() // scan from the start of the input up to the cursor
    println(s"$marker at pos $cursor (line $line, col $col)")
    println((input getLine line) + '\n' + (" " * (col - 1) + '^'))
  }
}
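The line/column scan used by this log rule can be tried on its own, without parboiled2. The following is a minimal sketch under stated assumptions: Pos is a hypothetical stand-in for parboiled2's Position, positionOf and caret are illustrative names, and the scan always starts at index 0 and walks forward to the cursor.

```scala
import scala.annotation.tailrec

object CaretDemo {
  // Hypothetical stand-in for parboiled2's Position, so this runs standalone.
  final case class Pos(index: Int, line: Int, col: Int)

  // Walk from the start of the input to `cursor`, counting lines and columns
  // (both 1-based). A '\n' ends the current line and resets the column.
  def positionOf(input: String, cursor: Int): Pos = {
    @tailrec
    def loop(i: Int, line: Int, col: Int): Pos =
      if (i >= cursor) Pos(i, line, col)
      else if (i >= input.length || input.charAt(i) != '\n') loop(i + 1, line, col + 1)
      else loop(i + 1, line + 1, 1)
    loop(0, 1, 1)
  }

  // Render the line containing `cursor` with a caret under the cursor column.
  def caret(input: String, cursor: Int): String = {
    val Pos(_, line, col) = positionOf(input, cursor)
    val text = input.split("\n", -1)(line - 1) // limit -1 keeps trailing empty lines
    text + "\n" + (" " * (col - 1)) + "^"
  }
}
```

For the input "ab\ncd" and cursor 4, positionOf yields line 2, column 2, and caret returns the line "cd" followed by a caret under the "d".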

Printouts when the rule fails, with some kind of indication why it failed

If foo is the rule to test, then (foo | log("foo failed")) prints only when foo fails.
Since you usually don't care about performance during debugging, you could even move to "true" meta rules like this:

def logFailing[I <: HList, O <: HList](marker: String)(rule: => Rule[I, O]): Rule[I, O] =
  rule | log(marker)

If this doesn't compile yet (I haven't tried), we should make it work. The parser will instantiate a function instance every time it comes across a logFailing wrapper, but for debugging this should not be a problem.

The "indication why it failed" is harder to do. Ideally you'd want something like local error reporting that gives you all the error traces underneath the wrapped rule. That's solvable and certainly valuable.
Just added a ticket for that: #107

Printouts when it succeeds, with the text that's captured.

That's easy to write right now. Just write a rule that prints the top value stack element. (e.g. using valueStack.peek).

Being able to indent the printouts, ...

The parser runs as regular method invocations, so you can get an indication of how deep you are in the grammar by simply looking at the current JVM stack depth.
Something like this might help:

val stackTrace = Thread.currentThread().getStackTrace
val ruleDepth = stackTrace.length - stackTrace.indexWhere(_.getMethodName contains "runRule")
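As a standalone illustration of using JVM stack depth for indentation, here is a minimal sketch that does not depend on parboiled2. The names stackDepth and nested are hypothetical, and nested merely stands in for rule methods calling each other. Depth is measured relative to a baseline taken by the caller, which is an assumption of this sketch rather than the "runRule" lookup shown above.

```scala
object StackDepthIndent {
  // Number of frames currently on this thread's JVM stack.
  def stackDepth(): Int = Thread.currentThread().getStackTrace.length

  // Hypothetical stand-in for nested rule invocations: each level is one more
  // non-tail method call, so the stack grows by exactly one frame per level.
  // Each level appends an indented line to `log` and the deepest depth is returned.
  def nested(levels: Int, baseline: Int, log: StringBuilder): Int = {
    val depth = stackDepth() - baseline
    log.append("  " * depth).append(s"depth $depth\n")
    if (levels == 0) depth
    else math.max(depth, nested(levels - 1, baseline, log))
  }
}
```

Taking the baseline with stackDepth() at the call site and then calling nested(3, baseline, log) appends four lines, each indented one step further than the previous one. The absolute numbers depend on where the baseline was taken, so only the differences between levels are meaningful.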

@lihaoyi

lihaoyi commented Dec 5, 2014

In general, I couldn't get anything like this

rule: => Rule[I, O]

to work in the general case; it only worked when the things I was passing in were trivial (e.g. a single named parser). It would be great if it worked in the general case where, e.g., you have ~s and |s and zeroOrMores inside.

@sirthias
Owner Author

sirthias commented Dec 5, 2014

I don't think there is a general reason why call-by-name rule passing shouldn't work.
Can you open a ticket for it?

@lihaoyi

lihaoyi commented Dec 5, 2014

Sure lemme try getting a minimal repro

@lihaoyi

lihaoyi commented Dec 5, 2014

Done #116

@sirthias
Owner Author

sirthias commented Dec 5, 2014

Great! Thanks!

@thomassuckow

For what it's worth:

@tailrec
private def position(i: Int = 0, line: Int = 1, col: Int = 1): Position =
  if (i >= cursor) Position(i, line, col)
  else if (i >= input.length || input.charAt(i) != '\n') position(i + 1, line, col + 1)
  else position(i + 1, line + 1, 1)

def log(marker: String) = rule {
  atomic("") ~> (() => {
    val Position(_, line, col) = position(0)
    println(s"$marker at pos $cursor (line $line, col $col)")
    println((input getLine line) + '\n' + (" " * (col - 1) + '^'))
  })
}

"ID" ~ WS ~ Word ~ log("Foo") ~ WS ~ Word

Foo at pos 41 (line 3, col 11)
ID abababa babababa
          ^
