diff --git a/content/courses/hoon-school/Q2-parsing.md b/content/courses/hoon-school/Q2-parsing.md index 9d31c9a..e4e8a66 100644 --- a/content/courses/hoon-school/Q2-parsing.md +++ b/content/courses/hoon-school/Q2-parsing.md @@ -39,9 +39,9 @@ we have to wade directly into a sea of new types and processes. To wit: - A {% tooltip label="tape" href="/glossary/tape" /%} is the string to be parsed. - A `hair` is the position in the text the parser is at, as a cell of - column & line, `[p=@ud q=@ud]`. + line & column, `[p=@ud q=@ud]`. - A `nail` is parser input, a cell of `hair` and `tape`. -- An `edge` is parser output, a cell of `hair` and a `unit` of `hair` +- An `edge` is parser output, a pair of a `hair` and a unit containing a pair of the result and a `nail` and `nail`. (There are some subtleties around failure-to-parse here that we'll defer a moment.) - A `rule` is a parser, a gate which applies a `nail` to yield an diff --git a/content/language/hoon/guides/parsing.md b/content/language/hoon/guides/parsing.md index e956132..0066713 100644 --- a/content/language/hoon/guides/parsing.md +++ b/content/language/hoon/guides/parsing.md @@ -100,7 +100,7 @@ A `hair` is a pair of `@ud` used to keep track of what has already been parsed for stack tracing purposes. This allows the parser to reveal where the problem is in case it hits something unexpected during parsing. -`p` represents the column and `q` represents the line. +`p` represents the line and `q` represents the column. ### `nail` @@ -134,9 +134,9 @@ of the original input `tape `up to which the text has been parsed. If parsing failed, `p` will be the first `hair` at which parsing failed. `q` may be `~`, indicating that parsing has failed . -If parsing did not fail, `p.q` is the data structure that is the result of the -parse up to this point, while `q.q` is the `nail` which contains the remainder -of what is to be parsed. If `q` is not null, `p` and `p.q.q` are identical. +If parsing did not fail, `p.u.q` is the data structure that is the result of the +parse up to this point, while `q.u.q` is the `nail` which contains the remainder +of what is to be parsed. If `q` is not null, `p` and `p.q.u.q` are identical. ### `rule` @@ -171,9 +171,9 @@ of the input `nail`. We note that `p.edg` is `[p=1 q=2]`, indicating that the next character to be parsed is in line 1, column 2. `q.edg` is not null, indicating that parsing -succeeded. `p.q.edg` is `'a'`, which is the result of the parse. `p.q.q.edg` is the same as `p.edg`, which is always the case for +succeeded. `p.u.q.edg` is `'a'`, which is the result of the parse. `p.q.u.q.edg` is the same as `p.edg`, which is always the case for `rule`s built using standard library functions when parsing succeeds. Lastly, -`q.q.edg` is `"bc"`, which is the part of the input `tape` that has yet to be parsed. +`q.q.u.q.edg` is `"bc"`, which is the part of the input `tape` that has yet to be parsed. Now let's see what happens when parsing fails. @@ -208,7 +208,7 @@ Let's see what happens when we successfully parse the entire input `tape`. line 1, column 4. Of course, this does not exist since the input `tape` was only 3 characters long, so this actually indicates that the entire `tape` has been successfully parsed (since the `hair` does not advance in the case of failure). -`p.q.edg` is `'abc'`, as expected. `q.q.edg` is `""`, indicating that nothing +`p.u.q.edg` is `'abc'`, as expected. `q.q.u.q.edg` is `""`, indicating that nothing remains to be parsed. What happens if we only match some of the input `tape`? @@ -219,10 +219,10 @@ What happens if we only match some of the input `tape`? [p=[p=1 q=3] q=[~ [p='ab' q=[p=[p=1 q=3] q="c"]]]] ``` -Now we have that the result, `p.q.edg`, is `'ab'`, while the remainder `q.q.q.edg` +Now we have that the result, `p.u.q.edg`, is `'ab'`, while the remainder `q.q.u.q.edg` is `"c"`. So `+jest` has successfully parsed the first two characters, while the last character remains. Furthermore, we still have the information that the -remaining character was in line 1 column 3 from `p.edg` and `p.q.q.edg`. +remaining character was in line 1 column 3 from `p.edg` and `p.q.u.q.edg`. What happens when `+jest` fails?