Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: interpreter (faster evaluator-to-normal-form) #1187

Merged
merged 7 commits into from
Dec 5, 2023
Merged

Conversation

brprice
Copy link
Contributor

@brprice brprice commented Nov 29, 2023

The one feature of the old stepwise evaluator that we don't replicate is the ability not to evaluate under lambdas.
This may be annoying if this new interpreter is hooked up to a frontend, as then any evaluation of a recursive function will diverge, even if an evaluation of that same function applied to an argument may succeed. The possible mitigations for this include

  • only use this new interpreter for things of "base" type, i.e. things which have a type such that we expect them not to contain un-applied lambdas (note that if they contain holes, this expectation may be false, so a refinement would be "base type and not contain holes"). This is not quite "not an arrow/forall type", as we need to detect things like Maybe (Bool -> Char), or a return type of T where data T = C (Bool -> Char).
  • revisit this new interpreter to avoid the hard requirement that it evaluates under lambdas. This can be done by stepping outside of the AST domain and representing the interpretation of a lambda as a closure (a pair of the current environment and the body), and then applications will require the function-position to be a closure rather than a syntactic lambda. (This may also be useful if we want to make Expr spine-strict, since we need to step outside of the AST domain to keep laziness in that case anyway.)

When working on this PR, at one point we had an explicit
"fuel" counter so that the interpreter would not loop on a divergent term.
This has been replaced (and rebased) with an IO-based System.Timeout call,
but the old state is still in the history for https://github.com/hackworthltd/primer/tree/brprice/wip/interp.

The reason we changed track here is

  • We need to lazily produce the resultant AST, since the semantics of
    Primer (the object language) are lazy
    (in particular, code like head (letrec l = Cons True l in l) should
    reduce to True (ignoring type annotations)
  • This means we can't use ExceptT (i.e. Either) to signal the
    "recursion depth exceeded" error, since to pattern match on the (root
    of the) AST of a recursive call would force the Either, and thus
    divergent subterms would throw a Left, aborting the whole computation.
  • We can however throw an "imprecise exception" (e.g. Prelude's error,
    or Control.Exceptions throw), from pure code. This is essentially
    hiding an exception deep within a (recursively returned) AST, and will
    not actually trigger unless it is forced.
  • However, to make this testable, we need to be able to catch the
    "timeout" error in the testsuite, meaning we need to be able to run IO
    actions in Hedgehog tests, and thus need to refactor our WT monad to
    (optionally) include IO at the base.
  • Having done this, it is just as easy to not have the recursion-depth
    check and use System.Timeout instead, which also works in IO,
    and avoids some overhead due to not tracking the depth (and makes the
    interpretation of recursive let bindings nicer.

Note that (c.f. comment in code) this new interpreter relies heavily on laziness and will break if Expr is made spine-strict.

@brprice brprice force-pushed the brprice/interp branch 4 times, most recently from 5dff54f to abc3475 Compare November 30, 2023 15:14
@brprice
Copy link
Contributor Author

brprice commented Nov 30, 2023

@dhess: I'll leave some cleanup work for you to look at:

  • testsuite should be DRY'd between Interpreter and FullStep (currently I've copy-pasted the file and hacked the tests, removing some that are not super relevant).
  • enabling the last two tests about handleEvalFullRequest, as that requires essentially the same work as hooking up the interpreter to the OpenAPI.

@brprice brprice force-pushed the brprice/interp branch 2 times, most recently from 9e49321 to 9cfbd77 Compare November 30, 2023 15:19
Comment on lines +275 to +282
Ann _ (PrimCon _ c) (TCon _ ((== primConName c) -> True))
| Just (CaseBranch _ [] t) <- find ((PatPrim c ==) . caseBranchName) brs -> interp' tydefs env Chk t
| CaseFallback t <- fb -> interp' tydefs env Chk t
| otherwise -> throw $ NoBranch (Right c) $ caseBranchName <$> brs
-- literals (primitive constructors) are actually synthesisable, so may come
-- without annotations
PrimCon _ c
| Just (CaseBranch _ [] t) <- find ((PatPrim c ==) . caseBranchName) brs -> interp' tydefs env Chk t
| CaseFallback t <- fb -> interp' tydefs env Chk t
| otherwise -> throw $ NoBranch (Right c) $ caseBranchName <$> brs
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "duplication" is a bit of a worry. It is reflecting the same in the stepwise evaluator, but we should perhaps review the calculus to see if it is really needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1237

@brprice brprice changed the title WIP interpreter feat: interpreter (faster evaluator-to-normal-form) Nov 30, 2023
@brprice brprice marked this pull request as ready for review November 30, 2023 17:29
We sometimes evaluate a `case` and sometimes an `app`. These should be
evaluated with different directions. This was noticed when implementing a
more efficient interpreter which gave a different output when evaluating
the checkable `case` in a synthesis context.

Signed-off-by: Ben Price <[email protected]>
Copy link
Contributor

@georgefst georgefst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments. No code changes.

- perhaps do full NBE
- maybe don't stay in AST domain
this perhaps may help with a pure recursion-depth limit and reporting partial results?
-}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should expand on this, maybe in a GitHub issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #1194

@brprice
Copy link
Contributor Author

brprice commented Nov 30, 2023

I think this is more-or-less ready, except the testsuite run seems to fail on linux CI (OOM??). I'll need to investigate further.

Some initial investigation shows that running the CI testsuite (nix run -v -L -j2 .#'primer:test:primer-test' -- -j1) artifact with -j1 is fairly static at 700M resident memory, but this branch seems to quickly grow to multiple tens of gigabytes (even when running old tests unrelated to the interpreter). I am confused about this and will look again in the near future.

@dhess
Copy link
Member

dhess commented Nov 30, 2023

(I think the last OOM was due to a long-running earlier test that was chewing up lots of resident RAM.)

@brprice brprice added the Do not merge Do not merge label Nov 30, 2023
@brprice
Copy link
Contributor Author

brprice commented Dec 5, 2023

I think this is more-or-less ready, except the testsuite run seems to fail on linux CI (OOM??). I'll need to investigate further.

Some initial investigation shows that running the CI testsuite (nix run -v -L -j2 .#'primer:test:primer-test' -- -j1) artifact with -j1 is fairly static at 700M resident memory, but this branch seems to quickly grow to multiple tens of gigabytes (even when running old tests unrelated to the interpreter). I am confused about this and will look again in the near future.

This OOM is due to 14096bf refactor: generalise WT to be a transformer.

(A useful technique here is to run primer-test with ghc rts options +RTS -M1G to limit the heap size. This PR will abort with an exhausted heap, and main will not. git bisect run can then automate testing each commit)

@dhess
Copy link
Member

dhess commented Dec 5, 2023

WT is only used for testing, correct?

@brprice
Copy link
Contributor Author

brprice commented Dec 5, 2023

WT is only used for testing, correct?

Yes. It is used in property tests only

This is needed to be able to run `IO` actions in `PropertyWT`, which
will itself be needed when we implement a more efficient evaluator
(due to using `System.Timeout` to avoid infinite loops).

Another choice would be to generalise `WT` to be a transformer
also. However, we only actually use the (hypothetical) `WT IO` and
not any other `WT m`. Since testing showed that the extra polymorphism
caused GHC to emit much less (memory) efficient code, we will use the
monomorphic version. (I expect that sufficient use of specialization
pragmas would fix the memory leaks for a transformer version of `WT`,
but I have not verified this.)

Signed-off-by: Ben Price <[email protected]>
There is no need for this to care about the metadata's type.

Signed-off-by: Ben Price <[email protected]>
We will shortly implement a more efficient "full evaluator", and wish
to disambiguate some naming.

Signed-off-by: Ben Price <[email protected]>
This is to enable reuse in a subsequent commit.

Signed-off-by: Ben Price <[email protected]>
This does not iterate the single-step redex evaluator, but instead uses
Haskell's runtime to directly interpret primer ASTs.

Signed-off-by: Ben Price <[email protected]>
This adds code to check for alpha equality of terms

Signed-off-by: Ben Price <[email protected]>
@brprice
Copy link
Contributor Author

brprice commented Dec 5, 2023

I think this is more-or-less ready, except the testsuite run seems to fail on linux CI (OOM??). I'll need to investigate further.
Some initial investigation shows that running the CI testsuite (nix run -v -L -j2 .#'primer:test:primer-test' -- -j1) artifact with -j1 is fairly static at 700M resident memory, but this branch seems to quickly grow to multiple tens of gigabytes (even when running old tests unrelated to the interpreter). I am confused about this and will look again in the near future.

This OOM is due to 14096bf refactor: generalise WT to be a transformer.

(A useful technique here is to run primer-test with ghc rts options +RTS -M1G to limit the heap size. This PR will abort with an exhausted heap, and main will not. git bisect run can then automate testing each commit)

I have force-pushed a fix for this. Essentially we don't make WT a transformer, but just change the base monad to IO. This does still need TestM to be transformer-ified. The problem seems to be GHC emitting less-efficient code for WT being a transformer, even though we only actually use it at IO -- this could probably be fixed with the correct pragmas, but that seems somewhat pointless.

I think this PR is ready to merge now.

@dhess: note that there are two TODOs left in the code that I will leave for you in future work. They are

  • DRY the testsuites for interp and the old stepping full eval (as much as makes sense, at least)
  • enable two more tests for interp on a prog with imported modules (needs interp hooked up to the core api layer)

@brprice brprice removed the Do not merge Do not merge label Dec 5, 2023
@dhess dhess added this pull request to the merge queue Dec 5, 2023
@dhess
Copy link
Member

dhess commented Dec 5, 2023

Thanks! The speedup here will make a huge impact.

Merged via the queue into main with commit 6b8368f Dec 5, 2023
@dhess dhess deleted the brprice/interp branch December 5, 2023 20:55
@dhess
Copy link
Member

dhess commented Apr 17, 2024

@dhess: note that there are two TODOs left in the code that I will leave for you in future work. They are

  • DRY the testsuites for interp and the old stepping full eval (as much as makes sense, at least)
  • enable two more tests for interp on a prog with imported modules (needs interp hooked up to the core api layer)

Note that the first of these was (at least partially) addressed in #1238, and the second is addressed in #1241.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants