PoC: Generics and better type inference #58
Draft
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a proof-of-concept implementing generics and slightly better type inference. It's based on the paper Complete and Easy Bidirectional Typechecking for Higher-Rank Polymorphism, although I had to make some modifications because their type system is pretty different from ours. I'm not sure yet if the new type inference algorithm will never infer the wrong type, and there's missing pieces here and there. But we should be able to make this general approach to type checking/inference work.
This is what it looks like:
I didn't do subtype/supertype bounds on type parameters, but it shouldn't be terribly hard to implement with the current design.
Other things included in this PR:
any
andoneof
Things worsened by this PR:
++
requires both operands to be lists, to match how Nushell now workslet x: list<list<int>> = [["foo"]]
, you'll get 3 error messages instead of just one (one saying"foo"
should be an int, one for["foo"]
, and one for[["foo"]]
).Things left to do:
Apologies for the big PR. I should probably have done some of these things before implementing generics. That said, it's just a proof-of-concept, so it doesn't need to be reviewed thoroughly.
How type checking/inference work
The
typecheck_expr
method takes the expected type of the expression it is currently processing and provides an inferred type for every expression it visits. This way, we both check against the expected type and infer a type for the expression at the same time.For generics to work, the algorithm requires creating and solving type variables. These type variables have a lower bound and an upper bound. As we move through the program, these bounds are tightened further. At the end of the program, the lower bound of each type variable is chosen as its value.
Every time we come across a call to a custom command with type parameters, we instantiate new type variables corresponding to those type parameters. For example, for the expression
f 123
above, we instantiate a new type variable'0
with lower boundbottom
and upper boundtop
(essentially unbounded). Because of the signature off
, we know that123: '0
andf 123: list<'0>
.So we first call
typecheck_expr
to check/infer the type of123
, providing'0
as its expected type. Since it's just an integer literal, we infer the type to beint
. Then, to ensure that this type matches the expected type ('0
), we call theconstrain_subtype
method to ensure thatint <: '0
. The existing lower bound for'0
wasbottom
, so we set the new lower bound tooneof<bottom, int> = int
.Then we set the type of
f 123
tolist<'0>
. After all expressions have been processed, we replace all occurrences of'0
withint
. So this becomeslist<int>
.todo write about how recursive bounds are avoided