diff --git a/rfcs/unboxed-types.md b/rfcs/unboxed-types.md new file mode 100644 index 0000000..21accbe --- /dev/null +++ b/rfcs/unboxed-types.md @@ -0,0 +1,458 @@ +# Unboxed types for OCaml + +Suppose you define this type: + + type t = { x : Int32.t; y : Int32.t } + +Values of this type carry 8 bytes of actual data. However, in OCaml, +values of this type are represented as: + + - an 8-byte pointer + + - to a 24-byte block + (8B header + 2x 8B fields) + + - pointing at two more 24-byte blocks + (8B header + 8B `caml_int32_ops` + 4B data + 4B padding) + +Below is a proposed extension to OCaml to allow the layout of data +types to be controlled by the programmer. This proposal won't affect +the type defined above, but will allow you to define more compact +layouts. There are two main aspects. The first part, "unboxed types", +lets you write the following: + + type t = { x : #int32; y : #int32 } + +Here, the x and y fields have the unboxed type `#int32`, which +consumes only 4 bytes. Values of type `t` are still boxed: they are +represented as 8-byte pointers to a 16-byte block, consisting of an +8-byte header and two 4-byte fields. Despite containing unboxed +fields, the type t is an ordinary OCaml type: it can be passed to +normal functions, stored in normal data structures, and so on. + +The second part, "unboxed records", additionally lets you write: + + type u = #{ x : #int32; y : #int32 } + +This is an unboxed record of unboxed fields. Values of this type are +represented as a pair of 32-bit integers. No allocations are made when +creating values of this type. If one of these is passed to a function, +it is passed in two registers. + +The price is that unboxed records are more awkward to use: you cannot +have, for example, a `u list`. + +## Layouts + +The central issue with adding unboxed types to OCaml is that +operations that used to work for all possible types don't necessarily +work for unboxed types. Here's an example: + + let diag x = (x, x) + +This function is polymorphic, having type: + + val diag : 'a -> 'a * 'a + +However, if you try to use it with `'a = #int32`, then you end up +creating a tuple containing unboxed 32-bit integers where the garbage +collector expected tagged or boxed values. Segfaults ensue. + +So, unboxed types mean that polymorphic definitions need to be +polymorphic over only a subset of the possible types (say, the boxed +ones). We do this by introducing *layouts*, which classify types in +the way that types classify values. (The technical term for such +things is *kinds*, but we're avoiding that word because (a) it's +unfamiliar to programmers who aren't type theorists and (b) it doesn't +describe what we're using kinds for). The layout of a type determines +how many bytes of memory values of that type consume, how many and +what kind of registers are needed to pass such values to functions, +and whether locations holding such values need to be registered with +the GC. + +Once we have layouts, the type `'a -> 'a * 'a` is a short form of: + + val diag : ('a : value) . 'a -> 'a * 'a + +The layout `value` is the layout of ordinary OCaml values (boxed or +immediate). This means that trying to use diag at type `#int32 -> +#(int32 * int32)` causes a compilation error, as `#int32` is not of +layout `value`. + +The available layouts include: + + value + immediate + int32 + int64 + float + any_layout + +As well as the above primitive layouts, layouts may be combined using +the `*` operator. Unlike types, layouts are 'flat'. That is, the +following are all equivalent: + + (value * value) * value + value * (value * value) + value * value * value + +There is a subtyping relation between these layouts, where `immediate +<= value`, and `L <= any_layout` for all layouts `L`. This means that +all types of layout `immediate` also have layout `value`, and every +type has layout `any_layout`. (The subtyping relationship between +product layouts is pointwise, and only between products of the same +length). + + +## Unboxed types and records with unboxed fields + +All of the currently-existing types in OCaml have layout `value`. Some +also have layout `immediate`, such as `int` or `unit`. We're planning +to add unboxed versions of a few builtin types, such as `#int32` and +`#float`. This gives: + + type float : value + type int32 : value + type int : immediate + type #float : float + type #int32 : int32 + +All of these can be passed and returned from functions, but only the +`value` types (including `immediate` ones) can be stored in, say, +lists. Note that the existing `+.`, etc. primitives won't work on +`#float`, because they have the wrong type. New ones for `#float` will +be made available. (Same for `#int32`, etc.) + +Unboxed types will be allowed in record fields: + + type t = { + foo : int; + bar : #float; + ... + } + +Because of how the OCaml runtime works, there are some restrictions on +how boxed and unboxed types can be mixed in a record. If there are any +fields which might contain pointers (that is, are of layout `value` +but are not `immediate`), then all fields must be of layout +`value`. So, you can mix `int` and `#float` in a single record, but +not `string` and `#float`. + +## Unboxed tuples and records + +The second part of this proposal is to allow user-defined unboxed +types (by contrast to what's above: user-defined boxed types +containing unboxed fields). The simplest of these is unboxed tuples: + + #( int * float * string ) + +This is an unboxed tuple, and is passed around as three separate +values. Passing it to a function requires three registers. +The layout of an unboxed tuple is the product of the layout of its +components. In particular, the layout of `#( string * string )` is +`value * value`, not `value`, and so it cannot be stored in, say, a +list. + +Unboxed records are in theory just as straightforward, but have some +extra complexity just to handle the various ways that types are +defined and used. Suppose you write: + + type t = { x : int32; y : int32 } + +Sometimes we want to refer to `t` as a boxed type, to store it in +lists and other standard data structures. Sometimes we want to refer +to `t` as an unboxed type, to avoid extra allocations (in particular, +to inline it into other records). The language never infers which is +which (boxing and allocation is always explicit in this design), so we +need different syntax for both. So, `type t = { ... }` defines *two* +types, the boxed record `t : value` and the unboxed record `#t : +int32 * int32`. They have different syntax for introduction and +elimination: + + let p : t = { x = a; y = b } + let px = p.x + let q : #t = #{ x = a; y = b } + let qx = q.#x + +An unboxed record like `q` cannot itself contain mutable +fields. However, a boxed record can contain a mutable field of type +`q`. This can be used to inline one record inside another, e.g.: + + type s = { foo : int; mutable bar : #t } + +This defines a boxed record, layed out as three fields. We will +support efficient updates of sub-parts of `bar`, at least with the +following syntax: + + x.bar <- { x.bar with y = y' } + +It would be nice to also allow the syntax `x.bar.#y <- y'`, although +this is a little trickier to understand and implement, since it is +`x.bar` rather than `x.bar.y` which is the mutable field. + +Type aliases do not automatically bind an unboxed variant, because +the scoping of type aliases should not depend on whether they expand +to records or not. So, if you write: + + type u = t + +then you don't get `#u`. You can get around this by re-exporting the +definition: + + type u = t = { x : int32; y : int32 } + +which gets you both `u = t` and `#u = #t`. You can also define an +alias directly to the unboxed type: + + type u = #t + +Similarly, an abstract type abstracts only the boxed type: + + module M : sig + type t + end = struct + type t = { x : int32; y : int32 } + end + +defines `M.t` but not `M.#t`. (It may be useful to allow exporting +`M.#t` despite `t` being abstract, perhaps with syntax like `type t : +value with #t : int32 * int32`). + + +## Polymorphism, abstraction and type inference + +The major change to the internals of the type system is that type +variables need to be annotated with their layout. This affects all +uses of type variables, both in type definitions and in value +descriptions. The syntax is extended to allow these to be given +explicitly, e.g.: + + type ('a : value, 'b : immediate) t = Foo of 'a * 'b + val fn : ('a : immediate) . 'a ref -> 'a -> unit + +In signatures, type variables that are not bound with either of the +above syntaxes are given layout `value`. + +Layout annotations also appear on abstract types: + + type ('a : value, 'b : int32) t : immediate + +Again, if this annotation is omitted it's assumed to be `value`. + +It may seem a little weird here to allow abstraction over types of +layout `int32`. After all, there's only one such type (`#int32`). +However, this ability becomes valuable when combined with abstract +types: two different modules can expose two different abstract types +`M.id`, `N.id`, both representing opaque 32-bit identifiers. By +exposing them as abstract types of layout `#int32`, these modules can +advertise the fact that the types are represented in four bytes +(allowing clients of these modules to store them in a memory-efficient +way), yet still prevent clients from mixing up IDs from different +modules or doing arithmetic on IDs. + +### Unification + +As above, type variables are annotated with the layout of the types +over which they range. During unification, newly-created type +variables initially range over `any_layout`. Two interesting things +can then happen: unification of two variables or unification of a +variable and a non-variable. (Unification of two non-variables works +as before). + +**Two variables** If we have two type variables `'a : L`, `'b : K`, +then their unifier is a type variable that ranges over only types that +could be assigned to both `'a` and `'b'`. Its layout is therefore the +intersection of the layouts `L` and `K`, that is, the layout of types +that have both layouts. This can be done in general, as layouts form a +partial semilattice: either there is some layout which is exactly the +intersection of `L` and `K`, or else the two layouts have no types in +common (causing a unification error). + +**One variable, one non-variable** To unify `'a : L` with a +non-variable type `t`, we must check first that `t : L`. The +interesting case here is when `t` is of the form `(p, q) r`, for some +`r` in the type environment. + +There is a subtlety here: the layout of `(p, q) r` can in general depend on +the layouts of `p` and `q`. For instance: + + type ('a : value) id = 'a + +Since `'a : value`, we know that `'a id` is always of layout `value`, +but if `'a = int` it may also have the more specific layout +`immediate`. + +For each type constructor `(_, _) r` in the environment, we store +along with it a layout `K` which is an upper bound for the layout of +any instance of `r`. (For `'a id` above, this is `value`). When +checking `(p, q) r : L`, we compare `L` and `K`. If `K` is a subtype +of `L`, then the check passes. However, if the check fails and `r` is +an alias, we must expand `r` and try again. We only report a +unification failure once the check fails on a type which is not an +alias. + +For example, suppose we have the unification variables `'a : value` +and `'b : immediate`, and we're trying to unify `'a id = 'b`. We need +to check `'a id : immediate`, but the environment tells us that the +layout of `_ id` is `value`. So, we expand `'a id` to `'a` and retry +unification, which succeeds after unifying `'a` and `'b` (the +resulting type variable has layout `immediate`). The effect here is +that after unifying `'a id` with `'b`, we leave `'a` as an +uninstantiated unification variable, but its layout is restricted to +`immediate`, to match `'b`. + +### Type well-formedness + +Thanks to how types are constructed in OCaml, we need no separate +well-formedness check. Suppose we're given the following type: + + type ('a : immediate) t = ... + +and the user attempts to write `string t`. The syntax `string t` is +converted to a type expression by first constructing a template with a +fresh unification variable `'b t`, and then unifying `'b` with +`string`. Here, since `'b` ranges over layout `immediate`, the +unification with `string` will fail. + +### Generalisation + +In OCaml, type variables which are not constrained are *generalised* +at the nearest enclosing `let`, and can be used polymorphically from +then on. Done naively, this means that the function: + + let diag x = (x, x) + +would have the generalised type: + + val diag : ('a : any_layout) . 'a -> 'a * 'a + +In many ways, this type makes sense: this definition of `diag` does +really work for any type of any layout. But a polymorphic function of +this type is not compilable, as we cannot allocate storage for a type of +`any_layout`: we don't know how many registers it needs, how many +bytes of memory it should take when stored, or whether it needs to be +registered with the GC! + +So, when generalising a type variable, if its layout is `any_layout` +then we change the layout to `value` instead, giving `diag` the +compilable type: + + val diag : ('a : value) . 'a -> 'a * 'a + +We retain principal *types*, but we don't have principal *layouts*: +for any expression, there's a best type for any given layout, but +there's no best layout (or at least, the best layout isn't always +compilable). + +We only need to do this when the polymorphic type variable is used as +an input to the function (i.e. in a contravariant position). If the +polymorphic type variable is only used positively, then the function +can never manipulate any values of this type, and remains compilable +(similar reasoning justifies the relaxed value restriction). For +instance, for the function `raise`, we can actually compile the +following type: + + val raise : ('a : any_layout) . exn -> 'a + +The function's type only mentions `'a` in positive positions, so it +never manipulates any values of type `'a`. This is useful: it means +that functions like `raise` that never return can be used where +e.g. an `#int32` is expected. + +## Beyond records + +As well as the basic unboxed types and unboxed records, we'd like to +have some subset of the following. + +**Variants** + +So far, we've only discussed records, when unboxed variants are +equally useful. The general case is quite tricky, but several special +cases are easy and useful: variants containing only constant +constructors, or variants whose non-constant constructors have the +same types. + +The tricky part of the general case is that the layout of a variant +(including number of fields and whether those fields need to be +registered with the GC) varies according to the tag. One simple way +around this difficulty is to always allocate enough space for all +fields: this might make sense for small variants like `result`, where +the savings from unboxing outweigh the extra word used to represent them. + +**Nullable types** + +Unboxing the `'a option` variant would improve efficiency, but an +unboxed `t option` still requires two words: one to store the `t` (if +`Some`), and one to store the tag (`Some` or `None`). Picking a +special magic value for `None` isn't enough to cram options into one +word, since we need to be able to distinguish `None` from `Some None`, +`Some (Some None)`, etc. + +In many cases, we know statically that the argument to option cannot +be another option (e.g. `string option`, `int option`, `float +option`...). In these cases, we can use *nullable types* instead. In a +nullable type, the type parameter `'a` has layout `non_nullable`, +which is a subtype of `value`. The `Some` constructor for nullable +types is the identity (and in particular, does not allocate), while +the `None` constructor is a constant null value. + +Since the equivalent of `t option option` gives a type error, no +ambiguity is introduced by implementing `Some` as the identity. + +**Padding** + +As well as being able to control boxing of types, it's sometimes also +useful to control the bytes of padding between unboxed elements. (This +is sometimes useful for optimisation, but the main application is +being able to directly map externally-specified formats like protocols +or C structs). + +**Bitpacking** + +Related to byte-level padding, but quite different in implementation, +bitpacking allows multiple unboxed fields (say, boolean flags) to +occupy the same byte. In particular, a record containing <32 booleans +could be given an `int32` layout rather than a product layout (and +thus be passed around in just a single register). + +**GADTs** + +As well as refining types, GADTs should be able to refine layouts as +well. It would be nice to be able to define the following: + + type _ is_immediate = + | Immediate : ('a : immediate) -> 'a is_immediate + | Other : ('a : value) -> 'a is_immediate + + +## Future projects + +**Monomorphisation** + +"Uncompilable" types like `diag : ('a : any_value) . 'a -> 'a * 'a` +can actually be compiled, by compiling a separate version of `diag` +for each layout. In fact, we need fewer layouts than that: we can +share the same version for `value` and `immediate`. Generally, +we need one version per 'sort' (roughly, register width). Within a +sort, there may be multiple layouts with subtyping, but there is no +subtyping between sorts. Then, we could type the general `diag` as: + + val diag : ('s : sort) ('l : 's layout) ('a : 'l) . 'a -> 'a * 'a + +This means that typechecking needs to support unknown sorts and +unification of sort variables. This shouldn't be too hard: since +there's no subtyping between sorts, standard unification works fine. + +As well as allowing definitions that get monomorphised, this approach +can also support "weakly layout-polymorphic" definitions. For +instance, we can compile `diag` at only a single sort, but use +unification sort variables to delay the choice of that sort until we +see how it's used, rather than defaulting to the sort `value`. + +**Macros** + +The most interesting use of a sort system is to support, rather than +automatic monomorphisation, manual instantiation of sorts and layouts +using macros, to allow well-typed well-laid-out code generation at +compile time. This sounds all sorts of fun, but it's a different +project entirely.