ScopedReferences

Scoped references

In many "low-level" languages, pointers are central. But, pointers are known to be highly unsafe; null pointers can cause crashes when dereferenced and pointers to invalid or freed memory can crash or cause security vulnerabilities. The concept of "scoped" references refers to the idea of references which are guaranteed by the type system to always point to valid memory. This is often referred to as "borrowchecking", and the act of creating a reference from a value or binding as "borrowing" or the "creating a borrow". Since references can be immutable or mutable, borrowing and borrow creation are often qualified as "mutably" or "immutably borrowing", or creating a mutable or immutable borrow.

Borrowchecking and dependency graphs

One way to look at borrows is as a requirement over the graph of dependencies in a program. Consider a program like so:

{
    let x = 0;
    {
        let y = &x;
        *y
    }
}

In this program, each binding will be given its own "region" or "lifetime" - in Rust, for example, this represents two things: what scope some value is valid in, and also a rough approximation of where it lives in memory. In this case, we will denote the regions in which x and y live as 'x and 'y. In this block, y depends on 'x; if we consider a graph of this dependency, we can tell that at the end of the y block that x is borrowed because x lives inside the region 'x, and there exists an edge in the dependency graph from 'x to 'y. A variable is only unborrowed if there are no outgoing edges from its node in the dependency graph.

Note that in this scheme, only values stored in a binding may be borrowed. It may also be noted that borrowchecking only succeeds here if the dependency graph is acyclic - this ignores additional complications caused by mutable borrows.

Fractional permissions

The fractional permissions approach to borrowing relies on linearity. It also effectively counts the number of references to a given region at any time. There are two fundamental primitives to fractional permissions:

split : &[n] 'a T -> (&[n/2] 'a T, &[n/2] 'a T)
merge : (&[n/2] 'a T, &[n/2] 'a T) -> &[n] 'a T

In this case, 'a is used to denote a region. How regions are created is orthogonal to the fractional permissions system; either all values can be associated with a region, or regions can be associated with variables.

References, under this scheme, are fully linear - they must be used exactly once in any context they exist in. This means that if a function is given a reference as an argument, the reference has to also be returned from the function, or merged with another reference which is then returned from the function, or split, or - if the reference is the only remaining reference to a given region - then the reference can be destroyed, under the non-lexical scheme, to reproduce the original value. The difference between the lexical and non-lexical schemes are in the primitives used to create references:

Non-lexical

borrow : T -> &'a [1] T where 'a is fresh
unborrow : &'a [1] T -> T

Lexical

borrow : T -> (forall 'a. &'a [1] T -> (&'a [1] T, U)) -> U

Note that in either case, only a reference with a fraction of 1 may be consumed to reproduce the original value, and that the original value - thanks to linearity, assuming that it is linear - is never available when borrowed.

NOTE If mutable references are desired, it is reasonable to allow a reference with a fraction of 1 to denote a mutable reference - although under the non-lexical scheme, this is somewhat moot.

Lifetime variable inspection/rank-2 encoding

It is possible to encode lifetimes using rank-2 types. This approach is based off of Haskell’s ST monad. The type of runST is given as

runST :: (forall s. ST s a) -> a

And, after careful consideration, it must be noted that any STRef created inside this ST computation - since instances of STRef are created with the phantom type variable s attached - cannot exist inside whatever type a is inferred to be. This is because if a contains s, then s must either be concrete or quantified outside of the inner forall; but the computation in the inner forall must be valid for all s, irrelevant of a! Therefore, a cannot contain s, and no STRef created inside can escape, because they will without exception be parameterized with s.

A system encoding references via rank-2 types might look something like this:

data Ref s a = ...

borrow :: (forall s. Ref s a -> b) -> a -> (a, b)

The a has to be re-returned due to potential linearity.

Without using rank-2 types, this method boils down to making borrow-scopes explicit and then requiring that lifetime variables created inside the scope are not allowed to escape. That is, somewhere in the type system, whether as an axiom in the inference rules or as a primitive function, is a construct like this:

borrow : (&'a T -> U) -> T -> U where T and U do not contain 'a.

Partially ordered type systems

Ordered type systems require that variables and values be used in the order they are declared. They do so by removing the exchange rule entirely from the inference judgements for that given type system. Partially ordered type systems, then, we define as type systems with a conditional exchange rule - some bindings may be exchanged, but not all.

Partially ordered type systems can be used to enforce reference rules by ordering references w.r.t. the values they borrow, which mandates that all references created from borrowed values must be consumed before the values they borrow, and thus the borrows must end before the consumed values can be used again. This system has a few oddities which have yet to be worked out.

Provide feedback

Saved searches