Much ado Predicate Variables during Hindley-Milner. #20

sgpthomas · 2023-06-29T19:54:22Z

The high-level problem is that we need to be able to preserve certain information about predicate variables during the type inference phase. For example, we want multiple methods in an object to all use the same refinement.

Series: ∀(T, K), (l:List[{'T | 'K}]) -> Series{max: () -> {'T | 'K}, data: List[{'T | 'K}]}

Here, the data variable in the Series object shares a refinement with max(). We are expressing that the share a refinement by using a forall type that introduces a type variable and then use the same type variable in multiple locations.

To see why this is a problem, consider the following trivial program:

def fut():
  tmp0 = [1, 2, 3]
  s = pd.Series(tmp0)
  tmp1 = -(1)
  return s / tmp1

We are abstracting over a base type T and a refinement variable K. When pd.Series is called, we concretize the type variables to fresh variables that will be filled in during unification.

def fut() -> pd.Series[T0, K0]:
  tmp0 = [1, 2, 3]
  s = pd.Series[T0, K0](tmp0)
  tmp1 = -(1)
  return s / tmp1

I'm not going to walk through the whole unification process. But instead just show where we derive certain information.

tmp0 has type List[int], and so we can infer that T0 == int. Then s has type Series[int, K0]. Division on Series has type Series -> int -> Series. Because of this, we infer that the result of the division has the same type as Series. Because we have already specialized K0, the final result of the function is Series[int, K0]. However, this is wrong. Because we are dividing all elements in the Series by -1, the refinement K0 is no longer valid.

A typing closer to what we want is:

def fut() -> ∀K, pd.Series[T0, K]:
  tmp0 = [1, 2, 3]
  s = pd.Series[T0, K0](tmp0)
  tmp1 = -(1)
  return s / tmp1

One way of thinking about this problem is that we specialize K0 too early. The Series type we return should still be abstracting over a predicate variable. However, it's not entirely clear to me how we should implement this.

The first thing that comes to mind involves splitting constraints underneath binders. For example the constraint

∀K, List[{int | 'K}] -> Series[int, K] == ∀K, List[int] -> Series[int]

splits into the constraints:

∀K, List[{int | 'K}] == ∀K, List[int]
∀K, Series[int, K] == ∀K, Series[int]

Notice how both of the split constraints still have the binders. This feels funny to me and it's unclear what you should do at this stage.

The text was updated successfully, but these errors were encountered:

sgpthomas · 2023-07-28T17:55:57Z

This is handled in #19. We do rewriting under binders.

sgpthomas added this to the Get `norm` working milestone Jun 29, 2023

sgpthomas added bug Something isn't working high priority labels Jun 29, 2023

sgpthomas closed this as completed Jul 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Much ado Predicate Variables during Hindley-Milner. #20

Much ado Predicate Variables during Hindley-Milner. #20

sgpthomas commented Jun 29, 2023

sgpthomas commented Jul 28, 2023

Much ado Predicate Variables during Hindley-Milner. #20

Much ado Predicate Variables during Hindley-Milner. #20

Comments

sgpthomas commented Jun 29, 2023

sgpthomas commented Jul 28, 2023