Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed type inference should not pollute type bounds #226

Open
apoelstra opened this issue Jun 30, 2024 · 0 comments
Open

Failed type inference should not pollute type bounds #226

apoelstra opened this issue Jun 30, 2024 · 0 comments

Comments

@apoelstra
Copy link
Collaborator

Consider the following unit test

    #[test]
    fn inference_failure_type_pollution() {
        let unit1 = Arc::<ConstructNode<Core>>::unit(); // A -> 1a
        let unit2 = Arc::<ConstructNode<Core>>::unit(); // B -> 1b
        // Constructing a disconnect operator will bind A to 2^256 x B (which
        // will succeed) and then bind 1a to B x 1b (which will fail), so the
        // construction of the disconnect operator should fail.
        Arc::<ConstructNode<Core>>::disconnect(&unit1, &Some(Arc::clone(&unit2))).unwrap_err();
        // But because the construction of the disconnect node failed, it shouldn't
        // pollute the unit; it SHOULD NOT leave A with a product bound.
        assert_eq!(
            unit1.arrow().source.finalize().unwrap(),
            types::Final::unit(),
        );  
    }   

This fails currently, because type inference works by applying bounds one after the other. In the case of combinators like disconnect, it may happen that one bound succeeds while the following one fails, leaving the types of the program in an inconsistent state.

This isn't a super high priority because under "normal" usage once a type inference step fails you should throw the whole program away and restart, and because I have a set of huge PRs coming in that'll reorganize the type checker to no longer leak memory, but it is causing my fuzzer to slow down.

uncomputable added a commit that referenced this issue Jul 4, 2024
c22dbaa types: drop BoundMutex and instead use references into the type context slab (Andrew Poelstra)
2e24c49 types: pull unify and bind into inference context (Andrew Poelstra)
a26cf7a types: remove set and get methods from BoundRef (Andrew Poelstra)
33f58fa types: introduce BoundRef type, use in place of Arc<BoundMutex> in union-bound (Andrew Poelstra)
021316c types: abstract pointer type in union-bound algorithm (Andrew Poelstra)
eccc332 types: add &Context to recursive type constructors (Andrew Poelstra)
65b35a9 types: add &Context to type constructors (Andrew Poelstra)
8e08900 types: make `bind` and `unify` go through Context (Andrew Poelstra)
8eeab8f types: introduce inference context object, thread it through the API (Andrew Poelstra)
9b0790e cmr: pull Constructible impl on Cmr into an impl on an auxiliary type (Andrew Poelstra)

Pull request description:

  Our existing type inference engine assumes a "global" set of type bounds, which has two bad consequences: one is that if you are constructing multiple programs, there is no way to "firewall" their type bounds so that you cannot accidentally combine type variables from one program with type variables from another. You just need to be careful. The other consequence is that if you construct infinitely sized types, which are represented as a reference cycle, the existing inference engine will leak memory.

  To fix this, we need to stop allocating type bounds using untethered `Arc`s and instead use a slab allocator, which allows all bounds to be dropped at once, regardless of their circularity. This should also improve memory locality and our speed, as well as reducing the total amount of locking and potential mutex contention if type inference is done in a multithreaded context.

  This is a 2000-line diff but the vast majority of the changes are "API-only" stuff where I was moving types around and threading new parameters through dozens or hundreds of call sites. I did my best to break everything up into commits such that the big-diff commits don't do much of anything and the real changes happen in the small-diff ones to make review easier.

  By itself, this PR does **not** fix the issue of reference cycles, because it includes an `Arc<Context>` inside the recursive `Type` type itself. Future PRs will:

  * Take a single mutex lock during calls to the top-level `bind` and `unify` calls, so that these all happen atomically, including all recursive calls.
  * Add another intermediate type under `Type` which eliminates the `Arc<Context>` and its potential for circular references. Along the way, make the `Bound` type private, which is not really used outside of the types module anyway.
  * Do "checkpointing" during type inference that makes node construction atomic; this is #226 which is **not** fixed by this PR.
  * (Maybe) move node allocation into the type inference context so that nodes can be slab-allocated as well, which will address #229 "for free" without us figuring out a non-recursive `Drop` impl for `Arc<Node<N>>`.

ACKs for top commit:
  uncomputable:
    ACK c22dbaa

Tree-SHA512: 0fd2fdd9fe3634068d67279d517573df04fafa60b70e432f59417880982ad22e893822362973f946f1deb6279080aec1efdd942dfd8adad81bbddc7d55077336
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant