Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

singletons has nondeterministic type variable orderings #367

Closed
RyanGlScott opened this issue Oct 11, 2018 · 4 comments · Fixed by #386
Closed

singletons has nondeterministic type variable orderings #367

RyanGlScott opened this issue Oct 11, 2018 · 4 comments · Fixed by #386
Labels

Comments

@RyanGlScott
Copy link
Collaborator

RyanGlScott commented Oct 11, 2018

It occurred to me recently that the sorts of nondeterminism issues that plagued GHC for years also plague singletons. While GHC has fixed these issues, singletons has not. The following example illustrates the problem well:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
module Bug where

import Data.Singletons
import Data.Singletons.TH (singletonsOnly)

$(singletonsOnly [d|
  const :: a -> b -> a
  const x _ = x
  |])

When a user writes sConst @Bool, which type gets instantiated to Bool: a or b? You might think "well obviously the answer is a", and a quick smoke-test in GHCi would appear to confirm that hunch:

$ /opt/ghc/8.6.1/bin/ghci Bug.hs -XTypeApplications
GHCi, version 8.6.1: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/ryanglscott/.ghci
[1 of 1] Compiling Bug              ( Bug.hs, interpreted )
Ok, one module loaded.
λ> :type +v sConst @Bool
sConst @Bool
  :: forall b (t1 :: Bool) (t2 :: b). Sing t1 -> Sing t2 -> Sing t1

However, that's only because of a lucky fluke. As it turns out, if you pass -dunique-increment=-1 to GHC, then you'll get a different answer:

$ /opt/ghc/8.6.1/bin/ghci Bug.hs -XTypeApplications -dunique-increment=-1
GHCi, version 8.6.1: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/ryanglscott/.ghci
[1 of 1] Compiling Bug              ( Bug.hs, interpreted )
Ok, one module loaded.
λ> :type +v sConst @Bool
sConst @Bool
  :: forall a (t1 :: a) (t2 :: Bool). Sing t1 -> Sing t2 -> Sing t1

Ack. It turns out that the order of the kind variables a and b in sConst's type signature completely depend on which unique values happen to be gensymmed for them (by newName) in singletons' Template Haskell machinery. This is because several functions in th-desugar and singletons collect type variables using Set Name, and the Ord Name instance compares by Name's unique values first and foremost. If a happens to have a smaller unique than b, then when you call Set.toList on a set containing {a, b}, then you'll get [a,b]. If b has a smaller unique than a, on the other hand, then you'll get [b,a].

Essentially, any place in singletons where call S.toList on a Set Name (that can affect the user-facing API) should be treated as nondeterministic code. I count at least three places in th-desugar and singletons (each!) where we do this at the time of writing.

Luckily, GHC has figured out how to solve this problem, so we should be able to learn from its mistakes. One of the main tricks that GHC uses to avoid this sort of nondeterminism is by switching out uses of Set for a deterministic, insert-ordered set, such as UniqDSet. It seems likely that we'll need to find such a data structure on Hackage (or conjure up one ourselves) and use it throughout th-desugar and singletons where appropriate.

@goldfirere
Copy link
Owner

Oof. Good point.

I'm not convinced we need to go with something like UniqDSet though. Let's just use a list! We're generally talking about very small numbers of elements, and performance of these libraries is not critical (as it's all at compile time). Indeed, given the small size of the collections involved, switching to lists might be a performance improvement in practice!

@RyanGlScott
Copy link
Collaborator Author

We're generally talking about very small numbers of elements, and performance of these libraries is not critical

I want to agree with you here, but something doesn't quite sit right with me about this plan. I think the thing that bugs me is the fact that we'd need to change the type of fvDType (from th-desugar) from Set Name to [Name]. In order to accomplish that, we'd need to ensure the list of Names has no duplicates—in other words, we'd need to nub the final result (which has quadratic time). There are several places in singletons where foldMap fvDType is a common idiom, which mean the time complexity of those parts of the code can now be cubic!

I could (reluctantly) get on board with a quadratic-time algorithm, but going cubic or higher generally indicates to me that whatever approach I'm using is not going to scale at all. It's true that pretty much no test case in singletons would be affected that much, but I can almost guarantee that someone's singletons code in the wild would blow up as a result of this.

I actually don't think it would be that hard to whip up an insert-ordered Set, so I'd be inclined to at least give that a try before settling for lists.

(as it's all at compile time).

True, but that doesn't mean people aren't sensitive to increasing compile times. Take this issue, for example :)

@goldfirere
Copy link
Owner

OK, if you want to cook up the fancy data structure, I won't stop you.

As for foldMap fvDType: that's easy to fix. Just have a variant of fvDType that doesn't nub. Fold the non-nubbing one, and then nub at the end. We're back in the relative comfort of quadratic time.

@RyanGlScott
Copy link
Collaborator Author

It turns out that there is a Hackage library that's perfect for our needs: ordered-containers, which contains implementations of insertion-ordered Maps and Sets (called OMaps and OSets in that package). I briefly sketched out an implementation of th-desugar/singletons based on ordered-containers, but ran into some issues that I had to work around:

  1. There's no union equivalent for OMaps. (I opened an issue about this here.)
  2. There's no Data instance for OSet. (I opened an issue about this here.)
  3. There's no Functor instance for OMap. (I opened an issue about this here.)
  4. There's no Semigroup or Monoid instances for OSet. (Issue here.)

Once the issues above become resolved, we can move forward with this. Alternatively, if we don't want to wait, we could always inline the implementations of OMap and OSet directly into th-desugar, since they're not that much code, and we could replace it with ordered-containers once a suitable version appears with the features that we need.

RyanGlScott added a commit that referenced this issue Feb 26, 2019
[ci skip]
RyanGlScott added a commit that referenced this issue Feb 27, 2019
There were several places in `singletons` that would compute entirely
different things depending on the order in which GHC's unique numbers
happened to be computed. Like the corresponding `th-desugar` patch
(in goldfirere/th-desugar#115), this fixes these issues by swapping
out nondeterministic uses of `Map` and `Set` with `th-desugar`'s new
`OMap` and `OSet` data structures, which remember the order in which
elements were inserted.

This patch looks large, but half of the modifications are routine
changes brought about by switching data structures, and the other
half are test suite wibbles brought about by `singletons` settling
on a deterministic order for the expected output. The upshot is that
we can finally run the `singletons` test suite with
`-dunique-increment=-1` and have it still pass! Hooray!

Fixes #367.
RyanGlScott added a commit that referenced this issue Mar 5, 2019
There were several places in `singletons` that would compute entirely
different things depending on the order in which GHC's unique numbers
happened to be computed. Like the corresponding `th-desugar` patch
(in goldfirere/th-desugar#115), this fixes these issues by swapping
out nondeterministic uses of `Map` and `Set` with `th-desugar`'s new
`OMap` and `OSet` data structures, which remember the order in which
elements were inserted.

This patch looks large, but half of the modifications are routine
changes brought about by switching data structures, and the other
half are test suite wibbles brought about by `singletons` settling
on a deterministic order for the expected output. The upshot is that
we can finally run the `singletons` test suite with
`-dunique-increment=-1` and have it still pass! Hooray!

Fixes #367.
RyanGlScott added a commit that referenced this issue Mar 5, 2019
There were several places in `singletons` that would compute entirely
different things depending on the order in which GHC's unique numbers
happened to be computed. Like the corresponding `th-desugar` patch
(in goldfirere/th-desugar#115), this fixes these issues by swapping
out nondeterministic uses of `Map` and `Set` with `th-desugar`'s new
`OMap` and `OSet` data structures, which remember the order in which
elements were inserted.

This patch looks large, but half of the modifications are routine
changes brought about by switching data structures, and the other
half are test suite wibbles brought about by `singletons` settling
on a deterministic order for the expected output. The upshot is that
we can finally run the `singletons` test suite with
`-dunique-increment=-1` and have it still pass! Hooray!

Fixes #367.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants