Skip to content

Add validity checks for IntSet and IntMap #456

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Dec 30, 2017

Conversation

m-renaud
Copy link
Contributor

Codify the invariants outlined in comments and add property checks to verify that the internal structure of IntSet and IntMap are correct under initial construction, union, intersection, and difference.

This addresses #304.

The checks codify the invariants laid out in the IntSet and IntMap comments. We
add validity checks for constructed IntSets and IntMaps as well as union,
intersection, and difference operations on them.

As it turns out, the mask being a power of 2 checks do not hold for the IntMap
implementation; this may be something to investigate.
Also moves this to a where clause to keep it local to the function in which its used.
Copy link
Contributor

@treeowl treeowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work so far! I'd really like a validity check added to the tests for each operation that produces a set/map. Thankfully, strengthening those checks should be a lot easier than what you've already done.

-- Invariant: Prefix is the common high-order bits that all elements share to
-- the left of the Mask bit.
-- Invariant: In Bin prefix mask left right, left consists of the elements that
-- don't have the mask bit set; right is all the elements that do.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps explain what a mask looks like?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -3396,3 +3402,50 @@ node = "+--"
withBar, withEmpty :: [String] -> [String]
withBar bars = "| ":bars
withEmpty bars = " ":bars

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these assertions be here or under tests? I'm asking because I don't know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following structure that Data.Set used[1], but I'm open to moving them somewhere else; perhaps in a follow-up PR though.

[1] https://github.com/haskell/containers/blob/master/Data/Set/Internal.hs#L1758

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking through the implementations I don't think there's anything that requires them to be defined here since the internals of the types are exposed by the Internal module. I think I would prefer to do a mass move across all the data structures in a follow-up though.

-- Invariant: The Prefix is zero for all but the last 5 (on 32 bit arches) or 6
-- bits (on 64 bit arches). The values of the map represented by a tip
-- Invariant: The Prefix is zero for the last 5 (on 32 bit arches) or 6 bits
-- (on 64 bit arches). The values of the set represented by a tip
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this comment just flat-out wrong before? That's confusing and disturbing!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to have been. It took me a little bit to realize that I was implementing the incorrect check :P I went and read the implementation closer and figured that this is I think what it meant to say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the test that codifies this interpretation passes, sooo...

@m-renaud
Copy link
Contributor Author

Note to reviewers:

  1. I'm not sure if there's a style guide for the package that I've violated or not. I tried to format things like the surrounding code, feel free to nit pick it though :)

  2. One thing that struck me as strange is that some of the invariants that held for the IntSet relating to properties of the mask and prefix did not hold for the IntMap implementation (or I may have made a mistake porting them over). Specifically the maskPowerOfTwo property did not hold for Mask in the IntMaps Bin constructor. Also, the tipsValid check didn't appear to have a passing analogue.

@treeowl
Copy link
Contributor

treeowl commented Dec 22, 2017 via email

@m-renaud
Copy link
Contributor Author

Great work so far!

Thanks! It's been fun ;)

I'd really like a validity check added to the tests for each operation that produces a set/map

Added a bunch more validity checks, see commit for details. Let me know if there's any others you'd like me to add :)

Perhaps explain what a mask looks like?

I've added a short description given what I understand about the implementation, it's a little hand wavey though. #404 would definitely have made this easier ;P

IntSet: split, splitMember
IntMap: alter, split
@m-renaud
Copy link
Contributor Author

The CI failures appear to be due to configuration failing and not due to this change.

@treeowl
Copy link
Contributor

treeowl commented Dec 22, 2017 via email

@m-renaud
Copy link
Contributor Author

Ugh. I don't want to hear that!

:(

The command "cabal install 'test-framework >= 0.3.3' 'test-framework-quickcheck2 >= 0.2.9' 'QuickCheck >= 2.4.0.1' 'ChasingBottoms' 'HUnit' 'test-framework-hunit'" failed and exited with 1 during .

Have you had a chance to look at what invariants you're missing in IntMap?

Not closely, the implementation is slightly different (wrt keys) than the IntSet implementation. I'll have to take another pass through but I didn't see anything that the code was trying to enforce, there also aren't any comments though so I could very well be missing some.

@m-renaud
Copy link
Contributor Author

Went through the IntMap code again and that looks like all the invariants. Tip is much simpler in the IntMap case since it only ever holds a single (key, value) pair. Let me know if there are any other checks you'd like me to add. If I see anything I missed when making further changes I'll send another PR.

Copy link
Contributor

@treeowl treeowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oy. I didn't see what you were doing clearly enough and have now wasted some of your time. I'm sorry. I meant that the validity checks should be incorporated into the other tests. That is, something like

prop_partition :: IntSet -> Int -> Property
prop_partition s i = case partition odd s of
		     (s1,s2) -> valid s1 .&&. valid s2 .&&. (all odd (toList s1) && all even (toList s2)) .&&. s === s1 `union` s2 

We also almost certainly want to make valid return a Property so it can explain what was invalid about a bad argument.

@m-renaud
Copy link
Contributor Author

Ah gotcha, no worries, that makes more sense than what I did :)

@m-renaud
Copy link
Contributor Author

We also almost certainly want to make valid return a Property

Absolutely, although having something return a Property living in the Internal module seems weird, which gives more credence to the idea that the valid function maybe shouldn't live where it is today.

tests/ is one option, or we could add a new Data.IntMap.TestUtil in case we decide to expose the internals as a separate package at some point in the future (but its probably best to go with tests/ since there's no concrete plan to export the internals separately).

So, there's two ways to go right now:

  1. Change valid to return a Property (introducing testing code into the actual implementation).
  2. Change it to a Property in the follow-up PR when I move all the valid checks from the various containers to the tests/ directory.

I'm personally leaning towards 2, WDYT?

@treeowl
Copy link
Contributor

treeowl commented Dec 23, 2017 via email

@m-renaud
Copy link
Contributor Author

I'm leaning towards introducing the validity checks for IntMap where we
actually want them

Giving it another thought I completely agree, I'll go with that.

- Moves the `valid` function for IntSet and IntMap into
  tests/IntSetValidity.hs and tests/IntMapValidity.
- Make `valid` check inside existing functions instead of introducing new
  prop_fooValid functions.
- Exposes `zero` from IntSet.Internal so it can be used in IntSetValidity.
@m-renaud
Copy link
Contributor Author

Moved the valid function to a new module in the tests/ directory. Is this an OK way to structure it?

@treeowl
Copy link
Contributor

treeowl commented Dec 27, 2017

I think that's generally okay. I don't know that we really need a separate module, but it shouldn't hurt. However, it shouldn't be a top-level name like that. One option is to add a Data.IntMap.Debug or some such, to include both the validity tests and tree-showing functions.

@treeowl
Copy link
Contributor

treeowl commented Dec 27, 2017

Er... I guess you're keeping the thing non-exposed from the library? If so, I guess it doesn't matter.

@m-renaud
Copy link
Contributor Author

Yeah, my thought was that if we ever decide to move the internals into their own package to be used by other packages then we can export the validity checks, but since only this package should be messing around with the internals I just created a new module in the tests/ directory. Let me know if that's alright with you or if you'd like me to move them somewhere else.

@treeowl
Copy link
Contributor

treeowl commented Dec 28, 2017 via email

@m-renaud
Copy link
Contributor Author

SGTM. Unless you have any other concerns I think this is ready to merge.

valid t
= property (nilNeverChildOfBin t) .&&.
property (commonPrefix t) .&&.
property (maskRespected t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably want to use counterexample rather than property here, to label each type of failure. I wonder if we should push the Property business into commonPrefix and maskRespected so we can see where in the tree the failure lies. That probably shouldn't be necessary for nilNeverChildOfBin, since that will be visually obvious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to using counterexample for each of these. It would take more work to push Property into the other functions and be done in such a way that we would get actionable information from it, so if its alright with you I'd like to hold off on that for now.

@m-renaud m-renaud merged commit c3bca8f into haskell:master Dec 30, 2017
Nil -> True
Tip _ _ -> True
Bin _ m l r ->
(m `mod` 2 == 0) && maskPowerOfTwo l && maskPowerOfTwo r
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit late to the party, but

m `mod` 2 == 0

is not a way to test that a mask is a power of 2. Alternative: popcount m == 1.

@treeowl
Copy link
Contributor

treeowl commented Jan 18, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants