Bad Hash for storm::RationalNumber in sylvan #618

dbasgoeze · 2024-09-05T20:56:58Z

I've included grid.pm.txt [as .txt to make github happy].

While running storm --prism grid.pm --exact rationals --sound --prop "R{\"x\"}=? [F \"goal\"]" --engine dd I get the error message: BDD Unique table full, 897954 of 67108864 buckets filled!

Taking a closer look 897954 / 67108864 is only about 1.34%! With my sylvan knowledge I immediately suspected that the hash used is not good enough and leads to many collisions. Investigation soon pointed to the hash of the RationalNumbers used in the ADD, calculated in this line:

storm/resources/3rdparty/sylvan/src/storm_wrapper.cpp

Line 138 in 0b1cae2

    
           return seed ^ (std::hash<storm::RationalNumber>()(srn_a) + 0x9e3779b9 + (seed<<6) + (seed>>2));

Changing the line the following fixes the issue:

return seed + std::hash<std::string>{}(srn_a.get_str());

Now storm actually runs OOM like I would've expected (therefore clearly using more of the unique table than 2%). Of course this is not a great fix as first std::hash still does not guarantee to be a good hash, second creating the string is costly and third + while better than ^ imho is still not a good hash combine function.

The following assumes that storm::RationalNumber is always a GMP type.

One could hash the numerator and denumerator arrays in the underlying mpq_t but GMP uses some small fraction optimizations (something like storing the values in the space for the pointers to the arrays I suppose), so this is rather tricky to do. Another approach is to convert the number to double and hash it (but don't use std::hash as I think it more often than not it just uses the identity hash which has bad properties for the hash map used in sylvan) but that approach has the risk of creating collisions if the model uses a lot numbers closer to each other than the resolution of double (which I suppose is not that uncommon). In another project I use a combination of all three approaches.

Another important thing is to not hash the value of the pointers in the mpq_t as then equal but different numbers have different hashes. I speak from experience...

Edit: I was wrong with the assumption that GMP uses small object optimizations. It's just that mpz_t._mp_size is negative when the number is negative.

The text was updated successfully, but these errors were encountered:

sjunges · 2024-09-07T11:16:37Z

Our hash function looks an awful loot like the scramble here: https://daniel-strecker.com/blog/2020-07-06_c++_hash_for_gmp_mpz_t_mpz_class/. We could experiment with that hash function

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bad Hash for storm::RationalNumber in sylvan #618

Bad Hash for storm::RationalNumber in sylvan #618

dbasgoeze commented Sep 5, 2024 •

edited

Loading

sjunges commented Sep 7, 2024

Bad Hash for storm::RationalNumber in sylvan #618

Bad Hash for storm::RationalNumber in sylvan #618

Comments

dbasgoeze commented Sep 5, 2024 • edited Loading

The following assumes that storm::RationalNumber is always a GMP type.

sjunges commented Sep 7, 2024

dbasgoeze commented Sep 5, 2024 •

edited

Loading