Compact propositional encoding of OBJECT_SIZE(ptr) #7702
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@           Coverage Diff            @@
##          develop    #7702    +/-  ##
===========================================
+ Coverage   79.09%   79.10%   +0.01%
===========================================
  Files        1699     1699
  Lines      196512   196575     +63
===========================================
+ Hits       155428   155508     +80
+ Misses      41084    41067     -17
===========================================
```
Experiments on a FreeRTOS proof that uses contracts (which in turn introduce a lot of r_ok instances) show a reduction too:
On other models I observed up to 3x regressions in runtime, so I think we should make this an opt-in feature with a command-line switch.
For the record: this is the
So I do acknowledge the slow-down with CaDiCaL on this benchmark (just in case: MiniSat is not able to solve these benchmarks in a reasonable amount of time), but I'd argue that in other cases the proposed change makes previously infeasible benchmarks newly solvable.
Tentative yes. Concerns:
- Does this need a more extensive benchmarking run?
- I feel like this is not the only place this technique could or should be used. I have pondered doing this for constant arrays or even arrays in general. This makes me wonder whether we need a general interface for "compact encoding of a map" and whether it is worth putting some time (experimental or literature) into finding a good BDD ordering and encoding technique.
For each OBJECT_SIZE expression to be encoded by the propositional back-end we previously created a select statement over all objects, mapping a match to the object's size. This yields a number of constraints linear in the number of objects times the number of OBJECT_SIZE expressions.

For a particular benchmark at hand these were 2047 OBJECT_SIZE expressions over 1669 objects. Post-processing exceeded 30GB of memory after several minutes of constructing constraints.

For most programs, however, the number of distinct object sizes will be small. In the above benchmark there are only 8 distinct object sizes (over those 1669 objects). The new encoding groups objects by size, forming a disjunction of the object identifiers for each distinct size. BDDs are used to compute compact representations of these disjunctions.

With this new approach, post-processing on the above benchmark completes in less than 10 seconds and memory never exceeds 3GB.

While at it, use the same trick for compacting the encoding of IS_DYNAMIC_OBJECT(ptr).
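The grouping step can be sketched roughly as follows. This is an illustrative Python model, not CBMC's actual C++ implementation; function names and data shapes are hypothetical, and the BDD compaction of each disjunction is elided. It only counts the terms each encoding would emit, which is where the size difference comes from:

```python
# Illustrative sketch of the two encodings; names and structure are
# hypothetical, not CBMC's actual implementation.
from collections import defaultdict


def old_encoding_terms(objects, n_exprs):
    """One (pointer-object == id) test per object, per OBJECT_SIZE
    expression: objects * expressions terms in total."""
    return [(e, oid, size)
            for e in range(n_exprs)
            for oid, size in objects.items()]


def new_encoding_terms(objects, n_exprs):
    """Group objects by size, then emit one disjunction of object
    identifiers per distinct size. In the PR, each disjunction is
    further compacted via a BDD; that step is elided here."""
    by_size = defaultdict(list)
    for oid, size in objects.items():
        by_size[size].append(oid)
    return [(e, size, ids)
            for e in range(n_exprs)
            for size, ids in by_size.items()]


# Mirror the benchmark's shape: 1669 objects and 2047 OBJECT_SIZE
# expressions, with only 4 distinct sizes for illustration.
objects = {oid: (8, 16, 24, 32)[oid % 4] for oid in range(1669)}
print(len(old_encoding_terms(objects, 2047)))  # 1669 * 2047 = 3416443
print(len(new_encoding_terms(objects, 2047)))  # 4 * 2047 = 8188
```

The per-expression cost drops from the number of objects to the number of distinct sizes, which explains why the benchmark with only 8 distinct sizes over 1669 objects improves so dramatically.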
More benchmarking has now been done across 582 harnesses that use CBMC in GitHub's CI. Across these, develop at revision caf9768 requires 16451 seconds when using MiniSat as back-end. With the changes in this PR, the total time is 13405 seconds. There are instances of slow-downs, but in those cases the combination of the changes proposed in PRs #7996, #8041, #8042 on top of this PR makes the overall execution faster again. This demonstrates that on such benchmarks the time required by MiniSat is highly dependent on peculiarities of the encoding. The maximum observed slowdown is one case going from 173 seconds up to 394 seconds. The maximum speed-up, however, is one case going from 1006 seconds down to 279 seconds. On the same benchmarks and CBMC versions but using CaDiCaL as back-end it takes 29584 seconds with develop, and 30759 seconds with this PR. So it may still be worth further investigating optimisations of the encoding as proposed by @martin-cs, but the current state appears to be a reasonable starting point.