Compact propositional encoding of OBJECT_SIZE(ptr) #7702

Merged

Conversation

Collaborator

@tautschnig tautschnig commented May 5, 2023

For each OBJECT_SIZE expression to be encoded by the propositional back-end we previously created a select statement over all objects, mapping a match to the object's size. This yielded a number of constraints proportional to the number of objects times the number of OBJECT_SIZE expressions.

For one particular benchmark at hand this meant 2047 OBJECT_SIZE expressions over 1669 objects. Post-processing exceeded 30GB of memory after several minutes of constructing constraints.

For most programs the number of distinct object sizes will be small. In the above benchmark there are only 8 distinct object sizes (over those 1669 objects). The new encoding therefore groups objects by size and forms, for each distinct size, a disjunction over the matching object identifiers. BDDs are used to compute compact representations of these disjunctions.

With this new approach, post-processing on the above benchmark completes in less than 10 seconds and memory never exceeds 3GB.

While at it, use the same trick for compacting the encoding of IS_DYNAMIC_OBJECT(ptr).
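To illustrate the grouping idea, here is a minimal, self-contained C++ sketch. It is not the actual CBMC implementation; the object table and identifiers are made up. Objects are bucketed by their size, so each OBJECT_SIZE encoding only needs one disjunction per distinct size, which a BDD can then compact further.

```cpp
// Hypothetical illustration of grouping objects by size; not CBMC code.
#include <cstddef>
#include <iostream>
#include <map>
#include <vector>

int main()
{
  // Made-up object table: object identifier -> size in bytes.
  const std::map<std::size_t, std::size_t> object_sizes = {
    {0, 8}, {1, 16}, {2, 8}, {3, 32}, {4, 16}};

  // Invert the table: size -> all object identifiers of that size.
  std::map<std::size_t, std::vector<std::size_t>> objects_by_size;
  for(const auto &entry : object_sizes)
    objects_by_size[entry.second].push_back(entry.first);

  // One constraint per distinct size now suffices, of the shape
  //   (obj == id_1 || obj == id_2 || ...) ==> OBJECT_SIZE(ptr) == size,
  // and the disjunction over identifiers can be compacted with a BDD.
  for(const auto &group : objects_by_size)
  {
    std::cout << "size " << group.first << ": objects";
    for(const std::size_t id : group.second)
      std::cout << ' ' << id;
    std::cout << '\n';
  }
}
```

With 1669 objects but only 8 distinct sizes, this replaces roughly 1669 cases per OBJECT_SIZE expression with 8 grouped constraints.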

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@tautschnig tautschnig added the aws Bugs or features of importance to AWS CBMC users label May 5, 2023
@tautschnig tautschnig requested a review from martin-cs as a code owner May 5, 2023 21:49
@tautschnig tautschnig self-assigned this May 5, 2023
@codecov

codecov bot commented May 5, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison: base (74075ec) at 79.09% vs. head (c53b111) at 79.10%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #7702      +/-   ##
===========================================
+ Coverage    79.09%   79.10%   +0.01%     
===========================================
  Files         1699     1699              
  Lines       196512   196575      +63     
===========================================
+ Hits        155428   155508      +80     
+ Misses       41084    41067      -17     


@tautschnig tautschnig force-pushed the cleanup/object-size-encoding branch 2 times, most recently from 0bfe1a9 to 1545504 on May 8, 2023 10:48
@tautschnig tautschnig removed their assignment May 8, 2023
@remi-delmas-3000
Collaborator

Experiments on a FreeRTOS proof that uses contracts (which in turn introduce a lot of r_ok instances) show a reduction too:

  • 10% fewer variables,
  • 30% fewer clauses,
  • peak SAT memory consumption decreased from 6 GB to 700 MB,
  • runtime decreased from 1300s to 500s.

@remi-delmas-3000
Collaborator

On other models I observed up to 3x regressions in runtime, so I think we should make this an opt-in feature with a command-line switch.

@tautschnig tautschnig self-assigned this May 15, 2023
@tautschnig
Collaborator Author

> On other models I observed up to 3x regressions in runtime, so I think we should make this an opt-in feature with a command-line switch.

For the record: this is the list_is_valid task from https://github.com/remi-delmas-3000/unbounded-proofs-with-cbmc, where CBMC has the following resource usage:

  • develop:
    • SAT solver CaDiCaL: 37 seconds, 1.8 GB of memory
    • SAT solver kissat: 43 seconds, 800 MB of memory
  • this branch (BDDs):
    • SAT solver CaDiCaL: 103 seconds, 2.2 GB of memory
    • SAT solver kissat: 40 seconds, 788 MB of memory

So I do acknowledge the slow-down with CaDiCaL on this benchmark (for the record, MiniSat is not able to solve these benchmarks in a reasonable amount of time), but I'd argue that in other cases the proposed change makes previously infeasible benchmarks solvable.

@tautschnig tautschnig removed their assignment May 17, 2023
Collaborator

@martin-cs martin-cs left a comment

Tentative yes. Concerns:

  • Does this need a more extensive benchmarking run?
  • I feel like this is not the only place this technique could or should be used. I have pondered doing this for constant arrays or even arrays in general. This makes me wonder whether we need a general interface for "compact encoding of a map" and whether it is worth putting some time (experimental or literature) into finding a good BDD ordering and encoding technique.
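As a purely hypothetical illustration of what a general "compact encoding of a map" interface could look like (this is not existing CBMC code, just a sketch of the idea), one could collect key/value pairs and hand back one group per distinct value, leaving it to the caller to encode each group as a single, possibly BDD-compacted, disjunction:

```cpp
// Hypothetical sketch of a generic "compact map encoding" helper; not CBMC code.
#include <functional>
#include <map>
#include <vector>

template <typename keyt, typename valuet>
class compact_map_encodert
{
public:
  // Record that `key` maps to `value`.
  void add(const keyt &key, const valuet &value)
  {
    groups[value].push_back(key);
  }

  // Call `emit` once per distinct value with all keys sharing that value; the
  // callback would build one (possibly BDD-compacted) disjunction over the
  // keys instead of one constraint per key.
  void for_each_group(
    const std::function<void(const valuet &, const std::vector<keyt> &)> &emit)
    const
  {
    for(const auto &group : groups)
      emit(group.first, group.second);
  }

private:
  std::map<valuet, std::vector<keyt>> groups;
};
```

Mapping object identifiers to their sizes through such an interface would reproduce the grouping used in this PR, and the same mechanism could in principle back a compact encoding of constant arrays.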

@martin-cs martin-cs removed their assignment Sep 18, 2023
@tautschnig tautschnig force-pushed the cleanup/object-size-encoding branch from 1545504 to c53b111 on November 15, 2023 16:20
@tautschnig
Collaborator Author

More benchmarking has now been done across 582 harnesses that use CBMC in GitHub's CI. Across these, develop at revision caf9768 requires 16451 seconds when using MiniSat as the back-end, while with the changes in this PR the total time is 13405 seconds. There are instances of slow-downs, but in those cases the combination of the changes proposed in PRs #7996, #8041, #8042 on top of this PR makes the overall execution faster again. This demonstrates that on such benchmarks the time required by MiniSat is highly sensitive to peculiarities of the encoding. The maximum observed slowdown is one case going from 173 seconds up to 394 seconds; the maximum speed-up is one case going from 1006 seconds down to 279 seconds.

On the same benchmarks and CBMC versions, but using CaDiCaL as the back-end, the total time is 29584 seconds with develop and 30759 seconds with this PR.

So it may still be worth investigating further optimisations of the encoding as proposed by @martin-cs, but the current state appears to be a reasonable starting point.

@tautschnig tautschnig merged commit 9220c3a into diffblue:develop Jan 22, 2024
@tautschnig tautschnig deleted the cleanup/object-size-encoding branch January 22, 2024 14:50