Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid redundant variables and clauses in the propositional encoding #7001

Draft
wants to merge 19 commits into
base: develop
Choose a base branch
from

Conversation

tautschnig
Copy link
Collaborator

Please review commit-by-commit, with some changes likely being more controversial than others. Preliminary data is that regression/cbmc/ has four cases where fewer SAT solver iterations are required, and even discounting these the total number of variables decreases from 6573323 to 6206840 (-5.9%), the number of clauses decreases from 23756793 to 22971234 (-3.4%) (comparing against current develop, i.e., 09f91ee). A proper performance evaluation is yet to be done.

  • Each commit message has a non-empty body, explaining why the change was made.
  • n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@codecov
Copy link

codecov bot commented Jul 7, 2022

Codecov Report

Merging #7001 (07378c7) into develop (a326820) will decrease coverage by 0.03%.
The diff coverage is 96.66%.

@@             Coverage Diff             @@
##           develop    #7001      +/-   ##
===========================================
- Coverage    77.86%   77.82%   -0.04%     
===========================================
  Files         1569     1567       -2     
  Lines       180995   181091      +96     
===========================================
+ Hits        140929   140937       +8     
- Misses       40066    40154      +88     
Impacted Files Coverage Δ
src/solvers/flattening/bv_utils.h 86.20% <ø> (+0.49%) ⬆️
src/solvers/sat/satcheck_minisat2.cpp 62.91% <ø> (-0.25%) ⬇️
src/solvers/smt2/smt2_conv.cpp 68.78% <90.90%> (+0.02%) ⬆️
src/solvers/flattening/bv_utils.cpp 79.28% <95.83%> (-0.33%) ⬇️
src/goto-symex/build_goto_trace.cpp 87.00% <100.00%> (-0.63%) ⬇️
src/goto-symex/symex_target_equation.cpp 95.08% <100.00%> (-0.27%) ⬇️
src/solvers/flattening/boolbv.cpp 87.08% <100.00%> (ø)
src/solvers/flattening/boolbv_map.cpp 74.41% <100.00%> (+0.60%) ⬆️
src/solvers/flattening/bv_pointers.cpp 85.62% <100.00%> (+0.03%) ⬆️
src/solvers/prop/prop.h 85.36% <100.00%> (+1.58%) ⬆️
... and 43 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@tautschnig tautschnig force-pushed the cleanup/improve-encoding branch 5 times, most recently from 7c83918 to 627f38b Compare July 14, 2022 22:02
@tautschnig tautschnig force-pushed the cleanup/improve-encoding branch from 627f38b to 0985628 Compare July 20, 2022 07:35
@martin-cs
Copy link
Collaborator

Interesting but seems to have quite a few unrelated commits in it. Like changing C++ version. Is this intentional?

@tautschnig
Copy link
Collaborator Author

Interesting but seems to have quite a few unrelated commits in it. Like changing C++ version. Is this intentional?

Temporarily so: yes. I am experimenting with different solvers, and Mergesat is one of them. The two commits at the beginning of the branch are those in #6439.

tautschnig and others added 18 commits August 12, 2022 12:36
This raises the minimum GCC version supported to 7 (see
https://gcc.gnu.org/projects/cxx-status.html#cxx17).

Add -Wno-register to silence warnings about the use of the `register`
storage class by older flex versions (as present on macOS).

CMake doesn't know about C++ 17 until version 3.8, so we need to
hard-code the necessary command-line options.
MergeSat is a recent SAT solver that fits the MiniSat2 interface. Run
the check-ubuntu-20_04-cmake-gcc CI job with MergeSat to continuously
confirm that this configuration actually works.

To use MergeSat as a proper SAT backend, some extensions might be dropped.
Especially being able to print memory usage is not helpful for CBMC, but
requires pulling in a new dependency.

Signed-off-by: Norbert Manthey <[email protected]>
Testing thousands of configuration this appeared to improve performance.
Avoid calling `lcnf` with constants when the actual solving back-end
will just filter them out.
Don't make every caller create an uninitialised literal.
We can safely record the values of expressions by adding `expr == expr`
as constraints in order to be able to fetch and display them in the GOTO
trace. This was already being done for declarations. Introducing new
symbols just adds unnecessary variables to the formula.
Avoid creating equalities over the postponed bitvector when the object
literals trivially aren't equal, and directly encode bitwise equality
when the object literals are trivially equal (and stop searching for a
matching object). In all other cases, avoid unnecessary Tseitin
variables to encode the postponed bitvector equality.
There is no need to create any further bit-level equalities when the
result will be false anyway. On aws-c-common/aws_hash_iter_next, this
reduces the number of Boolean variables from 2291304 to 584628, and the
time spent in CaDiCaL from 191 seconds to 22 seconds.
This avoids creating a variable that is equal to the truth value of the
equality. While at it, also add a call to merge_irep to clean up the
equality expression that was created.
convert_assumptions already took care of the conversion, and we can just
use the resulting literal to force that to true.
Converting assignments (equalities) permits the optimisation of re-using
variables generated for a right-hand side to represent the left-hand
side. Consequently, no free variables need to be introduced for those
left-hand sides when the symbol is seen. If converting, e.g., guards
before assignments, no such optimisation is possible.
There is no need to call set_literals for `A == A`, and if we end up
in set_literals via another code path we can still avoid generating the
trivially-true equality of `l == l`.
Instead of the dance of creating free variables and then setting them
equal to the result of some circuit just pass in the known output
signals and create the circuit using those directly.
This reverts commit 90ba9f9d2c0e61287723199f06304becd5a4d24d.
This reduces the number of Boolean variables at the expense of
introducing more clauses. No measurable difference with CaDiCaL, minor
slowdown with MiniSat.
This makes solving trivial in presence of temporary intermediate
variables.
@tautschnig tautschnig force-pushed the cleanup/improve-encoding branch from a366df2 to 07378c7 Compare August 12, 2022 12:55
@thomasspriggs
Copy link
Contributor

There is a fix for the compatibility between this PR and the new SMT backend here - thomasspriggs@00c420b

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants