Skip to content

Performance problem with --pointer-checks, --pointer-primitive-checks #7378

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
remi-delmas-3000 opened this issue Nov 23, 2022 · 6 comments
Closed
Assignees
Labels
aws Bugs or features of importance to AWS CBMC users aws-medium Performance Optimisations

Comments

@remi-delmas-3000
Copy link
Collaborator

remi-delmas-3000 commented Nov 23, 2022

This branch contains the problematic model https://github.com/remi-delmas-3000/cbmc/tree/record-write-v-minimized

The s2n_record_writev.c model reveals several problems, described in regression/contracts/s2n_record_writev/RUNME.sh.

In short:

  • translating the model with goto-cc s2n_record_writev.c produces a model a.out that can be analysed directly by cbmc with pointer checks and pointer primitive checks without problems.
  • running a single pass of goto-instrument a.out b.out without options turns a.out into a slightly different model b.out that makes cbmc --pointer-checks --pointer-primitive-checks b.out explode in size during CNF translation
  • A nondet pointer assignment to a pointer packed in a struct s2n_hash_state seems to have a side effect on another pointer of the struct (removing the nondet assignment from the C file solves all performance problems);
  • Slight alterations of the PO formulation assert(pointer_offset(object) >= constant) instead of assert(!(pointer_offset(object) < constant)) seems to result in large CNF size differences;
  • Possibly lacking common subexpression reuse in automatically generated rw_ok checks and pointer checks (the expression of the pointer being checked contains byte_extracts and has multiple individual occurrences in the check expression.);
  • What could also be happening is that the goto-instrument pass expands the rw_ok assertion contained in a.out and that the resulting expressions in b.out get themselves instrumented as if they were user code. (this could explain the difference in number of generated POs between a.out (Generated 445 VCC(s), 19 remaining after simplification) and b.out (Generated 529 VCC(s), 103 remaining after simplification)).

It could be possible to add disable:pointer-check, disable:pointer-primitive-check, disable:pointer-overflow-check pragmas to avoid the expanded expression being itself instrumented in later passes (if this is what happens).

CBMC version: 5.70 +
Operating system: linux, macOS
Exact command line resulting in the issue: see regression/contracts/s2n_record_writev/RUNME.sh
What behaviour did you expect: analysis terminates
What happened instead: analysis saturates RAM (128gigs)

@jimgrundy
Copy link
Collaborator

Can we get a status update?

@feliperodri
Copy link
Collaborator

feliperodri commented Feb 27, 2023

@tautschnig is the root cause the same as #7377?

@thomasspriggs
Copy link
Contributor

#7395 is a step towards this but not a complete solution.

@tautschnig
Copy link
Collaborator

#7702 might be the next piece in this puzzle. Evaluation on the benchmarks of this issue yet to be done.

@tautschnig
Copy link
Collaborator

A belated update on this:

  1. Move CPROVER_{r,w,rw}_ok processing to back-end #7395 addresses the problem of C source vs GOTO having different performance characteristics.
  2. There are several smaller improvements that substantially speed up symbolic execution (Value sets: make pointers nondet only once #7716 makes the biggest difference here, and has an interplay with Move CPROVER_{r,w,rw}_ok processing to back-end #7395: with that patch, we have substantially fewer calls to dereferencing).
  3. The challenging part in producing the CNF encoding are byte extracts with non-constant offsets, or more specifically offsets that are POINTER_OFFSET(ptr). (The propositional encoding thereof is a case split over all possible byte offsets -- it's not clear to me that we could do much better than that.) Those arise from non-deterministic pointers, which the value set treats as pointers pointing to any known object at an arbitrary offset. That "arbitrary offset" is what dereferencing then (has to) resolve as POINTER_OFFSET(ptr).

I'd like to propose the following, and would like to get input from @jimgrundy @kroening @remi-delmas-3000 as well as anyone else who would like to voice their opinion:

  1. I will revise Move CPROVER_{r,w,rw}_ok processing to back-end #7395 and Value sets: make pointers nondet only once #7716 to bring them into a state where they can be merged. Both of them appear to be desirable improvements, just the exact implementation needs to be adjusted.
  2. Review the performance of the benchmark named in this issue. If deemed adequate, we can stop here.
  3. Else, we should consider a fixed zero offset for nondet-pointers in value sets. While this makes them a bit more deterministic, we need to acknowledge that nondet-pointers (as introduced in enable nondeterministic pointers #6326) always were an under-approximation.

@tautschnig
Copy link
Collaborator

Resolving as of 6f3c16c: running goto-instrument after compilation no longer affects verification time, both with and without goto-instrument cbmc --pointer-primitive-check now takes approximately 30 seconds on this example, and memory consumption is 7.6 GB. cbmc --pointer-check only takes 6 seconds and requires less than 1 GB of memory (with and without goto-instrument in between).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users aws-medium Performance Optimisations
Projects
None yet
Development

No branches or pull requests

5 participants