Fix failed coverage in if then else with return statements #8001

esteffin · 2023-11-07T16:27:51Z

Avoid adding useless goto blocks in if statements with return statements.

When the parsed AST is transformed in GOTO language the if statements are translated in: if (!c) goto else; then_branch; goto endif; else: else_branch; endif: nop and if (!c) goto endif; then_branch; endif: nop.

This PR avoids to add the goto endif command and the endif label if the then_branch already ends with a return or a goto statement.

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
White-space or formatting changes outside the feature-related changed lines are in commits of their own.

When the parsed AST is transformed in GOTO language the if conditions are translated in: `if (!c) goto else; then_branch; goto endif; else: else_branch; endif: nop` and `if (!c) goto endif; then_branch; endif: nop`. This commit avoids to add the `goto endif` command and the `endif` label if the `then_branch` ends with a `return` or a `goto` statement. The benefit is that the `--cover location` command returns a more precise result as unreachable code blocks that are added by CBMC's frontend are not marked as unreachable.

This commit fixes the regression tests failing because of the optimization to the way `if` statements are translated to GOTO. Specifically it fixes - the expected result of `--cover location` in 2 `cbmc-cover` regression tests - the index of blocks in 3 regression tests from `cbmc` and `cprover`.

codecov · 2023-11-07T17:32:30Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (9238a38) 77.75% compared to head (701bf9c) 77.92%.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #8001      +/-   ##
===========================================
+ Coverage    77.75%   77.92%   +0.17%     
===========================================
  Files         1701     1701              
  Lines       196465   196468       +3     
===========================================
+ Hits        152762   153101     +339     
+ Misses       43703    43367     -336

Files	Coverage Δ
src/goto-programs/goto_convert.cpp	`88.13% <100.00%> (+0.03%)`	⬆️

... and 97 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

esteffin · 2023-11-07T21:25:49Z

@kroening This PR involves some changes to a regression test of the cprover tool.

Can you please have a look that the changes are ok?

kroening · 2023-11-08T07:06:26Z

I am on the fence on this PR. It's of course true that the code after a return won't ever be covered. But where do we draw the line? Say why not include __noreturn functions? This eventually becomes the full program reachability problem, which is what the cover feature is meant to solve.

thomasspriggs · 2023-11-08T20:19:33Z

I am on the fence on this PR. It's of course true that the code after a return won't ever be covered. But where do we draw the line? Say why not include __noreturn functions? This eventually becomes the full program reachability problem, which is what the cover feature is meant to solve.

@kroening I was on the fence as per my comments on #7974
However I have reconsidered since my initial comments, such that I am currently in favour of improving our coverage output based on avoiding introducing unreachable code in conversion to goto. I consider the unreachable goal to be unhelpful in this case because it does not correspond to a user statement. This is even clearer if you consider that the issue affects this variation on the input example without the brackets -

C example

int nondet_int();

int main() {
    if (nondet_int())
        return -1;
    else if (nondet_int())
        return -1;
    else
        return -1;
    return 0;
}

coverage output

** coverage results:
[main.coverage.1] file ./no_bracket_example.c line 4 function main block 1 (lines ./no_bracket_example.c:main:4): SATISFIED
[main.coverage.2] file ./no_bracket_example.c line 5 function main block 2 (lines ./no_bracket_example.c:main:5): SATISFIED
[main.coverage.3] file ./no_bracket_example.c line 5 function main block 3 (lines ./no_bracket_example.c:main:5): FAILED
[main.coverage.4] file ./no_bracket_example.c line 6 function main block 4 (lines ./no_bracket_example.c:main:6): SATISFIED
[main.coverage.5] file ./no_bracket_example.c line 7 function main block 5 (lines ./no_bracket_example.c:main:7): SATISFIED
[main.coverage.6] file ./no_bracket_example.c line 7 function main block 6 (lines ./no_bracket_example.c:main:7): FAILED
[main.coverage.7] file ./no_bracket_example.c line 9 function main block 7 (lines ./no_bracket_example.c:main:9): SATISFIED
[main.coverage.8] file ./no_bracket_example.c line 10 function main block 8 (lines ./no_bracket_example.c:main:10): FAILED
[main.coverage.9] file ./no_bracket_example.c line 11 function main block 9 (lines ./no_bracket_example.c:main:11): SATISFIED

** 6 of 9 covered (66.7%)

In this example there is no closing } for the then/true case which is unreachable as there is no code block in the source code, just a single statement.

This output also demonstrates that we are being inconsistent with how we are defining coverage. We have the helpful result in the case that the final return is unreachable (block 8). There are unhelpful goal results stating that the end of each "then true case" code block is unreachable (blocks 3 and 6). But we are missing the equivalent "unhelpful" result for the unreachable end of the "else false case" code block. Therefore the output appears to be inconsistent from a user/source perspective.

I would agree with you that we should not be partially solving the reachability problem by performing dead code elimination as part of the coverage instrumentation for example. However the way we report coverage at the moment either requires additional manual interpretation of the coverage results or mismatched goal elimination by a post-processing tool. Either solution seems potentially error-prone to me. Therefore I think it would be beneficial if we could do better in terms of our coverage reporting, rather than deferring to solutions outside of cbmc.

With regards to drawing the line between performing a full reachability analysis and improving the coverage output - I think I would currently draw the line at "we should not introduce dead code during during conversion which does not correspond to source language statements or source language branches". For a goal to really be useful it needs to map back to the source language in a way that makes sense. The change proposed in this PR gets us closer to that ideal in my opinion. With the way coverage instrumentation currently fits into the code base fixing this issue in the conversion is about as close as we can get to avoiding performing a reachability analysis to define reachability/coverage goals. The solution would look more like a reachability analysis the later we were to do it. There may be other ways the coverage analysis could be archectected in order to avoid issues like this one. But I don't see these longer term alternatives as a reason not to merge this PR, which is an improvement from a user perspective.

To refer back to the example of __noreturn function calls - my reading of the C standard is that this specifier is treated more as a statement of intent than a guarantee. "The implementation should produce a diagnostic message for a function declared with a _Noreturn function specifier that appears to be capable of returning to its caller." This seems to indicate a warning rather than an error on compiling __noreturn functions containing return statements/reaching the end. As such I don't think we should disregard the possibility of calls to __noreturn functions actually returning at the stage of conversion to goto.

tautschnig

I believe this is an improvement in the right place. It might not be perfect, but it does improve our code generation.

tautschnig · 2023-11-17T09:20:27Z

src/goto-programs/goto_convert.cpp

+  // When the `then` branch of a balanced `if` condition ends with a `return` or
+  // a `goto` statement, it is not necessary to add the `goto z` and `z:` goto
+  // elements as they are never used.
+  // This helps for example when using `--cover location` as such command are
+  // marked unreachable, but are not part of the user-provided code to analyze.
+  bool then_branch_returns = dest.instructions.rbegin()->is_goto();
+


Nit pick: I would suggest to move this block just after CHECK_RETURN(!tmp_w.instructions.empty()); and use tmp_w instead of dest.

Enrico Steffinlongo added 2 commits November 7, 2023 16:19

esteffin requested review from kroening, tautschnig and peterschrammel as code owners November 7, 2023 16:27

esteffin requested a review from a team November 7, 2023 16:28

esteffin mentioned this pull request Nov 7, 2023

GOTO conversion of return inside if-then-else generates unreachable instructions #7974

Closed

thomasspriggs approved these changes Nov 7, 2023

View reviewed changes

TGWDB approved these changes Nov 7, 2023

View reviewed changes

feliperodri added aws Bugs or features of importance to AWS CBMC users aws-high labels Nov 13, 2023

feliperodri requested a review from remi-delmas-3000 November 13, 2023 17:17

feliperodri assigned peterschrammel and tautschnig Nov 13, 2023

tautschnig approved these changes Nov 17, 2023

View reviewed changes

tautschnig assigned esteffin and unassigned tautschnig and peterschrammel Nov 17, 2023

esteffin merged commit 3e2cbc5 into diffblue:develop Nov 20, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix failed coverage in if then else with return statements #8001

Fix failed coverage in if then else with return statements #8001

Uh oh!

esteffin commented Nov 7, 2023

Uh oh!

codecov bot commented Nov 7, 2023

Uh oh!

esteffin commented Nov 7, 2023

Uh oh!

kroening commented Nov 8, 2023

Uh oh!

thomasspriggs commented Nov 8, 2023

Uh oh!

tautschnig left a comment

Uh oh!

tautschnig Nov 17, 2023

Uh oh!

Uh oh!

Fix failed coverage in if then else with return statements #8001

Fix failed coverage in if then else with return statements #8001

Uh oh!

Conversation

esteffin commented Nov 7, 2023

Uh oh!

codecov bot commented Nov 7, 2023

Codecov Report

Uh oh!

esteffin commented Nov 7, 2023

Uh oh!

kroening commented Nov 8, 2023

Uh oh!

thomasspriggs commented Nov 8, 2023

Uh oh!

tautschnig left a comment

Choose a reason for hiding this comment

Uh oh!

tautschnig Nov 17, 2023

Choose a reason for hiding this comment

Uh oh!

Uh oh!