Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix failed coverage in if then else with return statements #8001

Merged

Conversation

esteffin
Copy link
Contributor

@esteffin esteffin commented Nov 7, 2023

Fixes #7974

Avoid adding useless goto blocks in if statements with return statements.

When the parsed AST is transformed in GOTO language the if statements are translated in: if (!c) goto else; then_branch; goto endif; else: else_branch; endif: nop and if (!c) goto endif; then_branch; endif: nop.

This PR avoids to add the goto endif command and the endif label if the then_branch already ends with a return or a goto statement.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Sorry, something went wrong.

Enrico Steffinlongo added 2 commits November 7, 2023 16:19
When the parsed AST is transformed in GOTO language the if conditions
are translated in: `if (!c) goto else; then_branch; goto endif;
else: else_branch; endif: nop` and `if (!c) goto endif; then_branch;
endif: nop`.

This commit avoids to add the `goto endif` command and the `endif` label
if the `then_branch` ends with a `return` or a `goto` statement.

The benefit is that the `--cover location` command returns a
more precise result as unreachable code blocks that are added by CBMC's
frontend are not marked as unreachable.
This commit fixes the regression tests failing because of the
optimization to the way `if` statements are translated to GOTO.

Specifically it fixes
 - the expected result of `--cover location` in 2 `cbmc-cover`
   regression tests
 - the index of blocks in 3 regression tests from `cbmc` and `cprover`.
Copy link

codecov bot commented Nov 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (9238a38) 77.75% compared to head (701bf9c) 77.92%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8001      +/-   ##
===========================================
+ Coverage    77.75%   77.92%   +0.17%     
===========================================
  Files         1701     1701              
  Lines       196465   196468       +3     
===========================================
+ Hits        152762   153101     +339     
+ Misses       43703    43367     -336     
Files Coverage Δ
src/goto-programs/goto_convert.cpp 88.13% <100.00%> (+0.03%) ⬆️

... and 97 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@esteffin
Copy link
Contributor Author

esteffin commented Nov 7, 2023

@kroening This PR involves some changes to a regression test of the cprover tool.

Can you please have a look that the changes are ok?

@kroening
Copy link
Member

kroening commented Nov 8, 2023

I am on the fence on this PR. It's of course true that the code after a return won't ever be covered. But where do we draw the line? Say why not include __noreturn functions? This eventually becomes the full program reachability problem, which is what the cover feature is meant to solve.

@thomasspriggs
Copy link
Contributor

I am on the fence on this PR. It's of course true that the code after a return won't ever be covered. But where do we draw the line? Say why not include __noreturn functions? This eventually becomes the full program reachability problem, which is what the cover feature is meant to solve.

@kroening I was on the fence as per my comments on #7974
However I have reconsidered since my initial comments, such that I am currently in favour of improving our coverage output based on avoiding introducing unreachable code in conversion to goto. I consider the unreachable goal to be unhelpful in this case because it does not correspond to a user statement. This is even clearer if you consider that the issue affects this variation on the input example without the brackets -

C example
int nondet_int();

int main() {
    if (nondet_int())
        return -1;
    else if (nondet_int())
        return -1;
    else
        return -1;
    return 0;
}
coverage output
** coverage results:
[main.coverage.1] file ./no_bracket_example.c line 4 function main block 1 (lines ./no_bracket_example.c:main:4): SATISFIED
[main.coverage.2] file ./no_bracket_example.c line 5 function main block 2 (lines ./no_bracket_example.c:main:5): SATISFIED
[main.coverage.3] file ./no_bracket_example.c line 5 function main block 3 (lines ./no_bracket_example.c:main:5): FAILED
[main.coverage.4] file ./no_bracket_example.c line 6 function main block 4 (lines ./no_bracket_example.c:main:6): SATISFIED
[main.coverage.5] file ./no_bracket_example.c line 7 function main block 5 (lines ./no_bracket_example.c:main:7): SATISFIED
[main.coverage.6] file ./no_bracket_example.c line 7 function main block 6 (lines ./no_bracket_example.c:main:7): FAILED
[main.coverage.7] file ./no_bracket_example.c line 9 function main block 7 (lines ./no_bracket_example.c:main:9): SATISFIED
[main.coverage.8] file ./no_bracket_example.c line 10 function main block 8 (lines ./no_bracket_example.c:main:10): FAILED
[main.coverage.9] file ./no_bracket_example.c line 11 function main block 9 (lines ./no_bracket_example.c:main:11): SATISFIED

** 6 of 9 covered (66.7%)

In this example there is no closing } for the then/true case which is unreachable as there is no code block in the source code, just a single statement.

This output also demonstrates that we are being inconsistent with how we are defining coverage. We have the helpful result in the case that the final return is unreachable (block 8). There are unhelpful goal results stating that the end of each "then true case" code block is unreachable (blocks 3 and 6). But we are missing the equivalent "unhelpful" result for the unreachable end of the "else false case" code block. Therefore the output appears to be inconsistent from a user/source perspective.

I would agree with you that we should not be partially solving the reachability problem by performing dead code elimination as part of the coverage instrumentation for example. However the way we report coverage at the moment either requires additional manual interpretation of the coverage results or mismatched goal elimination by a post-processing tool. Either solution seems potentially error-prone to me. Therefore I think it would be beneficial if we could do better in terms of our coverage reporting, rather than deferring to solutions outside of cbmc.

With regards to drawing the line between performing a full reachability analysis and improving the coverage output - I think I would currently draw the line at "we should not introduce dead code during during conversion which does not correspond to source language statements or source language branches". For a goal to really be useful it needs to map back to the source language in a way that makes sense. The change proposed in this PR gets us closer to that ideal in my opinion. With the way coverage instrumentation currently fits into the code base fixing this issue in the conversion is about as close as we can get to avoiding performing a reachability analysis to define reachability/coverage goals. The solution would look more like a reachability analysis the later we were to do it. There may be other ways the coverage analysis could be archectected in order to avoid issues like this one. But I don't see these longer term alternatives as a reason not to merge this PR, which is an improvement from a user perspective.

To refer back to the example of __noreturn function calls - my reading of the C standard is that this specifier is treated more as a statement of intent than a guarantee. "The implementation should produce a diagnostic message for a function declared with a _Noreturn function specifier that appears to be capable of returning to its caller." This seems to indicate a warning rather than an error on compiling __noreturn functions containing return statements/reaching the end. As such I don't think we should disregard the possibility of calls to __noreturn functions actually returning at the stage of conversion to goto.

@feliperodri feliperodri added aws Bugs or features of importance to AWS CBMC users aws-high labels Nov 13, 2023
Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is an improvement in the right place. It might not be perfect, but it does improve our code generation.

Comment on lines +1642 to +1648
// When the `then` branch of a balanced `if` condition ends with a `return` or
// a `goto` statement, it is not necessary to add the `goto z` and `z:` goto
// elements as they are never used.
// This helps for example when using `--cover location` as such command are
// marked unreachable, but are not part of the user-provided code to analyze.
bool then_branch_returns = dest.instructions.rbegin()->is_goto();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit pick: I would suggest to move this block just after CHECK_RETURN(!tmp_w.instructions.empty()); and use tmp_w instead of dest.

@esteffin esteffin merged commit 3e2cbc5 into diffblue:develop Nov 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users aws-high
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GOTO conversion of return inside if-then-else generates unreachable instructions
7 participants