Skip to content
This repository was archived by the owner on Oct 3, 2021. It is now read-only.

Tasks from PR #808 contain undefined behavior #813

Open
MartinSpiessl opened this issue Sep 10, 2019 · 10 comments
Open

Tasks from PR #808 contain undefined behavior #813

MartinSpiessl opened this issue Sep 10, 2019 · 10 comments

Comments

@MartinSpiessl
Copy link
Contributor

At least some of the tasks added in PR #808 contain overflows of signed integer variables, i.e. undefined behavior. Examples are divbin.c in line 30(multiplication of nondeterminisic signed integer with factor 2) and lcm1.c in line 31(multiplication of two nondeterministic signed integers).

I ran CPAchecker's overflow analysis for the tasks with nla-digibench-overflow.txt (had to name it *.txt because github does not allow uploading xml files -.-).

~/CPAchecker $ scripts/benchmark.py test/test/stet/nla-digibench-overflow.txt

The results with a timeout of 60 seconds look like this:

knuth.c                 ERROR (parsing failed)    1.61    1.63
freire1.c               unknown                   2.71    2.75
lcm2.c                  false(no-overflow)        2.87    2.91
prodbin.c               unknown                   2.98    3.00
bresenham.c             false(no-overflow)        2.92    2.97
freire2.c               unknown                   3.08    3.10
geo1.c                  false(no-overflow)        3.09    3.11
geo2.c                  false(no-overflow)        3.26    3.28
fermat1.c               false(no-overflow)        3.23    3.27
divbin.c                false(no-overflow)        3.68    3.72
hard.c                  false(no-overflow)        5.19    5.21
dijkstra.c              false(no-overflow)        7.33    7.35
cohendiv.c              false(no-overflow)       10.48   10.52
cohencu.c               false(no-overflow)       10.87   10.89
fermat2.c               false(no-overflow)        4.23    4.41
egcd3.c                 false(no-overflow)       35.42   35.45
ps6.c                   TIMEOUT                  60.79   60.82
ps4.c                   TIMEOUT                  60.84   60.87
lcm1.c                  TIMEOUT                  60.83   60.85
egcd.c                  TIMEOUT                  60.94   60.97
egcd2.c                 TIMEOUT                  60.84   60.86
ps3.c                   TIMEOUT                  61.05   61.08
prod4br.c               TIMEOUT                  60.89   60.93
ps5.c                   TIMEOUT                  61.06   61.09
sqrt1.c                 TIMEOUT                  61.06   61.11
ps2.c                   TIMEOUT                  61.07   61.12
mannadiv.c              TIMEOUT                  61.18   61.22
geo3.c                  TIMEOUT                  61.18   61.22

note that CPAchecker did not find the overflow in lcm1.c in time, so there could also be obvious overflows in tasks where CPAchecker did run into a timeout. knuth.c failed because it still has preprocessor directives in it, but it contains an overflow in line 31 as well (calculation of 8*n where n is a non-deterministic signed integer).

@nguyenthanhvuh
Copy link
Contributor

Hi @MartinSpiessl, thanks for checking these overflow problems. I am not sure how to fix them. For example,


Suggestion ?

@MartinSpiessl
Copy link
Contributor Author

There are two possible approaches:

  1. limit the range of values for the variables with __VERIFIER_assume
  2. change variable type to unsigned, since operations on unsigned variables are well defined in the C standard (wrap around on overflow as expected). In this case, It could of course happen that the program does not compute the right results anymore, i.e., the verdict might be wrong. So I would go with 1.

Here, it seems that the worst case is when B==A. 2*B should then still be less or equal than INT_MAX. So since INT_MAX is odd(2147483647 on 32bit):

A_MAX = (INT_MAX-1)/2 = 1073741823;

So probably something like:

__VERIFIER_assume(A<=1073741823);

should work. B does not have to be limited, since if it is bigger than a, it will never be doubled.
I might have made a mistake, so please check my reasoning. Also this is only considering positive numbers, I did not consider what happens if A is negative, which will likely trigger undefined behavior later as well.

@dbeyer
Copy link
Member

dbeyer commented Oct 11, 2019

@nguyenthanhvuh Would it be possible for you to file a new pull request for fixing this?

@nguyenthanhvuh
Copy link
Contributor

@dbeyer, yes, will create a PR in a couple of days.

@nguyenthanhvuh
Copy link
Contributor

@MartinSpiessl I've made some changes and created a PR #837. Could you take a look to see if these changes fix the issues? Thanks,

@MartinSpiessl
Copy link
Contributor Author

Sorry, I must have missed the notification for this issue.
I checked the files for which you submitted fixes in PR #837 :

11:26:20   knuth.i                 true                      3.56    1.30
11:26:20   divbin.i                true                      3.53    1.31
11:27:27   lcm1.c                  true                      3.43    1.26
11:27:27   lcm2.c                  false(no-overflow)        3.68    1.36

All fixes except lcm2.c seem to work. That one still has an overflow in line 31 with the following data state:

a: 32769
b: 65535
x: 32769
y: 65535
u: 65535
v: 32769

Looks like the overflow is caused by:

x*u + y*v = 32769*65535+65535*32769 = 4295032830 > INT_MAX

a and b are now unsigned, but the implicit type conversions to unsigned int will only happen after the result of x*u + y*v is calculated, which is of type (signed) int. Since the value calculated is also bigger than UINT_MAX, I am not so sure whether this program will still make sense if we just change x,u,y,v to unsigned int. Probably reducing the bound for the variable values further is neccessary.

@timosantonopoulos
Copy link
Contributor

Hi, the bound for the variables was reduced to half what it was before. Does this resolve the issue?

@MartinSpiessl
Copy link
Contributor Author

I do not see any updates on lcm2.c. If I reduce the bound in lcm2.c to (65/2) manually then yes, I am not able to find an overflow anymore (this is no proof that the tasks is overflow-free, so ideally the tasks get a verdict true_nooverflow added and are added to the overflow category, this way some tool might be able to come up with a proof or show a counterexample).

@timosantonopoulos
Copy link
Contributor

timosantonopoulos commented Nov 22, 2019

Hi, the updates I was referring to can be seen at: https://github.com/nguyenthanhvuh/sv-benchmarks/blob/master/c/nla-digbench/lcm2.c where the variables a and b are bounded to 32768. Would that work?

@MartinSpiessl
Copy link
Contributor Author

No, that doesn't work, off by one! With both upper values set to 32767 there is however no overflow anymore, so change 32768 to 32767 and it should be fine 👍

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

No branches or pull requests

4 participants