-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HolSmt: fix Z3 proof replay in real arithmetic with rational coefficients #1188
Conversation
Oh, I just noticed that fun REAL_ARITH tm = RealArith.OLD_REAL_ARITH tm
handle HOL_ERR _ => RealField.REAL_ARITH tm; I'm sure you're aware of this, but for anyone who might not be aware,
I wonder why If the |
The reason that realLib's When |
Your current choice of using The interface provided by |
Thank you for the background info, that clarifies things for me! So it sounds like it'd be better to use I will fix the PR accordingly. Thanks! Edit: I just saw your latest comment. Now I am unsure again! |
There's no difference (on the number of solvable cases) for your two choices at this moment, because I don't think the extra problems that |
Ok, sorry, I was really confused for a moment.
Yes, I think those kind of problems (such as the one you mentioned) are already being handled by other proof rules (like So I will leave the PR as it is then. Thank you very much and sorry for the noise! |
OK, I check the code, the known cases that HOL-Light's latest version cannot solve were something like Latest HOL-Light:
HOL4:
HOL4 can do it only because I added one line (2361) in
This is mostly to make sure that unknown user proofs using the old |
(``0n <= 1n``, [thm_AUTO, thm_YO]), | ||
(``1n <= 0n``, [sat_CVC, sat_YO, sat_Z3, sat_Z3p]), | ||
(``(x:num) <= x``, [thm_AUTO, thm_YO]), | ||
(``(x:num) <= y ==> 2 * x <= 2 * y``, [thm_AUTO, thm_YO]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi. Is this problem ((x:num) <= y ==> 2 * x <= 2 * y
) supposed to be solvable only by Yices and not others? I wonder why you have removed the above 4 cases, although the 2nd one (1n <= 0m
) seems false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so there are a few things going on here:
I wonder why you have removed the above 4 cases
So, the 4 test cases are being removed because the same test cases are already being performed above:
- The original test cases:
Lines 486 to 489 in 5eb0e7b
(``0n <= 1n``, [thm_AUTO, thm_YO]), (``1n <= 0n``, [sat_CVC, sat_YO, sat_Z3, sat_Z3p]), (``(x:num) <= x``, [thm_AUTO, thm_YO]), (``(x:num) <= y ==> 42 * x <= 42 * y``, [thm_AUTO, thm_YO]), - The duplicate ones:
Lines 554 to 557 in 5eb0e7b
(``0n <= 1n``, [thm_AUTO, thm_YO]), (``1n <= 0n``, [sat_CVC, sat_YO, sat_Z3, sat_Z3p]), (``(x:num) <= x``, [thm_AUTO, thm_YO]), (``(x:num) <= y ==> 2 * x <= 2 * y``, [thm_AUTO, thm_YO]),
The duplicate ones are even in the wrong section (reals instead of nums). I suspect the duplicate ones might have been copy/pasted from the nums for reference, but someone forgot to delete them afterwards.
Is this problem
((x:num) <= y ==> 2 * x <= 2 * y)
supposed to be solvable only by Yices and not others?
Currently, yes. The HolSmt support for Yices 1 uses a different translation which supports natural numbers. However, SMT-LIB doesn't support natural numbers, only integers, so we currently translate the natural numbers into uninterpreted constants.
It is possible to translate the natural numbers to the integers in an equivalence-preserving way (like it was done in Isabelle or perhaps F*) but we don't do this yet. I intend to fix this in the future (if nobody else does it first), since this is an extremely important use-case for me.
However, I decided to work on proof reconstruction first (to become familiar with it) because I wanted to make sure that when we translate the natural numbers, that the proof reconstruction still works seamlessly -- i.e., I'd like to make sure that it also works well for the natural numbers without any ugly kludges or fragile workarounds, if possible.
the 2nd one (1n <= 0n) seems false.
It is indeed false. The HolSmt tests include tests for true statements (i.e. theorems) and false ones. When the SMT solvers can prove that the goal is false, they return sat
instead of unsat
(hence why that test case is tested with sat_CVC
, sat_Z3
and sat_Z3p
rather than thm_CVC
, thm_Z3
, and thm_Z3p
). In this case, it's easy for them to prove false statements about the natural numbers because they are allowed to assign whatever they want to uninterpreted constants, thus it's easy to assign something that makes the statement false.
This means that when we are forced to use uninterpreted constants or functions (which still happens a lot due to HolSmt limitations), a sat
answer may not always imply that the goal is false, but an unsat
answer always implies that the goal is true.
Hopefully, in the future we won't need to rely on uninterpreted symbols so much, making a sat
answer a much more reliable indicator that the goal is false.
In my to-do list, there's also a task which consists in clarifying this to the HolSmt user: when the SMT solver replies sat
, I'd like HolSmt to indicate to the user whether the sat
result may be spurious or not. And also, I'd like to make it clear that when a non-spurious sat
result is encountered, it indicates that the goal is false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your answer. Now I found these duplicated test cases in other places of selftest.sml
, so you did the right thing (of removing duplications). P.S. In my to-do list, I will figure out why those tests (dedicated for Yices 1.x) failed and try to fix them.
Wait, I made a mistake with those test cases after all. The test cases are duplicated indeed, but now I think the correct solution is not to remove the duplicates, but rather to convert them to use reals instead of nums, so that the same pattern of tests is followed for both nums and reals. I will fix this PR soon to do that. Thank you for bringing this to my attention @binghe, I'd almost surely have missed this issue otherwise. It's easy to get lost in the sea of tests 😅 |
It turns out that the existing Z3 proof reconstruction code was using RealArith.REAL_ARITH to replay linear arithmetic steps in proofs. However, RealArith.REAL_ARITH only handles integral coefficients and therefore proof steps such as `(x: real) / 1r = x` were failing. To fix this, we simply switch to RealField.REAL_ARITH, which supports rational coefficients. This allows us to enable proof reconstruction for 3 more self-tests, all related to the reals. This also means that all real-related tests in the self-tests now work with proof reconstruction. The RealArith->RealField change in the "rewrite" rule handler fixes proof steps such as these: `(rewrite (= (/ v0 1.0) v0))` and allows us to reenable the first self-test. The identical change in the "th_lemma[arith]" rule handler fixes proof steps such as the following: ((_ |th-lemma| arith farkas -41/42 1) @x76 @x77 false) ... corresponding to: [.] |- ~(x <= 0r)), [.] |- 41r / 42r * x <= 0r], F ... which allows us to reenable proof reconstruction for the other two self-tests.
These test cases were almost surely intended to test the reals, since they are in the `reals` section and identical test cases covering the nums already exist in the `nums` section. Since cvc5 and Z3 can reason about the reals (unlike the nums, currently) we can enable them for these tests. This includes Z3 with proof reconstruction.
6aba5bc
to
7b32ff2
Compare
Update: instead of eliminating the test cases, I've changed them to cover the reals now, which is almost surely what was originally intended: to test these inequalities in the reals in the same way that they are being tested in the nums and the ints. Thanks! |
Thanks both! |
Sorry, while working on the newer Z3 proof reconstruction, I unexpectedly ran into another easy fix, hence this PR.
Namely, due to a comment in the self-tests I was under the impression that Z3 2.19 wasn't printing real constants correctly, but it turns out that it does (I suspect the issue may be with Z3 2.19.1 instead). Hence, it is possible to fix the proof reconstruction for the remaining real-related test cases.
After a quick investigation, it was easy to figure out the problem.
I will reproduce the initial part of the commit message here (see the commit for more details):
Another commit was also added that simply eliminates a few self-tests that were duplicated.
Edit: instead of removing the test cases, they were fixed to do what was almost surely intended, which is to test the reals in the same way the nums and ints are being tested.
Thanks!