Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HolSmt: fix Z3 proof replay in real arithmetic with rational coefficients #1188

Merged
merged 2 commits into from
Jan 28, 2024

Conversation

someplaceguy
Copy link
Contributor

@someplaceguy someplaceguy commented Jan 26, 2024

Sorry, while working on the newer Z3 proof reconstruction, I unexpectedly ran into another easy fix, hence this PR.

Namely, due to a comment in the self-tests I was under the impression that Z3 2.19 wasn't printing real constants correctly, but it turns out that it does (I suspect the issue may be with Z3 2.19.1 instead). Hence, it is possible to fix the proof reconstruction for the remaining real-related test cases.

After a quick investigation, it was easy to figure out the problem.

I will reproduce the initial part of the commit message here (see the commit for more details):

It turns out that the existing Z3 proof reconstruction code was using
RealArith.REAL_ARITH to replay linear arithmetic steps in proofs.

However, RealArith.REAL_ARITH only handles integral coefficients and
therefore proof steps such as (x: real) / 1r = x were failing.

To fix this, we simply switch to RealField.REAL_ARITH, which supports
rational coefficients.

This allows us to enable proof reconstruction for 3 more self-tests,
all related to the reals. This also means that all real-related tests
in the self-tests now work with proof reconstruction.

Another commit was also added that simply eliminates a few self-tests that were duplicated.

Edit: instead of removing the test cases, they were fixed to do what was almost surely intended, which is to test the reals in the same way the nums and ints are being tested.

Thanks!

@someplaceguy
Copy link
Contributor Author

someplaceguy commented Jan 26, 2024

Oh, I just noticed that realLib.REAL_ARITH also exists:

  fun REAL_ARITH tm      = RealArith.OLD_REAL_ARITH tm
                           handle HOL_ERR _ => RealField.REAL_ARITH tm;

I'm sure you're aware of this, but for anyone who might not be aware, RealArith.OLD_REAL_ARITH is described in src/real/selftest.sml as:

The original version and old port by Hurd

I wonder why realLib's version first tries the old port and only if it fails, it tries the new one. Is it because the old port is faster (or solves different goals)?

If the realLib version is preferred, I can switch the code to use realLib.REAL_ARITH rather than RealField.REAL_ARITH directly.

@binghe
Copy link
Member

binghe commented Jan 26, 2024

The reason that realLib's REAL_ARITH first tries the old port and then the new one (after #1043), is because the old port can solve some (less interesting) inputs that the new port (same as HOL-Light's current version) cannot. To strictly align with HOL-Light's REAL_ARITH (for porting proofs from HOL-Light to HOL4), I chose to not (this is not actually true, as further explained in later comments, and the old port is 2x-3x faster) further enhance the new port to cover those (known) extra inputs.

When HolSmt was written, HOL4's REAL_ARITH was the old port, and the original author of HolSmt has only REAL_ARITH with integral coefficients. I'm glad to see that the new port of REAL_ARITH can also enrich the set of solvable cases of HolSmt.

@binghe
Copy link
Member

binghe commented Jan 26, 2024

Your current choice of using RealField.REAL_ARITH is good, I think. It's slightly slower but given the HolSmt code has no ability to actively choose different versions (by checking the coefficients), it has no choice but to always use the full version.

The interface provided by realLib is basically for users who doesn't care which version to use.

@someplaceguy
Copy link
Contributor Author

someplaceguy commented Jan 26, 2024

Thank you for the background info, that clarifies things for me!

So it sounds like it'd be better to use realLib rather than RealField, since that would allow HolSmt to reconstruct more proofs.

I will fix the PR accordingly. Thanks!

Edit: I just saw your latest comment. Now I am unsure again!

@binghe
Copy link
Member

binghe commented Jan 26, 2024

Thank you for the background info, that clarifies things for me!

So it sounds like it'd be better to use realLib rather than RealField, since that would allow HolSmt to reconstruct more proofs.

I will fix the PR accordingly. Thanks!

Edit: I just saw your latest comment. Now I am unsure again!

There's no difference (on the number of solvable cases) for your two choices at this moment, because I don't think the extra problems that OLD_REAL_ARITH can solve, may appear inside HolSmt. Those "extra" problems solvable by OLD_REAL_ARITH are something like |- f (x + x) = f (2 * x), where f is an arbitrary uninterpreted function irrelevant to reals. Basically, by doing some extra rewritings, the new REAL_ARITH can also solve them.

@someplaceguy
Copy link
Contributor Author

someplaceguy commented Jan 26, 2024

Ok, sorry, I was really confused for a moment.

Those "extra" problems solvable by OLD_REAL_ARITH are something like |- f (x + x) = f (2 * x), where f is an arbitrary uninterpreted function irrelevant to reals.

Yes, I think those kind of problems (such as the one you mentioned) are already being handled by other proof rules (like monotonicity, in this case), so I think RealField should be enough, as you say. And of course, if it turns out not to be the case, it can always be fixed later if necessary.

So I will leave the PR as it is then.

Thank you very much and sorry for the noise!

@binghe
Copy link
Member

binghe commented Jan 26, 2024

Ok, sorry, I was really confused for a moment.

Those "extra" problems solvable by OLD_REAL_ARITH are something like |- f (x + x) = f (2 * x), where f is an arbitrary uninterpreted function irrelevant to reals.

Yes, I think those kind of problems (such as the one you mentioned) are already being handled by other proof rules (like monotonicity, in this case), so I think RealField should be enough, as you say. And of course, if it turns out not to be the case, it can always be fixed later if necessary.

So I will leave the PR as it is then.

Thank you very much and sorry for the noise!

OK, I check the code, the known cases that HOL-Light's latest version cannot solve were something like |- closed {x | 1 * x = a} <=> closed {x | x = a}, found when building core library with the new port. Then I put some extra code into RealArith.sml (at line 2361) to make sure such cases can also be solved. But there's no way to know what else problems that the old port can do but the new one cannot, because their implementations are completely separate, no shared code at all.

Latest HOL-Light:

# REAL_ARITH `{x | 1 * x = a} = {x | x = a}`;;
Exception: Failure "linear_ineqs: no contradiction".

HOL4:

> RealArith.REAL_ARITH ``{x | 1 * x = a} = {x | x = a}``;
<<HOL message: more than one resolution of overloading was possible>>
val it = |- {x | 1 * x = a} = {x | x = a}: thm

HOL4 can do it only because I added one line (2361) in RealArith.sml with some historical notes:

    (* NOTE: this step of POLY_CONV helps by cutting off real arith terms
       hidding in propositional terms, e.g. ‘closed {x | 1 * x = a}’ will
       be simplified to ‘closed {x | x = a}’ before going to NNF_CONV.
       Some HOL4 proofs rely on this. *)
    TOP_DEPTH_CONV POLY_CONV THENC

This is mostly to make sure that unknown user proofs using the old REAL_ARITH continue to work, although the proof is easily fixable.

(``0n <= 1n``, [thm_AUTO, thm_YO]),
(``1n <= 0n``, [sat_CVC, sat_YO, sat_Z3, sat_Z3p]),
(``(x:num) <= x``, [thm_AUTO, thm_YO]),
(``(x:num) <= y ==> 2 * x <= 2 * y``, [thm_AUTO, thm_YO]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi. Is this problem ((x:num) <= y ==> 2 * x <= 2 * y) supposed to be solvable only by Yices and not others? I wonder why you have removed the above 4 cases, although the 2nd one (1n <= 0m) seems false.

Copy link
Contributor Author

@someplaceguy someplaceguy Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so there are a few things going on here:

I wonder why you have removed the above 4 cases

So, the 4 test cases are being removed because the same test cases are already being performed above:

  • The original test cases:

    HOL/src/HolSmt/selftest.sml

    Lines 486 to 489 in 5eb0e7b

    (``0n <= 1n``, [thm_AUTO, thm_YO]),
    (``1n <= 0n``, [sat_CVC, sat_YO, sat_Z3, sat_Z3p]),
    (``(x:num) <= x``, [thm_AUTO, thm_YO]),
    (``(x:num) <= y ==> 42 * x <= 42 * y``, [thm_AUTO, thm_YO]),
  • The duplicate ones:

    HOL/src/HolSmt/selftest.sml

    Lines 554 to 557 in 5eb0e7b

    (``0n <= 1n``, [thm_AUTO, thm_YO]),
    (``1n <= 0n``, [sat_CVC, sat_YO, sat_Z3, sat_Z3p]),
    (``(x:num) <= x``, [thm_AUTO, thm_YO]),
    (``(x:num) <= y ==> 2 * x <= 2 * y``, [thm_AUTO, thm_YO]),

The duplicate ones are even in the wrong section (reals instead of nums). I suspect the duplicate ones might have been copy/pasted from the nums for reference, but someone forgot to delete them afterwards.

Is this problem ((x:num) <= y ==> 2 * x <= 2 * y) supposed to be solvable only by Yices and not others?

Currently, yes. The HolSmt support for Yices 1 uses a different translation which supports natural numbers. However, SMT-LIB doesn't support natural numbers, only integers, so we currently translate the natural numbers into uninterpreted constants.

It is possible to translate the natural numbers to the integers in an equivalence-preserving way (like it was done in Isabelle or perhaps F*) but we don't do this yet. I intend to fix this in the future (if nobody else does it first), since this is an extremely important use-case for me.

However, I decided to work on proof reconstruction first (to become familiar with it) because I wanted to make sure that when we translate the natural numbers, that the proof reconstruction still works seamlessly -- i.e., I'd like to make sure that it also works well for the natural numbers without any ugly kludges or fragile workarounds, if possible.

the 2nd one (1n <= 0n) seems false.

It is indeed false. The HolSmt tests include tests for true statements (i.e. theorems) and false ones. When the SMT solvers can prove that the goal is false, they return sat instead of unsat (hence why that test case is tested with sat_CVC, sat_Z3 and sat_Z3p rather than thm_CVC, thm_Z3, and thm_Z3p). In this case, it's easy for them to prove false statements about the natural numbers because they are allowed to assign whatever they want to uninterpreted constants, thus it's easy to assign something that makes the statement false.

This means that when we are forced to use uninterpreted constants or functions (which still happens a lot due to HolSmt limitations), a sat answer may not always imply that the goal is false, but an unsat answer always implies that the goal is true.

Hopefully, in the future we won't need to rely on uninterpreted symbols so much, making a sat answer a much more reliable indicator that the goal is false.

In my to-do list, there's also a task which consists in clarifying this to the HolSmt user: when the SMT solver replies sat, I'd like HolSmt to indicate to the user whether the sat result may be spurious or not. And also, I'd like to make it clear that when a non-spurious sat result is encountered, it indicates that the goal is false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your answer. Now I found these duplicated test cases in other places of selftest.sml, so you did the right thing (of removing duplications). P.S. In my to-do list, I will figure out why those tests (dedicated for Yices 1.x) failed and try to fix them.

@someplaceguy someplaceguy marked this pull request as draft January 26, 2024 20:39
@someplaceguy
Copy link
Contributor Author

someplaceguy commented Jan 26, 2024

Wait, I made a mistake with those test cases after all. The test cases are duplicated indeed, but now I think the correct solution is not to remove the duplicates, but rather to convert them to use reals instead of nums, so that the same pattern of tests is followed for both nums and reals.

I will fix this PR soon to do that.

Thank you for bringing this to my attention @binghe, I'd almost surely have missed this issue otherwise. It's easy to get lost in the sea of tests 😅

It turns out that the existing Z3 proof reconstruction code was using
RealArith.REAL_ARITH to replay linear arithmetic steps in proofs.

However, RealArith.REAL_ARITH only handles integral coefficients and
therefore proof steps such as `(x: real) / 1r = x` were failing.

To fix this, we simply switch to RealField.REAL_ARITH, which supports
rational coefficients.

This allows us to enable proof reconstruction for 3 more self-tests,
all related to the reals. This also means that all real-related tests
in the self-tests now work with proof reconstruction.

The RealArith->RealField change in the "rewrite" rule handler fixes
proof steps such as these: `(rewrite (= (/ v0 1.0) v0))` and allows
us to reenable the first self-test.

The identical change in the "th_lemma[arith]" rule handler fixes
proof steps such as the following:

((_ |th-lemma| arith farkas -41/42 1) @x76 @x77 false)

... corresponding to:

[.] |- ~(x <= 0r)),  [.] |- 41r / 42r * x <= 0r], F

... which allows us to reenable proof reconstruction for the other two
self-tests.
These test cases were almost surely intended to test the reals, since
they are in the `reals` section and identical test cases covering the
nums already exist in the `nums` section.

Since cvc5 and Z3 can reason about the reals (unlike the nums,
currently) we can enable them for these tests. This includes Z3 with
proof reconstruction.
@someplaceguy someplaceguy marked this pull request as ready for review January 26, 2024 21:46
@someplaceguy
Copy link
Contributor Author

Update: instead of eliminating the test cases, I've changed them to cover the reals now, which is almost surely what was originally intended: to test these inequalities in the reals in the same way that they are being tested in the nums and the ints.

Thanks!

@mn200 mn200 merged commit a423628 into HOL-Theorem-Prover:develop Jan 28, 2024
2 checks passed
@mn200
Copy link
Member

mn200 commented Jan 28, 2024

Thanks both!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants