Updated readme

cmu-transparency · May 1, 2023 · 6c680d5 · 6c680d5
1 parent 6b5b724
commit 6c680d5
Show file tree

Hide file tree

Showing 3 changed files with 256 additions and 119 deletions.
diff --git a/README.md b/README.md
@@ -7,11 +7,6 @@ The core of the checker is written in [Why3](https://why3.lri.fr/), which is ext
 * The current implementation is not optimized, and will be considerably slower than [DRAT-trim](https://github.com/marijnheule/drat-trim) on large proofs (see [performance](#performance) below).
 * Accordingly, the frontend does not accept proofs in binary format.
 
-The verification can be checked by running `src/librupchecker/rup_pure.mlw` in Why3. 
-Most of the verification conditions complete with the `Auto level 0` tactic, and the rest either with a few levels of splitting followed by `Auto 0` or `Auto 1`, or simply with `Auto 2`.
-It was developed using Why3 1.5.1, Alt-Ergo 2.4.0, Z3 4.8.6, and CVC4 1.8.
-Verification has not been attempted with earlier versions of Why3 or the provers.
-
 ## Installation
 
 If you use a recent Linux distribution on x86_64, you should be able to install the compiled wheel from PyPI:
@@ -62,52 +57,46 @@ For more information visit https://github.com/cmu-transparency/verified_rup
 
 ### As a Python module
 
-See the documentation for details of the API.
-The primary function is `drup.check_proof`, or alternatively, `drup.check_derivation` to check each step of the proof, ignoring the absence of an empty clause).
-```python
-def check_proof(formula : Cnf, proof : Proof, verbose : bool = False) -> CheckerResult:
-  """Check a sequence of RUP and RAT clauses against a CNF. Inputs are Python iterables
-  of clauses, where each clause is an iterable of signed Python ints.
-
-  Args:
-    formula (Cnf): Cnf as an iterable of clauses.
-    proof (Proof): Iterable of RUP or RAT clauses.
-    verbose (bool, optional): Return detailed information
-      if the check fails. Defaults to False.
-
-  Returns:
-    CheckerResult: CheckerResult struct representing the result of the check.
-  
-  Raises:
-    ValueError: If the formula or proof cannot be formatted.
-  """
-```
-This takes a CNF and Proof as an iterable of iterables of signed integers, and returns a `CheckerResult`.
-```python
-class CheckerResult:
-
-  '''
-  Result of a proof check.
+See the [documentation](http://fairlyaccountable.org/verified_rup/drup.html) for details of the API.
+The primary function is `drup.check_proof`, or alternatively, `drup.check_derivation` to check each step of the proof, ignoring the absence of an empty clause). There are corresponding convenience functions `check_proof_from_strings` and `check_proof_from_files`, similarly for `check_derivation`.
 
-  Attributes:
+The following example uses [CNFgen](https://massimolauria.net/cnfgen/) to generate a PHP instance,
+and [PySAT](https://pysathq.github.io/) to solve it and generate a DRUP proof.
+To illustrate the verbose output given for failed checks, only the first ten clauses of the proof are checked against the proof.
 
-    outcome (`Outcome`): The outcome of the check. If the check
-      succeeded, this will be Outcome.VALID. If the check failed,
-      this will be Outcome.INVALID.
-
-    steps (`Optional[Cnf]`): Completed proof steps prior to an invalid step, 
-      if the proof was invalid.
+```python
+import drup
+import cnfgen
+from pysat.formula import CNF
+from pysat.solvers import Solver
+
+dimacs = cnfgen.BinaryPigeonholePrinciple(4, 3).dimacs()
+cnf = CNF(from_string=dimacs).clauses
+g4 = Solver(name='g4', with_proof=True, bootstrap_with=cnf)
+g4.solve()
+proof = [[int(l) for l in c.split(' ')[:-1]] for c in g4.get_proof()]
+
+drup.check_proof(cnf[:10], proof, verbose=True)
+```
 
-    rup_info (`Optional[RupInfo]`): Information on a failed RUP check,
-      if the proof was invalid.
+This gives a `CheckerResult` object with the following information:
 
-    rat_info (`Optional[RatInfo]`): Information on a failed RAT check,
-      if the proof was invalid. The RAT clause in this object will
-      be the same as the RUP clause in `rup_info`.
-  '''
+```python
+CheckerResult(
+	Outcome.INVALID, 
+	[], 
+	RupInfo([4, 6, 7, 8], [-4, -6, -7, -8, 3, 5]), 
+	RatInfo(
+		[4, 6, 7, 8], 
+		[-4, -3], 
+		RupInfo([6, 7, 8, -3], [-6, -7, -8, 5, 3, -4])
+	)
+)
 ```
-There are corresponding convenience functions `check_proof_from_strings` and `check_proof_from_files`, similarly for `check_derivation`.
 
+The `RupInfo` component relates that RUP verification failed on the first clause of the proof, `4 6 7 8 0`, after propagating the literals `-4, -6, -7, -8, 3, 5`, and failing to find more opposites to propagate.
+Likewise, the `RatInfo` component relates that RAT verification on this clause failed when checking the pivot on clause `-4 -3 0`.
+The resolvent of these clauses, `6 7 8 -3 0`, failed RUP verification after propagating `-6 -7 -8 5 3 -4`.
 
 ## Performance
 
@@ -132,6 +121,82 @@ In the table below, each configuration is run on 10,000 instances, with proofs g
 
 [[1]](https://openreview.net/forum?id=HJMC_iA5tm) Daniel Selsam, Matthew Lamm, Benedikt Bünz, Percy Liang, Leonardo de Moura, David L. Dill. *Learning a SAT Solver from Single-Bit Supervision*. International Conference on Learning Representations (ICLR), 2019.
 
+## Verification
+
+The verification can be examined by running `src/librupchecker/rup_pure.mlw` in Why3, or by checking the Why3 session in `src/librupchecker/rup_pure/why3session.xml`. 
+The proof was developed using Why3 1.5.1, Alt-Ergo 2.4.0, Z3 4.8.6, and CVC4 1.8.
+Verification has not been attempted with earlier versions of Why3 or the provers.
+
+The primary contract on the proof checker is as follows:
+```ocaml
+requires { no_redundant_clauses cnf /\ no_trivial_clauses cnf }
+requires { no_redundant_clauses pf /\ no_trivial_clauses pf }
+ensures { match result with
+			| Valid -> valid_proof cnf pf
+			| _ -> proof_failure orig result
+			end }
+```
+The bindings used by this library take care of removing redundant and trivial clauses.
+The `valid_proof` predicate is a straightforward translation of DRAT certification requirements.
+A `proof_failure` result provides additional assurances about the verbose output of the checker.
+```ocaml
+predicate proof_failure (cnf : cnf) (result : result) =
+    match result with
+    | Valid -> false
+    | InvalidEmpty steps rup_info -> rup_failure (steps ++ cnf) rup_info
+    | InvalidStep steps rup_info rat_info -> 
+        rup_failure (steps ++ cnf) rup_info /\ rat_failure (steps ++ cnf) rat_info
+    end
+```
+An `InvalidEmpty` result indicates all of the listed `steps` are valid, but the empty clause was not derived, as witnessed by the provided `rup_info`.
+This is only returned when all of the steps in the proof are valid except an empty clause at the end.
+The `rup_info` component ensures that the empty clause is not RUP, and that the unit chain used to conclude this is exhaustive.
+
+```ocaml
+predicate rup_failure (cnf : cnf) (info : rup_info) =
+    not (rup cnf info.rup_clause) /\
+    match info.rup_clause with
+    | Nil -> 
+        let cnf' = bcp cnf info.chain in
+        is_unit_chain cnf info.chain /\ 
+        forall chain' . is_unit_chain cnf' chain' -> 
+                        forall c . mem c (bcp cnf' chain') <-> mem c cnf'
+    | Cons _ _ -> 
+        let cnf' = bcp (cnf ++ (negate_clause info.rup_clause)) info.chain in
+        is_unit_chain (cnf ++ (negate_clause info.rup_clause)) info.chain /\ 
+        forall chain' . is_unit_chain cnf' chain' -> 
+                        forall c . mem c (bcp cnf' chain') <-> mem c cnf'
+    end
+```
+
+An `InvalidStep` result indicates that the `steps` are valid up to some non-empty step.
+The next step in the certificate following `steps` is invalid, as witnessed by the provided `rup_info` and `rat_info`.
+In addition to the assurances on `rup_info` described above, `rat_info` provides that the identified pivot clause is not RUP.
+
+```ocaml
+predicate rat_failure (cnf : cnf) (info : rat_info) =
+    not (rat cnf info.rat_clause) /\
+    match info.rat_clause with
+    | Nil -> false
+    | Cons l _ ->
+        mem info.pivot_clause (pivot_clauses cnf l) /\
+        info.pivot_info.rup_clause = resolve info.rat_clause info.pivot_clause l /\
+        rup_failure cnf info.pivot_info
+    end
+```
+
+The derivation checker provides a similar contract, but rather than ensuring `valid_proof` on success, it provides `valid_derivation`.
+
+```ocaml
+predicate valid_derivation (cnf : cnf) (pf : proof) =
+    match pf with
+    | Nil -> true
+    | Cons c cs -> (rup cnf c \/ rat cnf c) /\ valid_derivation (Cons c cnf) cs
+    end
+```
+
+This is the same condition as `valid_proof`, but in the `Nil` case, the checker does not require that the empty clause is not RUP.
+
 ## Acknowledgements
 
 Many thanks to [Frank Pfenning](http://www.cs.cmu.edu/~fp/), [Joseph Reeves](https://www.cs.cmu.edu/~jereeves/), and [Marijn Huele](https://www.cs.cmu.edu/~mheule/) for the ongoing insightful discussions that led to this project.