
Proper ZK treatment in plonky2 #1625

Open · wants to merge 23 commits into main
Conversation

@Nashtare (Collaborator)

@LindaGuiga I'm opening a draft PR to be able to comment on the code

@LindaGuiga marked this pull request as ready for review September 17, 2024 11:21
@LindaGuiga (Contributor) commented Sep 17, 2024

This PR aims to address #1625, based on this note: https://eprint.iacr.org/2024/1037.pdf.

  • For the batch FRI polynomial, we take a random polynomial of twice the degree of the subgroup, so that we can add a FRI step of arity 2 instead of computing two different FRI proofs (one for the lower half and one for the higher half of the polynomial, as mentioned in the note).
  • For the quotient polynomial chunks, we reduce the degree n of each chunk to n - h, so that we can add to them random polynomials with degree n.
  • For the third point in #1625, the current implementation randomizing the wire polynomials already seems to follow the guidelines in the paper. Indeed, the degree h is currently computed as h_1 = D + num_fri_openings for the wire polynomials and h_2 = 2 * D + num_fri_openings for the permutation polynomial Z, where D is the extension degree. The difference between the two values is that the wire polynomials are only opened at zeta, while the Z polynomial is also opened at g*zeta. h_1 is added to all wire polynomials, while h_2 is only added to the routed wires. This is in accordance with the prescription h >= 2 * (D * n_DEEP + n_FRI) (Eq. 13) in the paper, for the case where the quotient chunks are computed the canonical way and randomized. (Note that the factor 2 in Eq. 13 comes from the evaluation at zeta and g*zeta, but we only evaluate at zeta for the wire polynomials, as explained before.) A hedged sketch of this computation is given right after this list.
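
For concreteness, a minimal sketch of the two blinding degrees described in the last bullet (function and parameter names are hypothetical; this is not the actual plonky2 code):

```rust
/// Hedged sketch only: how the blinding degrees h_1 and h_2 from the last bullet
/// could be computed. `extension_degree` stands for D and `num_fri_openings` for the
/// total number of FRI openings; both names are assumptions for illustration.
fn blinding_degrees(extension_degree: usize, num_fri_openings: usize) -> (usize, usize) {
    // h_1: wire polynomials, opened only at zeta.
    let h_1 = extension_degree + num_fri_openings;
    // h_2: routed wires feeding the permutation polynomial Z, opened at zeta and g*zeta.
    let h_2 = 2 * extension_degree + num_fri_openings;
    (h_1, h_2)
}
```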

@Nashtare added this to the System strengthening milestone Sep 17, 2024
@Nashtare added the soundness (Soundness related changes) label Sep 25, 2024
@4l0n50 (Contributor) left a comment

Just did a first pass and mostly pointed out nits.

@4l0n50 (Contributor) commented Sep 27, 2024

For the batch FRI polynomial, we take a random polynomial with twice the degree of the subgroup, so that we can add a FRI step with arity 2 instead of computing 2 different FRI proofs (for the lower half and higher half of the polynomial, as mentioned in the note).

I guess this is what the note meant (Protocol 2, isn't it?): why do two proofs if you can batch them and compute only one? And batching them amounts to doing FRI for the larger polynomial.
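
To make the arity-2 step concrete, here is an illustrative coefficient-form sketch of one arity-2 FRI fold (plain integers stand in for field elements; this is not the plonky2 FRI code). Writing R(X) = R_even(X^2) + X * R_odd(X^2), a single fold with challenge beta yields R_even(Y) + beta * R_odd(Y), a polynomial of half the degree, so the degree-2n random polynomial only costs one extra folding step instead of a second FRI proof.

```rust
/// Illustrative only: one arity-2 FRI folding step in coefficient form.
/// `coeffs` are the coefficients of R(X); the result has half the length and
/// corresponds to R_even(Y) + beta * R_odd(Y). Plain i64 arithmetic is used
/// instead of field arithmetic purely for the sketch.
fn fold_arity_2(coeffs: &[i64], beta: i64) -> Vec<i64> {
    coeffs
        .chunks(2)
        .map(|pair| {
            let even = pair[0];
            let odd = if pair.len() > 1 { pair[1] } else { 0 };
            even + beta * odd
        })
        .collect()
}
```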

@LindaGuiga (Contributor)

After an initial review by @ulrich-haboeck, it turned out that the random R polynomial does not actually need to have a higher degree than the batch FRI polynomial. Indeed, the randomization of wires is done using some space in the witness domain, which means that h is "included" in the batch FRI polynomial already.

Moreover, he also mentioned that num_blinding_gates (and similarly computed_h) can be updated to include only num_fri_queries instead of num_fri_openings, thanks to the new batch FRI polynomial randomization.

I therefore updated the implementation to include both changes.
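
For illustration, here is a hedged sketch of the updated count, assuming it keeps the same structure as the h_1 formula in the PR description (the function name and the exact formula are assumptions, not the actual plonky2 code):

```rust
/// Hedged sketch only, not the actual plonky2 code: after the batch-FRI masking
/// polynomial is introduced, the blinding count only needs to scale with the number
/// of FRI *queries* rather than the total number of FRI *openings*.
fn num_blinding_gates(extension_degree: usize, num_fri_queries: usize) -> usize {
    // Previously the second term was `num_fri_openings` (see the PR description above).
    extension_degree + num_fri_queries
}
```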

@ulrich-haboeck (Collaborator) left a comment

Here's my feedback:

  • We would improve proof sizes by gathering all round-2 polynomials (i.e. the partial products from the permutation argument, and the pole sums and the table authenticator sum for the lookup argument) under a common Merkle root, and likewise the round-3 polynomials (the components of the quotient polynomial and the zk masking polynomial for batch-FRI).
  • I was not able to find out whether the lookup argument polynomials from the second round (for the "pole sums" over the table and witness areas) are randomized. If not, we need to do this by expanding the randomization of the "regular" polynomials to these auxiliary ones as well.
  • Due to the treatment of the permutation argument, the current implementation is only statistically zero-knowledge:
    For the prover to be able to craft a valid proof, the round-1 verifier challenges (beta, gamma) must not produce a zero in any of the partial products. Hence, with each valid proof the verifier learns a little piece of information about the witnesses, namely that all the linear factors of the (virtual) permutation argument polynomial
Sigma(X,Y) = \prod_{i,x} (X - x - w_i(x) * Y), 

where the product ranges over all wired columns of the chip, are non-zero at (beta, gamma). (Funnily, this is also the case for Plonk's randomization, see footnote 10 on zero-knowledge in the Plonk paper.)

In my opinion, perfect zero-knowledge would be a nice feature. But that would come at a certain extra cost:

  1. The technique from the mir blog would need to be replaced by a strategy that works for every (beta, gamma).
    A naive approach would cost double the number of auxiliary columns, by proving
\prod_{i,x} (beta - x - w_i(x) * gamma),

and

\prod_{i,x} (beta - sigma_{i}(x) - w_i(x) * gamma)

via separate running products, with their start values enforced to be 1, and their end values enforced to be equal.
Randomization of the partial products can be done by regular noop-gates plus a selector for the permutation argument (excluding the zk area on the chip), or by multiples of the domain vanishing polynomial. The latter randomization allows keeping the same number of columns per "partial lookup", if one implements a "greedy" evaluation logic of their constraints. (A hedged sketch of the two separate running products is given at the end of this comment.)

  2. Although not strictly needed, it would be good practice to do proper error handling for the case when the lookup random challenge (alpha, ChallengeA) hits a zero of the (virtual) table polynomial
t(X,Y) = \prod_i (X - t_{i,0} - t_{i,1}* Y),

where the product ranges over all table entries t_i=(t_{i,0}, t_{i,1}) of the functional relation to be looked up.

Sorry for the bad formatting - markdown is a pain.
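
To make item 1 above concrete, here is a hedged sketch of the two separate running products (generic over a field-like type; the trait and all names are purely illustrative, not the plonky2 API):

```rust
use core::ops::{Add, Mul, Sub};

/// Stand-in for a field type; illustrative only.
pub trait FieldLike: Copy + Add<Output = Self> + Sub<Output = Self> + Mul<Output = Self> {
    fn one() -> Self;
}

/// Builds the two running products from item 1:
///   num[k]   = prod_{j < k} (beta - x_j     - w_j * gamma)
///   denom[k] = prod_{j < k} (beta - sigma_j - w_j * gamma)
/// Both start at 1; the constraint system would enforce that the two end values are
/// equal, so no (possibly zero) linear factor ever has to be inverted.
pub fn running_products<F: FieldLike>(
    beta: F,
    gamma: F,
    xs: &[F],
    w: &[F],
    sigma: &[F],
) -> (Vec<F>, Vec<F>) {
    let mut num = vec![F::one()];
    let mut denom = vec![F::one()];
    for j in 0..xs.len() {
        let last_n = *num.last().unwrap();
        let last_d = *denom.last().unwrap();
        num.push(last_n * (beta - xs[j] - w[j] * gamma));
        denom.push(last_d * (beta - sigma[j] - w[j] * gamma));
    }
    (num, denom)
}
```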

Comment on lines 223 to 224
alpha.shift_poly(&mut final_poly);
final_poly += quotient;
Collaborator

Same here, as in the batch FRI oracle. Shouldn't we shift quotient?

        .map(|p| (p.oracle_index == PlonkOracle::R.index) as usize)
        .sum();
    let last_poly = polynomials.len() - nb_r_polys * (idx == 0) as usize;
    let evals = polynomials[..last_poly]
        .iter()
        .map(|p| {
            let poly_blinding = instance.oracles[p.oracle_index].blinding;
Collaborator

Actually, for the line below: what is the purpose of the && here?

Contributor

I am unsure (I did not change this code), but I assume this is to allow us to have polynomials that we do not necessarily want to blind?

@ulrich-haboeck (Collaborator)

Giving perfect zero-knowledge another thought, the following weakened constraints on a running product should suffice. For simplicity, I explain it for a single-column Plonk with a single witness column w(X) and a single pre-computed polynomial sigma(X) for the permutation of the witness domain H. (A generalization to a meaningful chip is straightforward.)
The usual constraints for the running product Z(X) are Z(1) = 1 and

Z(g*x) * (beta - x - w(x)*gamma) = Z(x) * (beta - sigma(x) - w(x)*gamma)

for every x in H. (Again, these are in general not satisfiable at the wrap-around point if one of the linear factors beta - x - w(x)*gamma is zero.) To allow the prover to succeed even when one of these factors is zero, we weaken the domain of the above constraint to H \ {g^{-1}}, i.e. the witness domain except the wrap-around point g^{-1}, and instead demand that the last value of the running product is either zero or one. Since

last_val * (beta - x - w(x)*gamma) = Z(x) * (beta - sigma(x) - w(x)*gamma)

with x = g^{-1}, we can enforce this by demanding that

Z(x) * (beta - sigma(x) - w(x)*gamma)  
    * [ Z(g*x) * (beta - x - w(x)*gamma) - Z(x) * (beta - sigma(x) - w(x)*gamma) ] = 0

at x = g^{-1}, resulting in a degree 5 constraint, including selector.
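
A hedged, generic sketch of evaluating this weakened wrap-around constraint (the field type and all names are placeholders, not plonky2's constraint API):

```rust
use core::ops::{Mul, Sub};

/// Illustrative only: the weakened constraint enforced at x = g^{-1}.
/// It vanishes iff either Z(x) * (beta - sigma(x) - w(x)*gamma) = 0, or the usual
/// transition Z(g*x) * (beta - x - w(x)*gamma) = Z(x) * (beta - sigma(x) - w(x)*gamma)
/// holds there. Degree 5 once a selector is included, as stated above.
fn weakened_wraparound<F>(z_x: F, z_gx: F, beta: F, gamma: F, x: F, w_x: F, sigma_x: F) -> F
where
    F: Copy + Sub<Output = F> + Mul<Output = F>,
{
    let lhs = z_gx * (beta - x - w_x * gamma);
    let rhs = z_x * (beta - sigma_x - w_x * gamma);
    rhs * (lhs - rhs)
}
```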

@ulrich-haboeck (Collaborator)

Also, we can take the same random (beta, gamma) for all three arguments in round 2: the permutation argument, the lookup argument, and proving the table authenticator t(beta, gamma).

@ulrich-haboeck (Collaborator) commented Oct 16, 2024

Actually, there is a gap in the above constraints, which can still prevent the prover from succeeding in certain cases. Notably, this gap also occurs in the Halo2 book (thanks to @Al-Kindi-0 for the reference, and also for proposing the countermeasure):
Again in the single-column case, assume that one of the linear terms is zero, and that this term occurs first (as x runs through 1, g, g^2, ...) on the Z(gx) side of the transitional constraint, i.e.

Z(gx) * (alpha - x - beta*w(x)) - Z(x) * (alpha - sigma(x) - beta*w(x)) = 0,

where the first product, Z(gx) * (alpha - x - beta*w(x)), is the one forced to zero by the vanishing linear term.

While this leaves Z(gx) undetermined, it forces Z(x) to be zero (unless sigma(x) = x) at that x, and consequently Z must be zero at all points before it, down to x = 1, conflicting with the demanded constraint Z(1) = 1. To patch this, we additionally use the linear term (alpha - x - beta*w(x)) to mute the constraint,

 (alpha - x - beta*w(x)) 
    * [Z(gx) * (alpha - x - beta*w(x)) -  Z(x) * (alpha - sigma(x) - beta*w(x))] = 0,

for all x in H \{g^{-1}}, resulting in a degree 4 constraint (including selector).

Let me point out that the solution described here is tailored for AIRs, and can hopefully be further optimized.
In the case of Plonk, where we can randomize "on chip" using noop gates, one can additionally reduce the degree for the end value, by enforcing Z(x), for x at the boundary of the zk area, to carry the final value of the product.
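
For completeness, a hedged sketch of the patched ("muted") transitional constraint in the same style as before (placeholder names, single-column setting, not the plonky2 API):

```rust
use core::ops::{Mul, Sub};

/// Illustrative only: the transitional constraint multiplied by the linear term
/// (alpha - x - beta*w(x)). When that term vanishes, the whole constraint is muted,
/// so Z(g*x) is left free instead of forcing Z(x) = 0. Degree 4 including a selector,
/// as stated above.
fn muted_transition<F>(z_x: F, z_gx: F, alpha: F, beta: F, x: F, w_x: F, sigma_x: F) -> F
where
    F: Copy + Sub<Output = F> + Mul<Output = F>,
{
    let factor = alpha - x - beta * w_x;
    let transition = z_gx * factor - z_x * (alpha - sigma_x - beta * w_x);
    factor * transition
}
```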

@ulrich-haboeck (Collaborator)

Corrected a mistaken comment on the common Merkle root for each round. Round 2 is fine as implemented (gathering the permutation and lookup argument polynomials), but Round 3 gives us the opportunity to put the masking polynomial R(X) and the quotient components under the same Merkle root. Not sure if this is implemented that way.

@LindaGuiga (Contributor)

I was not able to find out whether the lookup argument polynomials from the second round (for the "pole sums" over table and witness area) are randomized. If not, we need to do this by expanding the randomization of "regular" polynomial (and likewise) to auxiliary ones.

I'm sorry, I'm not sure I understand what you mean here. What are the "regular" and "auxiliary" polynomials here? Do you mean we should randomize the SLDC polynomials in some way?

@ulrich-haboeck (Collaborator)

Exactly. One could probably do a more fine-grained analysis, similar to the permutation argument polys, arguing statistical zero-knowledge, but I need to think about it.

@ulrich-haboeck (Collaborator)

After another round of contemplation, I see the following problem with the current statistically zero-knowledge approach: we take several base field samples instead of a single one from the extension field. While this is a good approach for amplifying soundness, it is actually bad for statistical zero-knowledge. The verifier learns that not just for one, but for several (X,Y)-samples, all of the linear terms (X - x - w_i(x)*Y) are non-zero. Formally, this is reflected in an increasing statistical distance to the uniform distribution (over the space of all possible transcripts), blowing up from the simple fraction of bad points

<= total_num_witness_elements / |F|

to n times that amount (for n samples). In any case, this is a distance that is too large for small fields such as Goldilocks.
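
As a back-of-the-envelope illustration of this bound (the figures are hypothetical, not taken from a concrete circuit):

```rust
/// Illustrative only: the statistical-distance bound sketched above,
/// distance <= num_samples * total_num_witness_elements / |F|.
fn statistical_distance_bound(num_samples: u64, total_num_witness_elements: u64, field_size: f64) -> f64 {
    (num_samples as f64) * (total_num_witness_elements as f64) / field_size
}

fn main() {
    let goldilocks = 2f64.powi(64); // |F| is roughly 2^64 for Goldilocks
    // e.g. ~2^25 witness elements and 2 base-field samples give a bound around 2^-38,
    // while a single sample from the degree-2 extension (size ~2^128) gives around 2^-103.
    println!("{:e}", statistical_distance_bound(2, 1u64 << 25, goldilocks));
    println!("{:e}", statistical_distance_bound(1, 1u64 << 25, 2f64.powi(128)));
}
```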

That being said, I see only two options to remedy this issue: either we

  • drop the several-base-field-samples approach, or we
  • implement perfect zero-knowledge.

The latter is actually quite costly for the Plonk permutation argument (a side effect that I missed in my above elaboration in the single-column setting), practically doubling the number of second-round polynomials (in comparison to non-zk). Besides, in the world of hash-based proofs, perfect zk of the IOP is downgraded to statistical zk anyway.
For this reason I would personally opt for the first option, but it is not mine to decide @dlubarov @Nashtare @LindaGuiga @Al-Kindi-0.

@Al-Kindi-0

After another round of contemplation, I see the following problem with the current statistically zero-knowledge approach: we take several base field samples instead of a single one from the extension field. While this is a good approach for amplifying soundness, it is actually bad for statistical zero-knowledge. The verifier learns that not just for one, but for several (X,Y)-samples, all of the linear terms (X - x - w_i(x)*Y) are non-zero. Formally, this is reflected in an increasing statistical distance to the uniform distribution (over the space of all possible transcripts), blowing up from the simple fraction of bad points

<= total_num_witness_elements / |F|

to n times that amount (for n samples). In any case, this is a distance that is too large for small fields such as Goldilocks.

Fully agree, the case of challenges from the base field is worse from the statistical distance point of view. This gets worse the larger the witness size gets.

That being said, I see only two options to remedy this issue: either we

* drop the several-base-field-samples approach, or we

* implement perfect zero-knowledge.

The latter is actually quite costly for the Plonk permutation argument (a side effect that I missed in my above elaboration in the single-column setting), practically doubling the number of second-round polynomials (in comparison to non-zk). Besides, in the world of hash-based proofs, perfect zk of the IOP is downgraded to statistical zk anyway. For this reason I would personally opt for the first option, but it is not mine to decide @dlubarov @Nashtare @LindaGuiga @Al-Kindi-0.

Just to clarify, the doubling of the number of polynomials in the second round is due to the increase in the degree, right?

My proposal is to go with statistical zero-knowledge but give an explicit bound on the statistical distance.

@ulrich-haboeck (Collaborator)

@Al-Kindi-0 exactly, the doubling of second round polys is due to the increased degree of the constraints.

@Nashtare removed the request for review from muursh November 6, 2024 15:23