Skip to content

Commit

Permalink
Merge branch 'master' into mips/keccak/proof-test
Browse files Browse the repository at this point in the history
  • Loading branch information
dannywillems committed Jan 19, 2024
2 parents f8f9d12 + 8788f9d commit 356493a
Show file tree
Hide file tree
Showing 33 changed files with 4,901 additions and 624 deletions.
829 changes: 392 additions & 437 deletions Cargo.lock

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ members = [
"groupmap",
"hasher",
"kimchi",
"kimchi/snarky-deriver/",
"optimism",
"poseidon",
"poseidon/export_test_vectors",
Expand Down
73 changes: 73 additions & 0 deletions book/src/snarky/booleans.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
# Booleans

Booleans are a good example of a [snarky variable](./vars.md#snarky-vars).

```rust
pub struct Boolean<F: PrimeField>(FieldVar<F>);

impl<F> SnarkyType<F> for Boolean<F>
where
F: PrimeField,
{
type Auxiliary = ();

type OutOfCircuit = bool;

const SIZE_IN_FIELD_ELEMENTS: usize = 1;

fn to_cvars(&self) -> (Vec<FieldVar<F>>, Self::Auxiliary) {
(vec![self.0.clone()], ())
}

fn from_cvars_unsafe(cvars: Vec<FieldVar<F>>, _aux: Self::Auxiliary) -> Self {
assert_eq!(cvars.len(), Self::SIZE_IN_FIELD_ELEMENTS);
Self(cvars[0].clone())
}

fn check(&self, cs: &mut RunState<F>) {
// TODO: annotation?
cs.assert_(Some("boolean check"), vec![BasicSnarkyConstraint::Boolean(self.0.clone())]);
}

fn deserialize(&self) -> (Self::OutOfCircuit, Self::Auxiliary) {
todo!()
}

fn serialize(out_of_circuit: Self::OutOfCircuit, aux: Self::Auxiliary) -> Self {
todo!()
}

fn constraint_system_auxiliary() -> Self::Auxiliary {
todo!()
}

fn value_to_field_elements(x: &Self::OutOfCircuit) -> (Vec<F>, Self::Auxiliary) {
todo!()
}

fn value_of_field_elements(x: (Vec<F>, Self::Auxiliary)) -> Self::OutOfCircuit {
todo!()
}
}
```

## Check

The `check()` function is simply constraining the `FieldVar` $x$ to be either $0$ or $1$ using the following constraint:

$$x ( x - 1) = 0$$

It is trivial to use the [double generic gate](../specs/kimchi.md#double-generic-gate) for this.

## And

$$x \land y = x \times y$$

## Not

$$\sim x = 1 - x$$

## Or

* $\sim x \land \sim y = b$
* $x \lor y = \sim b$
29 changes: 29 additions & 0 deletions book/src/snarky/circuit-generation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# Circuit generation

In circuit generation mode, the `has_witness` field of `RunState` is set to the default `CircuitGeneration`, and the program of the user is ran to completion.

During the execution, the different snarky functions called on `RunState` will create [internal variables](./vars.md) as well as constraints.

## Creation of variables

[Variables](./vars.md) can be created via the `compute()` function, which takes two arguments:

* A `TypeCreation` toggle, which is either set to `Checked` or `Unsafe`. We will describe this below.
* A closure representing the actual computation behind the variable. This computation will only take place when real values are computed, and can be non-deterministic (e.g. random, or external values provided by the user). Note that the closure takes one argument: a `WitnessGeneration`, a structure that allows you to read the runtime values of any variables that were previously created in your program.

The `compute()` function also needs a type hint to understand what type of [snarky type](./vars.md#snarky-vars) it is creating.

It then performs the following steps:

* creates enough [`FieldVar`](./vars#circuit-vars) to hold the value to be created
* retrieves the auxiliary data needed to create the snarky type (TODO: explain auxiliary data) and create the [`snarky variable`](./vars.md#snarky-vars) out of the `FieldVar`s and the auxiliary data
* if the `TypeCreation` is set to `Checked`, call the `check()` function on the snarky type (which will constrain the value created), if it is set to `Unsafe` do nothing (in which case we're trusting that the value cannot be malformed, this is mostly used internally and it is highly-likely that users directly making use of `Unsafe` are writing bugs)

```admonish
At this point we only created variables to hold future values, and made sure that they are constrained.
The actual values will fill the room created by the `FieldVar` only during the [witness generation](./witness-generation.md).
```

## Constraints

All other functions exposed by the API are basically here to operate on variables and create constraints in doing so.
271 changes: 271 additions & 0 deletions book/src/snarky/kimchi-backend.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,271 @@
# Kimchi Backend

![](https://i.imgur.com/KmKU5Pl.jpg)

Underneath the snarky wrapper (in `snarky/checked_runner.rs`) lies what we used to call the `plonk_constraint_system` or `kimchi_backend` in `snarky/constraint_systen.rs`.

```admonish
It is good to note that we're planning on removing this abstract separation between the snarky wrapper and the constraint system.
```

The logic in the kimchi backend serves two purposes:

* **Circuit generation**. It is the logic that adds gates to our list of gates (representing the circuit). For most of these gates, the variables used are passed to the backend by the snarky wrapper, but some of them are created by the backend itself (see more in the [variables section](#variables)).
* **Witness generation**. It is the logic that creates the witness

One can also perform two additional operations once the constraint system has been compiled:

* Generate the prover and verifier index for the system.
* Get a hash of the constraint system (this includes the circuit, the number of public input) (TODO: verify that this is true) (TODO: what else should be in that hash? a version of snarky and a version of kimchi?).

## A circuit

A circuit is either being built, or has been contructed during a circuit generation phase:

```rust
enum Circuit<F>
where
F: PrimeField,
{
/** A circuit still being written. */
Unfinalized(Vec<GateSpec<(), F>>),
/** Once finalized, a circuit is represented as a digest
and a list of gates that corresponds to the circuit.
*/
Compiled([u8; 32], Vec<CircuitGate<F>>),
}
```

## State

The state of the kimchi backend looks like this:

```rust
where
Field: PrimeField,
{
/// A counter used to track variables
/// (similar to the one in the snarky wrapper)
next_internal_var: usize,

/// Instruction on how to compute each internal variable
/// (as a linear combination of other variables).
/// Used during witness generation.
internal_vars: HashMap<InternalVar, (Vec<(Field, V)>, Option<Field>)>,

/// The symbolic execution trace table.
/// Each cell is a variable that takes a value during witness generation.
/// (if not set, it will take the value 0).
rows: Vec<Vec<Option<V>>>,

/// The circuit once compiled
gates: Circuit<Field>,

/// The row to use the next time we add a constraint.
// TODO: I think we can delete this
next_row: usize,

/// The size of the public input
/// (which fills the first rows of our constraint system.
public_input_size: Option<usize>,

// omitted values...
}
```

## Variables

In the backend, there's two types of variables:

```rust
enum V {
/// An external variable
/// (generated by snarky, via [exists]).
External(usize),

/// An internal variable is generated to hold an intermediate value,
/// (e.g. in reducing linear combinations to single PLONK positions).
Internal(InternalVar),
}
```

Internal variables are basically a `usize` pointing to a hashmap in the state.

That hashmap tells you how to compute the internal variable during witness generation: it is always a linear combination of other variables (and a constant).

## Circuit generation

During circuit generation, the snarky wrapper will make calls to the `add_constraint()` or `add_basic_snarky_constraint` function of the kimchi backend, specifying what gate to use and what variables to use in that gate.

At this point, the snarky wrapper might have some variables that are not yet tracked as such (with a counter).
Rather, they are constants, or they are a combination of other variables.
You can see that as a small AST representing how to compute a variable.
(See the [variables section](./vars.md#circuit-vars) for more details).

For this reason, they can hide a number of operations that haven't been constrained yet.
It is the role of the `add_constrain` logic to enforce that at this point constants, as well as linear combinations or scalings of variables, are encoded in the circuit.
This is done by adding enough generic gates (using the `reduce_lincom()` or `reduce_to_var()` functions).

```admonish
This is a remnant of an optimization targetting R1CS (in which additions are for free).
An issue with this approach is the following: imagine that two circuit variables are created from the same circuit variable, imagine also that the original circuit variable contained a long AST, then both variables might end up creating the same constraints to convert that AST.
Currently, snarkyjs and pickles expose a `seal()` function that allows you to reduce this issue, at the cost of some manual work and mental tracking on the developer.
We should probably get rid of this, while making sure that we can continue to optimize generic gates
(in some cases you can merge two generic gates in one (TODO: give an example of where that can happen)).
Another solution is to keep track of what was reduced, and reuse previous reductions (similar to how we handle constants).
```

It is during this "reducing" step that internal variables (known only to the kimchi backend) are created.

```admonish
The process is quite safe, as the kimchi backend cannot use the snarky wrapper variables directly (which are of type `FieldVar`).
Since the expected format (see the [variables section](#variables) is a number (of type `usize`), the only way to convert a non-tracked variable (constant, or scale, or linear combination) is to reduce it (and in the process constraining its value).
```

Depending on the gate being used, several constraints might be added via the `add_row()` function which does three things:

1. figure out if there's any wiring to be done
2. add a gate to our list of gates (representing the circuit)
3. add the variables to our _symbolic_ execution trace table (symbolic in the sense that nothing has values yet)

This process happens as the circuit is "parsed" and the constraint functions of the kimchi backend are called.

This does not lead to a finalized circuit, see the next section to see how that is done.

(TODO: ideally this should happen in the same step)

## Finalization of the circuit.

So far we've only talked about adding specific constraints to the circuit, but not about how public input are handled.

The `finalization()` function of the kimchi backend does the following:

* add as many generic rows as there are public inputs.
* construct the permutation
* computes a cache of the circuit (TODO: this is so unecessary)
* and other things that are not that important

## Witness generation

Witness generation happens by taking the finalized state (in the `compute_witness()` function) with a callback that can be used to retrieve the values of external variables (public input and public output).

The algorithm follows these steps using the symbolic execution table we built during circuit generation:

1. it initializes the execution trace table with zeros
2. go through the rows related to the public input and set the most-left column values to the ones obtained by the callback.
3. go through the other rows and compute the value of the variables left in the table

Variables in step 3. should either:

* be absent (`None`) and evaluated to the default value 0
* point to an external variable, in which case the closure passed can be used to retrieve the value
* be an internal variable, in which case the value is computed by evaluating the AST that was used to create it.

## Permutation

The permutation is used to wire cells of the execution trace table (specifically, cells belonging to the first 7 columns).
It is also known as "copy constraints".

```admonish
In snarky, the permutation is represented differently from kimchi, and thus needs to be converted to the kimchi's format before a proof can be created.
TODO: merge the representations
```

We use the permutation in ingenious ways to optimize circuits.
For example, we use it to encode each constants once, and wire it to places where it is used.
Another example, is that we use it to assert equality between two cells.

## Implementation details

There's two aspect of the implementation of the permutation, the first one is a hashmap of equivalence classes, which is used to track all the positions of a variable, the second one is making use of a [union find]() data structure to link variables that are equivalent (we'll talk about that after).

The two data structures are in the kimchi backend's state:

```rust
pub struct SnarkyConstraintSystem<Field>
where
Field: PrimeField,
{
equivalence_classes: HashMap<V, Vec<Position<Row>>>,
union_finds: disjoint_set::DisjointSet<V>,
// omitted fields...
}
```

### equivalence classes

As said previously, during circuit generation a symbolic execution trace table is created. It should look a bit like this (if there were only 3 columns and 4 rows):

| | 0 | 1 | 2 |
| :-: | :-: | :-: | :-:|
| 0 | v1 | v1 | |
| 1 | | v2 | |
| 2 | | v2 | |
| 3 | | | v1 |

From that, it should be clear that all the cells containing the variable `v1` should be connected,
and all the cells containing the variable `v2` should be as well.

The format that the permutation expects is a [cycle](https://en.wikipedia.org/wiki/Cyclic_permutation): a list of cells where each cell is linked to the next, the last one wrapping around and linking to the first one.

For example, a cycle for the `v1` variable could be:

```
(0, 0) -> (0, 1)
(0, 1) -> (3, 2)
(3, 2) -> (0, 0)
```

During circuit generation, a hashmap (called `equivalence_classes`) is used to track all the positions (row and column) of each variable.

During finalization, all the different cycles are created by looking at all the variables existing in the hashmap.

### Union finds

Sometimes, we know that two variables will have equivalent values due to an `assert_equal()` being called to link them.
Since we link two variables together, they need to be part of the same cycle, and as such we need to be able to detect that to construct correct cycles.

To do this, we use a [union find]() data structure, which allows us to easily find the unions of equivalent variables.

When an `assert_equal()` is called, we link the two variables together using the `union_finds` data structure.

During finalization, when we create the cycles, we use the `union_finds` data structure to find the equivalent variables.
We then create a new equivalence classes hashmap to merge the keys (variables) that are in the same set.
This is done before using the equivalence classes hashmap to construct the cycles.

## Validation during witness generation

We want to make sure that the program runs to completion, with the given input, before we start computing the proof (which is expensive).

On top of that, we also want to tell the user what line in their circuit has rejected their input. Like a normal program.

This is configurable, via a `eval_constraints` boolean in the state, and is set to `true` by default.

To perform this validation, we do two things:

**Constants**. In functions that assert, we always check if we're dealing with constants first. If we are dealing with constants, we don't create any constraints, and we directly check if the computation is correct. For example:

```rust
pub fn assert_equals(
&self,
state: &mut RunState<F>,
other: &FieldVar<F>,
) -> SnarkyResult<()> {
match (self, other) {
(FieldVar::Constant(x), FieldVar::Constant(y)) => {
if x == y {
Ok(())
} else {
Err(/* ... */)
}
}
() => state.add_constraint(/* ... */)
}
}
```

**Runtime values**. During witness generation, actual values are passed to the circuit and can also not pass an "asserting" function. In these cases, we directly check the computation via the gates.

What does it mean? For example, when we add a generic constraint during witness generation (which are backing a number of asserting functions), we check that the actual values satisfy the generic gate equation.

Note that not all gates are behind asserting functions, for example the Poseidon gate does not have "bad" inputs, and as such we don't do any checks on it. If for some reason the inputs are bad, it means that the user did this on purpose, and the proof will fail.
Loading

0 comments on commit 356493a

Please sign in to comment.