Do multiplicity witgen in separate stage #2319

georgwiese · 2025-01-08T19:53:58Z

While working on #2306, @Schaeff came across several bugs in multiplicity witness generation. These were undetected, because we ignored multiplicities in the mock prover, which will be fixed by #2310. With this PR, #2310 will be green.

The issue was that counting multiplicities inside Machine::process_plookup() fails if the caller actually discards the result. This happens in a few places, for example during our loop optimization in the "dynamic machine".

With this PR, we instead have a centralized MultiplicityColumnGenerator that counts multiplicities after the fact, by going over each lookup, evaluating the two selected tuples on all rows, and counting how often each element in the LHS appears in the RHS.

To measure the runtime of this, I ran:

export TEST=keccak
export POWDR_JIT_OPT_LEVEL=0

cargo run -r --bin powdr-rs compile riscv/tests/riscv_data/$TEST -o output --max-degree-log 18
cargo run -r --features plonky3,halo2 pil output/$TEST.asm -o output -f --field gl --linker-mode bus

I get the following profile on the server:

 == Witgen profile (2554126 events)
   32.4% (    2.6s): Secondary machine 0: main_binary (BlockMachine)
   23.1% (    1.9s): Main machine (Dynamic)
   12.7% (    1.0s): Secondary machine 4: main_regs (DoubleSortedWitnesses32)
   10.0% ( 809.9ms): FixedLookup
    7.7% ( 621.1ms): Secondary machine 5: main_shift (BlockMachine)
    5.6% ( 454.6ms): Secondary machine 2: main_poseidon_gl (BlockMachine)
    3.8% ( 312.3ms): multiplicity witgen
    3.8% ( 308.2ms): witgen (outer code)
    0.6% (  45.3ms): Secondary machine 1: main_memory (DoubleSortedWitnesses32)
    0.4% (  33.4ms): Secondary machine 6: main_split_gl (BlockMachine)
    0.0% (   8.0µs): Secondary machine 3: main_publics (WriteOnceMemory)
  ---------------------------
    ==> Total: 8.114630092s

So the cost is ~4%. I'm sure it can be optimized further but I would like to leave this to a future PR.

georgwiese · 2025-01-08T20:22:40Z

executor/src/witgen/mod.rs

@@ -418,7 +546,7 @@ impl<'a, T: FieldElement> FixedData<'a, T> {
                .definitions
                .iter()
                .filter(|(_, (symbol, _))| matches!(symbol.kind, SymbolKind::Poly(_)))
-                .map(|(name, (symbol, _))| (name.clone(), symbol.into()))
+                .flat_map(|(_, (symbol, _))| symbol.array_elements())


This was a bug IMO, didn't expand array elements.

Maybe Symbol should not implement this Into, or at least change to TryInto

Agree, but this can be a different PR. I just checked and it is only used several times in Analyzed::remove_definitions.

georgwiese · 2025-01-09T03:19:45Z

executor/src/witgen/multiplicity_column_generator.rs

+                self.fixed
+                    .fixed_cols
+                    .iter()
+                    // TODO: Avoid clone


I think this should not be too hard, OwnedTerminalValues could do everything it does with a reference, but that's a different PR.

Schaeff · 2025-01-09T10:16:48Z

executor-utils/src/expression_evaluator.rs

    }

    pub fn row(&self, row: usize) -> RowValues<F> {
        RowValues { values: self, row }
    }
+
+    pub fn destroy(self) -> BTreeMap<PolyID, Vec<F>> {


into_trace? into_inner?

Schaeff · 2025-01-09T10:22:59Z

executor/src/witgen/mod.rs

@@ -418,7 +383,8 @@ impl<'a, T: FieldElement> FixedData<'a, T> {
                .definitions
                .iter()
                .filter(|(_, (symbol, _))| matches!(symbol.kind, SymbolKind::Poly(_)))
-                .map(|(name, (symbol, _))| (name.clone(), symbol.into()))
+                .flat_map(|(_, (symbol, _))| symbol.array_elements())
+                // .map(|(name, (symbol, _))| (name.clone(), symbol.into()))


Leftover commented out code

executor/src/witgen/multiplicity_column_generator.rs

Schaeff · 2025-01-09T10:35:55Z

executor/src/witgen/multiplicity_column_generator.rs

+            let indices = lhs_tuples
+                .into_par_iter()
+                .map(|(_, tuple)| index[&tuple])
+                .collect::<Vec<_>>();


Do we have to collect here?

I think so, because this is a parallel iterator, which is not a proper iterator.

executor/src/witgen/multiplicity_column_generator.rs

Schaeff

Some small comments but looks really good and gets rid of a lot of sub optimal code. I think you mentioned multiplicities being computed twice because they are sometimes needed early, why did that change?

georgwiese · 2025-01-09T14:19:56Z

Thanks for the quick review! I applied your suggestions.

I think you mentioned multiplicities being computed twice because they are sometimes needed early, why did that change?

So I think rn, this doesn't happen, because the code so far only applies to phantom lookups, and we never use the multiplicity for anything else besides the lookup.

I think that will change once we use the bus natively. It would work something like this:

For each bus receive:
- Build the index (same as the lookup RHS now)
- Find all matching sends (there might be several). For each:
  - Collect all tuples (same as lookup LHS now)
  - Update multiplicities

But we use bus interactions also to express permutation arguments, for example, but in that case the multiplicity is what's now the RHS selector, which we do use otherwise sometimes (e.g. to find out if a block is an actual block or just fulling up rows). So I think that it could be that we end up computing the multiplicity twice.

@Schaeff

While working on #2306, @Schaeff came across several bugs in multiplicity witness generation. These were undetected, because we ignored multiplicities in the mock prover, which will be fixed by #2310. With this PR, #2310 will be green. The issue was that counting multiplicities inside `Machine::process_plookup()` fails if the caller actually discards the result. This happens in a few places, for example during our loop optimization in the "dynamic machine". With this PR, we instead have a centralized `MultiplicityColumnGenerator` that counts multiplicities after the fact, by going over each lookup, evaluating the two selected tuples on all rows, and counting how often each element in the LHS appears in the RHS. To measure the runtime of this, I ran: ```sh export TEST=keccak export POWDR_JIT_OPT_LEVEL=0 cargo run -r --bin powdr-rs compile riscv/tests/riscv_data/$TEST -o output --max-degree-log 18 cargo run -r --features plonky3,halo2 pil output/$TEST.asm -o output -f --field gl --linker-mode bus ``` I get the following profile on the server: ``` == Witgen profile (2554126 events) 32.4% ( 2.6s): Secondary machine 0: main_binary (BlockMachine) 23.1% ( 1.9s): Main machine (Dynamic) 12.7% ( 1.0s): Secondary machine 4: main_regs (DoubleSortedWitnesses32) 10.0% ( 809.9ms): FixedLookup 7.7% ( 621.1ms): Secondary machine 5: main_shift (BlockMachine) 5.6% ( 454.6ms): Secondary machine 2: main_poseidon_gl (BlockMachine) 3.8% ( 312.3ms): multiplicity witgen 3.8% ( 308.2ms): witgen (outer code) 0.6% ( 45.3ms): Secondary machine 1: main_memory (DoubleSortedWitnesses32) 0.4% ( 33.4ms): Secondary machine 6: main_split_gl (BlockMachine) 0.0% ( 8.0µs): Secondary machine 3: main_publics (WriteOnceMemory) --------------------------- ==> Total: 8.114630092s ``` So the cost is ~4%. I'm sure it can be optimized further but I would like to leave this to a future PR.

This is not necessary anymore since #2319: We previously had a multiplicity witgen pass only for range constraints via lookups. Now we have it for all phantom lookups, which includes range constraints via lookups.

georgwiese changed the base branch from main to check-multiplicities-in-mock-prover January 8, 2025 19:54

georgwiese commented Jan 8, 2025

View reviewed changes

Do multiplicity witgen in separate stage

4b0e9d1

georgwiese force-pushed the fix-multiplicity-witgen2 branch from a3f43b3 to 4b0e9d1 Compare January 9, 2025 02:48

georgwiese changed the base branch from check-multiplicities-in-mock-prover to main January 9, 2025 02:49

georgwiese changed the title ~~Fix multiplicity witgen~~ Do multiplicity witgen in separate stage Jan 9, 2025

Bug fix

bcb068c

georgwiese commented Jan 9, 2025

View reviewed changes

georgwiese marked this pull request as ready for review January 9, 2025 03:22

georgwiese added 3 commits January 9, 2025 04:26

Clippy

aee1576

BTreeMap -> HashMap => slight performance increase

4182d8a

Parallelize index lookup => 30% faster

b352e15

Schaeff reviewed Jan 9, 2025

View reviewed changes

executor/src/witgen/multiplicity_column_generator.rs Show resolved Hide resolved

Schaeff reviewed Jan 9, 2025

View reviewed changes

executor/src/witgen/multiplicity_column_generator.rs Show resolved Hide resolved

Schaeff reviewed Jan 9, 2025

View reviewed changes

Review feedback

cb35701

georgwiese requested a review from Schaeff January 9, 2025 14:11

Schaeff approved these changes Jan 9, 2025

View reviewed changes

Schaeff added this pull request to the merge queue Jan 9, 2025

Merged via the queue into main with commit e328eb9 Jan 9, 2025
16 checks passed

Schaeff deleted the fix-multiplicity-witgen2 branch January 9, 2025 15:27

georgwiese mentioned this pull request Jan 17, 2025

Remove tracking of range constraint multiplicities #2356

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do multiplicity witgen in separate stage #2319

Do multiplicity witgen in separate stage #2319

georgwiese commented Jan 8, 2025 •

edited

Loading

georgwiese Jan 8, 2025

Schaeff Jan 9, 2025

georgwiese Jan 9, 2025

georgwiese Jan 9, 2025 •

edited

Loading

Schaeff Jan 9, 2025

Schaeff Jan 9, 2025

Schaeff Jan 9, 2025

georgwiese Jan 9, 2025

Schaeff left a comment

georgwiese commented Jan 9, 2025

Do multiplicity witgen in separate stage #2319

Do multiplicity witgen in separate stage #2319

Conversation

georgwiese commented Jan 8, 2025 • edited Loading

georgwiese Jan 8, 2025

Choose a reason for hiding this comment

Schaeff Jan 9, 2025

Choose a reason for hiding this comment

georgwiese Jan 9, 2025

Choose a reason for hiding this comment

georgwiese Jan 9, 2025 • edited Loading

Choose a reason for hiding this comment

Schaeff Jan 9, 2025

Choose a reason for hiding this comment

Schaeff Jan 9, 2025

Choose a reason for hiding this comment

Schaeff Jan 9, 2025

Choose a reason for hiding this comment

georgwiese Jan 9, 2025

Choose a reason for hiding this comment

Schaeff left a comment

Choose a reason for hiding this comment

georgwiese commented Jan 9, 2025

georgwiese commented Jan 8, 2025 •

edited

Loading

georgwiese Jan 9, 2025 •

edited

Loading