-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do multiplicity witgen in separate stage #2319
Conversation
@@ -418,7 +546,7 @@ impl<'a, T: FieldElement> FixedData<'a, T> { | |||
.definitions | |||
.iter() | |||
.filter(|(_, (symbol, _))| matches!(symbol.kind, SymbolKind::Poly(_))) | |||
.map(|(name, (symbol, _))| (name.clone(), symbol.into())) | |||
.flat_map(|(_, (symbol, _))| symbol.array_elements()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bug IMO, didn't expand array elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe Symbol should not implement this Into, or at least change to TryInto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, but this can be a different PR. I just checked and it is only used several times in Analyzed::remove_definitions
.
a3f43b3
to
4b0e9d1
Compare
self.fixed | ||
.fixed_cols | ||
.iter() | ||
// TODO: Avoid clone |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should not be too hard, OwnedTerminalValues
could do everything it does with a reference, but that's a different PR.
} | ||
|
||
pub fn row(&self, row: usize) -> RowValues<F> { | ||
RowValues { values: self, row } | ||
} | ||
|
||
pub fn destroy(self) -> BTreeMap<PolyID, Vec<F>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
into_trace? into_inner?
executor/src/witgen/mod.rs
Outdated
@@ -418,7 +383,8 @@ impl<'a, T: FieldElement> FixedData<'a, T> { | |||
.definitions | |||
.iter() | |||
.filter(|(_, (symbol, _))| matches!(symbol.kind, SymbolKind::Poly(_))) | |||
.map(|(name, (symbol, _))| (name.clone(), symbol.into())) | |||
.flat_map(|(_, (symbol, _))| symbol.array_elements()) | |||
// .map(|(name, (symbol, _))| (name.clone(), symbol.into())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leftover commented out code
let indices = lhs_tuples | ||
.into_par_iter() | ||
.map(|(_, tuple)| index[&tuple]) | ||
.collect::<Vec<_>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to collect here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, because this is a parallel iterator, which is not a proper iterator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small comments but looks really good and gets rid of a lot of sub optimal code. I think you mentioned multiplicities being computed twice because they are sometimes needed early, why did that change?
Thanks for the quick review! I applied your suggestions.
So I think rn, this doesn't happen, because the code so far only applies to phantom lookups, and we never use the multiplicity for anything else besides the lookup. I think that will change once we use the bus natively. It would work something like this:
But we use bus interactions also to express permutation arguments, for example, but in that case the multiplicity is what's now the RHS selector, which we do use otherwise sometimes (e.g. to find out if a block is an actual block or just fulling up rows). So I think that it could be that we end up computing the multiplicity twice. |
While working on #2306, @Schaeff came across several bugs in multiplicity witness generation. These were undetected, because we ignored multiplicities in the mock prover, which will be fixed by #2310. With this PR, #2310 will be green. The issue was that counting multiplicities inside `Machine::process_plookup()` fails if the caller actually discards the result. This happens in a few places, for example during our loop optimization in the "dynamic machine". With this PR, we instead have a centralized `MultiplicityColumnGenerator` that counts multiplicities after the fact, by going over each lookup, evaluating the two selected tuples on all rows, and counting how often each element in the LHS appears in the RHS. To measure the runtime of this, I ran: ```sh export TEST=keccak export POWDR_JIT_OPT_LEVEL=0 cargo run -r --bin powdr-rs compile riscv/tests/riscv_data/$TEST -o output --max-degree-log 18 cargo run -r --features plonky3,halo2 pil output/$TEST.asm -o output -f --field gl --linker-mode bus ``` I get the following profile on the server: ``` == Witgen profile (2554126 events) 32.4% ( 2.6s): Secondary machine 0: main_binary (BlockMachine) 23.1% ( 1.9s): Main machine (Dynamic) 12.7% ( 1.0s): Secondary machine 4: main_regs (DoubleSortedWitnesses32) 10.0% ( 809.9ms): FixedLookup 7.7% ( 621.1ms): Secondary machine 5: main_shift (BlockMachine) 5.6% ( 454.6ms): Secondary machine 2: main_poseidon_gl (BlockMachine) 3.8% ( 312.3ms): multiplicity witgen 3.8% ( 308.2ms): witgen (outer code) 0.6% ( 45.3ms): Secondary machine 1: main_memory (DoubleSortedWitnesses32) 0.4% ( 33.4ms): Secondary machine 6: main_split_gl (BlockMachine) 0.0% ( 8.0µs): Secondary machine 3: main_publics (WriteOnceMemory) --------------------------- ==> Total: 8.114630092s ``` So the cost is ~4%. I'm sure it can be optimized further but I would like to leave this to a future PR.
This is not necessary anymore since #2319: We previously had a multiplicity witgen pass only for range constraints via lookups. Now we have it for all phantom lookups, which includes range constraints via lookups.
While working on #2306, @Schaeff came across several bugs in multiplicity witness generation. These were undetected, because we ignored multiplicities in the mock prover, which will be fixed by #2310. With this PR, #2310 will be green.
The issue was that counting multiplicities inside
Machine::process_plookup()
fails if the caller actually discards the result. This happens in a few places, for example during our loop optimization in the "dynamic machine".With this PR, we instead have a centralized
MultiplicityColumnGenerator
that counts multiplicities after the fact, by going over each lookup, evaluating the two selected tuples on all rows, and counting how often each element in the LHS appears in the RHS.To measure the runtime of this, I ran:
I get the following profile on the server:
So the cost is ~4%. I'm sure it can be optimized further but I would like to leave this to a future PR.