Simd twiddles #820
base: spapini/09-05-parallel_fft
Warning: This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## spapini/09-05-parallel_fft   #820    +/- ##
==============================================================
- Coverage    92.80%   92.78%   -0.03%
==============================================================
  Files           89       89
  Lines        12103    12165      +62
  Branches     12103    12165      +62
==============================================================
+ Hits         11232    11287      +55
- Misses         764      771       +7
  Partials       107      107
2a4fec2 to b3b9ee0
Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions (waiting on @Alon-Ti, @shaharsamocha7, and @spapinistarkware)
crates/prover/src/core/backend/simd/circle.rs
line 282 at r1 (raw file):
#[allow(clippy::int_plus_one)]
fn precompute_twiddles(mut coset: Coset) -> TwiddleTree<Self> {
Please add a unit test to compare with the CPU implementation.
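A test of this kind could take roughly the following shape. This is a minimal sketch, not the project's test: square_mod_p, cpu_square, packed_square, matches_cpu, and N_LANES = 16 are hypothetical stand-ins for the CPU and SIMD precompute_twiddles paths, whose actual types and outputs are not shown in this thread. The idea is only to flatten the packed result and compare it element by element with the scalar reference.

```rust
use std::convert::TryInto;

/// M31 modulus, 2^31 - 1.
const P: u32 = (1 << 31) - 1;
/// Hypothetical lane count, assumed for this sketch only.
const N_LANES: usize = 16;

/// Squares one value modulo P (toy computation standing in for twiddle generation).
fn square_mod_p(x: u32) -> u32 {
    ((x as u64 * x as u64) % P as u64) as u32
}

/// Scalar reference path: one value at a time.
fn cpu_square(values: &[u32]) -> Vec<u32> {
    values.iter().map(|&x| square_mod_p(x)).collect()
}

/// Lane-packed stand-in: the same computation on groups of N_LANES values.
fn packed_square(values: &[[u32; N_LANES]]) -> Vec<[u32; N_LANES]> {
    values.iter().map(|&lanes| lanes.map(square_mod_p)).collect()
}

/// Returns true when the flattened packed result equals the scalar result.
fn matches_cpu(input: &[u32]) -> bool {
    assert!(input.len() % N_LANES == 0);
    // Pack the input into fixed-size lane groups.
    let packed: Vec<[u32; N_LANES]> = input
        .chunks_exact(N_LANES)
        .map(|chunk| chunk.try_into().unwrap())
        .collect();
    // Flatten the packed output back to scalars for comparison.
    let flat: Vec<u32> = packed_square(&packed).into_iter().flatten().collect();
    flat == cpu_square(input)
}
```

In a real test the two stand-in functions would be replaced by the CPU and SIMD backends' twiddle computations, run on the same coset.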
crates/prover/src/core/backend/simd/circle.rs
line 317 at r1 (raw file):
}
xs.push(PackedM31::from_array(extra.try_into().unwrap()));
Does this try_into work because it's initialized with capacity N_LANES?
Code quote:
xs.push(PackedM31::from_array(extra.try_into().unwrap()));
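On the question above: converting a Vec into a fixed-size array with try_into succeeds only when the vector's length equals the array length, and capacity plays no part. The sketch below illustrates this with a hypothetical N_LANES of 16 and u32 elements in place of the PR's types; to_lanes is an illustrative helper, not a function from the codebase.

```rust
use std::convert::TryInto;

/// Hypothetical lane count, assumed for this sketch only.
const N_LANES: usize = 16;

/// Converts a Vec into a fixed-size array.
/// Succeeds only when `v.len() == N_LANES`; on failure the Vec is handed back.
fn to_lanes(v: Vec<u32>) -> Result<[u32; N_LANES], Vec<u32>> {
    v.try_into()
}

/// Builds a Vec with capacity N_LANES but only `len` elements.
fn with_capacity_and_len(len: usize) -> Vec<u32> {
    let mut v = Vec::with_capacity(N_LANES);
    v.extend((0..len as u32).take(N_LANES));
    v
}
```

With these definitions, to_lanes(with_capacity_and_len(4)) returns Err even though the capacity is N_LANES, while to_lanes(with_capacity_and_len(N_LANES)) returns Ok. So the unwrap in the quoted line is safe only if exactly N_LANES elements have been pushed before the conversion.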
crates/prover/src/core/backend/simd/circle.rs
line 340 at r1 (raw file):
#[allow(clippy::int_plus_one)]
fn gen_coset_xs(coset: Coset, res: &mut Vec<PackedM31>) {
Can you add documentation? Maybe a reference to the algorithm.
crates/prover/src/core/backend/simd/circle.rs
line 344 at r1 (raw file):
assert!(log_size >= LOG_N_LANES);
let initial_points = std::array::from_fn(|i| coset.at(bit_reverse_index(i, log_size)));
Can this bit reverse be done for the whole coset at once?
Code quote:
let initial_points = std::array::from_fn(|i| coset.at(bit_reverse_index(i, log_size)));
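For comparison, the two approaches can be sketched side by side. This is an illustration under stated assumptions, not the project's code: the bit_reverse_index below is a local reimplementation with an assumed signature, and bit_reverse is a hypothetical helper that permutes a whole power-of-two-length slice in place.

```rust
/// Reverses the lowest `log_size` bits of `i` (local sketch, assumed signature).
fn bit_reverse_index(i: usize, log_size: u32) -> usize {
    if log_size == 0 {
        return i;
    }
    i.reverse_bits() >> (usize::BITS - log_size)
}

/// Permutes a power-of-two-length slice into bit-reversed order in place.
fn bit_reverse<T>(v: &mut [T]) {
    let n = v.len();
    assert!(n.is_power_of_two());
    let log_n = n.trailing_zeros();
    for i in 0..n {
        let j = bit_reverse_index(i, log_n);
        // Swap each pair once; indices that map to themselves stay put.
        if i < j {
            v.swap(i, j);
        }
    }
}

/// Per-index variant: looks up only the first N bit-reversed positions.
fn first_reversed<const N: usize>(log_size: u32) -> [usize; N] {
    std::array::from_fn(|i| bit_reverse_index(i, log_size))
}
```

The whole-slice form needs every point materialized before permuting, whereas the per-index form, like the quoted line, evaluates only as many points as the array holds. For a size-8 domain both agree on the leading entries: the full permutation of 0..8 is [0, 4, 2, 6, 1, 5, 3, 7], and first_reversed::<4>(3) gives [0, 4, 2, 6].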