feat(integer): add count_ones/zeros #1503
df1bfcc to 1e2d237
Thanks, some questions
tfhe/src/integer/server_key/radix_parallel/tests_signed/test_count_zeros_ones.rs
// In 2_2, each block may have between 0 and 2 bits set.
// 2_2 also allows 5 additions maximum (noise wise)
// 2 * 5 = 10 which is less than the max value storable (15 = (2**4) - 1)
//
// Since in 2_2 bivariate PBS is possible, we can actually group blocks by two.
Probably not applicable, but this makes me think of the eq and ne implementations, as we will get a slower version there to be safe at the moment?
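To make the arithmetic in the quoted comment concrete, here is a plaintext-only sketch. It does not use tfhe-rs and performs no encryption; `to_blocks`, `count_ones_model`, and `pair_counts` are hypothetical names invented for illustration. It assumes "5 additions maximum" means summing at most 5 per-block counts at once, which matches the `2 * 5 = 10` in the comment.

```rust
// Plaintext model of the 2_2 reasoning. No encryption, no tfhe-rs calls.
const BITS_PER_BLOCK: u32 = 2; // message bits per block in 2_2
const MAX_TERMS: usize = 5; // assumption: "5 additions maximum" = 5 summed terms
const MAX_STORABLE: u64 = (1 << 4) - 1; // 2 message bits + 2 carry bits => 15

/// Split a value into little-endian 2-bit blocks.
fn to_blocks(mut value: u64, num_blocks: usize) -> Vec<u64> {
    let mask = (1u64 << BITS_PER_BLOCK) - 1;
    (0..num_blocks)
        .map(|_| {
            let block = value & mask;
            value >>= BITS_PER_BLOCK;
            block
        })
        .collect()
}

/// Map each block to its number of set bits (0..=2), then sum the counts
/// in chunks of at most MAX_TERMS so no partial sum exceeds MAX_STORABLE.
fn count_ones_model(blocks: &[u64]) -> u64 {
    let per_block: Vec<u64> = blocks.iter().map(|b| b.count_ones() as u64).collect();
    let mut total = 0u64;
    for chunk in per_block.chunks(MAX_TERMS) {
        let partial: u64 = chunk.iter().sum();
        // 2 * 5 = 10, which fits below 15.
        assert!(partial <= MAX_STORABLE);
        total += partial;
    }
    total
}

/// Model of "grouping blocks by two": one lookup over a pair of blocks
/// yields a count in 0..=4, halving the number of values left to sum.
fn pair_counts(blocks: &[u64]) -> Vec<u64> {
    blocks
        .chunks(2)
        .map(|pair| pair.iter().map(|b| b.count_ones() as u64).sum())
        .collect()
}
```

For example, `count_ones_model(&to_blocks(0xFF, 4))` sums four blocks that each contribute 2. How many pair counts the real implementation sums at once is not stated in the quoted comment, so `pair_counts` stops before the summation step.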
/// * ct must not have any carries
/// * The returned result has enough blocks to encrypt 32bits (e.g. 1_1 parameters -> 32 blocks,
/// 3_3 parameters -> 11 blocks == 33 bits)
fn count_bits_2_2<T>(&self, ct: &T, count_kind: BitCountKind) -> RadixCiphertext
What are the perf gains of this variant?
As said in the commit, there is none on small sizes (< 64 bits) and it is rather small at 64 bits (iirc naive 64 bits is ~135 ms, while non-naive is ~115 ms). Where it gets really interesting is for really big sizes like 12800 bits, where we go from 1.8 s for naive to 1.1 s for non-naive, likely because the reduction in the number of PBS starts to be noticeable and there are fewer things to sum.
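The doc comment on `count_bits_2_2` says the result has enough blocks to hold 32 bits. A minimal sketch of that block-count arithmetic follows, assuming it is a plain ceiling division; `blocks_for_bits` is a hypothetical helper, not a tfhe-rs function.

```rust
/// Number of radix blocks needed to hold `target_bits` bits when each block
/// carries `message_bits` bits of message (ceiling division).
fn blocks_for_bits(target_bits: u32, message_bits: u32) -> u32 {
    assert!(message_bits > 0, "a block must hold at least one message bit");
    (target_bits + message_bits - 1) / message_bits
}
```

With this, 1_1 parameters give `blocks_for_bits(32, 1) == 32` and 3_3 parameters give `blocks_for_bits(32, 3) == 11`, i.e. 33 bits of capacity, matching the doc comment. The same arithmetic relates the two sizes quoted in this thread: 12800 bits in 2_2 is 6400 blocks.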
1e2d237 to 649673b
The non-naive version made for 2_2 parameters only brings slight improvements (10-15%) for some small sizes (64, 128, 256 bits) but reduces the number of PBS. The place where it brings the best improvements is for very large numbers (e.g. 6400 blocks: 1.8 s for naive, 1.1 s for non-naive).
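As a quick sanity check on the timings quoted in this thread, the relative gain can be computed directly. This is only arithmetic on the reported numbers, not a benchmark; `relative_gain` is a hypothetical helper.

```rust
/// Fraction of the naive running time saved by the optimized variant.
fn relative_gain(naive: f64, optimized: f64) -> f64 {
    (naive - optimized) / naive
}
```

Using the figures above, `relative_gain(135.0, 115.0)` is about 0.148 for 64 bits, which sits inside the quoted 10-15% range, and `relative_gain(1.8, 1.1)` is about 0.389 for the 6400-block case.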
649673b to 62858b0
Changes since last time are just a rebase + small comment addressed + use of the
The non-naive version made for 2_2 parameters only brings slight improvements (10-15%) for some small sizes (64, 128, 256 bits) but reduces the number of PBS. The place where it brings the best improvements is for very large numbers (e.g. 6400 blocks: 1.8 s for naive, 1.1 s for non-naive).
fixes https://github.com/zama-ai/tfhe-rs-internal/issues/638