TFHE-rs v0.8.0
Summary
TFHE-rs v0.8.0 includes several enhancements and new features. Here are the highlights:
- Array types: Simplify working with vectors and tensors of integer ciphertexts.
- CPU algorithm optimizations: Integer algorithms have been optimized; 64-bit multiplication is now 16% faster with the default parameter set.
- Single GPU performance improvement: Thanks to optimizations in the Programmable Bootstrap and the Fast Fourier Transform CUDA implementations, the performance has been improved by approximately 20%.
- Multi-GPU support improvement:
  - All Nvidia GPUs can now be used in the computations, including those connected with PCIe.
  - NVLink connections between GPUs are used for memory transfers when available.
- Default GPU parameters: It's no longer necessary to modify cryptographic parameters when using GPU acceleration with TFHE-rs.
- Compression and decompression on the GPU: Ciphertext compression and decompression are now supported on GPUs, along with new integer operations.
What's Changed
Breaking Changes
Warning
`safe_serialize_versioned`/`safe_deserialize_versioned` have been removed, and `safe_serialize`/`safe_deserialize` now add versioning to the serialized types. For more flexibility, you can use `SerializationConfig` and `DeserializationConfig`.
- The `CiphertextList` trait must be in scope to use the common methods of the `CompressedCiphertextList` and `CompactCiphertextListExpander`.
- With the addition of the tagging system for HL API structs, raw parts APIs have been updated to manage the new tag field on relevant structs.
- Expansion of `CompactCiphertextList` and `ProvenCompactCiphertextList` now takes a single `IntegerCompactCiphertextListExpansionMode` to manage keyswitching and applying lookup tables when required.
- The encrypted pseudo random generation API has changed.
- `tfhe-zk-pok` and `TFHE-rs` APIs now support custom metadata passed by users at encryption time.
New features
CPU
- Add array types
- Add a tag system to annotate structs with custom metadata
- Add versioning to the `KeySwitchingKey`
- Add missing raw parts APIs in the HL API
- Add is_even/is_odd
- Add ability to use safe serialization on key types
- Add random encrypted `FheBool` generation
- Add conformance to `ProvenCompactCiphertextList`
- Add key conformance
- Add integer bit slicing
- Add count zeros/ones
- ZK-POK: add ability to associate metadata to a proof
- Add ability to construct a `ClientKey` from a user provided secret encryption key in `shortint`
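To make the semantics of some of the new integer operations concrete (is_even/is_odd, count zeros/ones, bit slicing), here is a minimal plaintext sketch using only the Rust standard library. It is a reference model of what the results mean on clear values, not the TFHE-rs API: the helper names `is_even`, `is_odd`, and `bit_slice` are hypothetical, and the half-open `[start, end)` range convention is an assumption that may differ from the library's.

```rust
/// Plaintext reference for "is even": the lowest bit is 0.
fn is_even(x: u64) -> bool {
    x & 1 == 0
}

/// Plaintext reference for "is odd": the lowest bit is 1.
fn is_odd(x: u64) -> bool {
    x & 1 == 1
}

/// Plaintext reference for bit slicing (hypothetical helper).
/// Extracts the bits in the half-open range [start, end) and shifts them
/// down so that bit `start` becomes bit 0. Returns None for an empty or
/// out-of-bounds range.
fn bit_slice(x: u64, start: u32, end: u32) -> Option<u64> {
    if start >= end || end > u64::BITS {
        return None;
    }
    let width = end - start;
    // A full-width mask needs special handling: `1 << 64` would overflow.
    let mask = if width == u64::BITS {
        u64::MAX
    } else {
        (1u64 << width) - 1
    };
    Some((x >> start) & mask)
}

fn main() {
    let x: u64 = 0b1011_0110;
    println!("is_even     = {}", is_even(x));
    println!("is_odd      = {}", is_odd(x));
    // count_ones / count_zeros are standard library methods on integers.
    println!("count_ones  = {}", x.count_ones());
    println!("count_zeros = {}", x.count_zeros());
    println!("bits [1, 5) = {:?}", bit_slice(x, 1, 5));
}
```

In the encrypted setting the same results would be produced as ciphertexts, so the clear values above are only a way to check expectations after decryption.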
GPU
- Signed integer overflowing add
- Signed integer overflowing sub
- Signed integer overflowing scalar add
- Signed integer overflowing scalar sub
- Log2, trailing and leading zeros and ones
- Signed & unsigned integer is even / is odd
- Ciphertext compression
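The GPU operations listed above have direct plaintext counterparts in the Rust standard library, which can serve as a reference when checking decrypted results. The sketch below works on clear `i8`/`u8` values only; the wrapper names (`overflowing_add_ref`, `overflowing_sub_ref`, `bit_stats`, `BitStats`) are hypothetical and are not part of TFHE-rs.

```rust
/// Plaintext reference for signed overflowing add:
/// returns the wrapped result and whether an overflow occurred.
fn overflowing_add_ref(a: i8, b: i8) -> (i8, bool) {
    a.overflowing_add(b)
}

/// Plaintext reference for signed overflowing sub.
fn overflowing_sub_ref(a: i8, b: i8) -> (i8, bool) {
    a.overflowing_sub(b)
}

/// Plaintext reference values for log2 and leading/trailing zeros and ones.
#[derive(Debug, PartialEq)]
struct BitStats {
    log2: Option<u32>, // None when the input is 0 (log2 is undefined there)
    leading_zeros: u32,
    leading_ones: u32,
    trailing_zeros: u32,
    trailing_ones: u32,
}

fn bit_stats(x: u8) -> BitStats {
    BitStats {
        log2: x.checked_ilog2(),
        leading_zeros: x.leading_zeros(),
        leading_ones: x.leading_ones(),
        trailing_zeros: x.trailing_zeros(),
        trailing_ones: x.trailing_ones(),
    }
}

fn main() {
    // 100 + 50 = 150 does not fit in an i8, so the result wraps.
    println!("{:?}", overflowing_add_ref(100, 50));
    // -100 - 50 = -150 does not fit either.
    println!("{:?}", overflowing_sub_ref(-100, 50));
    // No overflow: the flag is false and the result is exact.
    println!("{:?}", overflowing_add_ref(100, 27));
    println!("{:?}", bit_stats(0b0001_1000));
}
```

The scalar variants (overflowing scalar add/sub) follow the same semantics, with the right-hand operand being a clear value instead of a ciphertext.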
Improvements
CPU
- Improve carry propagation performance, which positively impacts add, sub, mul, div and comparisons
- Improve performance in some cases during `CompactCiphertextList` expansion
- Improve performance of non native modulus operations
- WASM: add ability to encrypt u{512, 1024, 2048} with a `CompactPublicKey`
- WASM: add ability to read the kind of an encrypted slot in a `CompactListExpander`
- ZK-POK: improve performance on WASM for browser execution
- ZK-POK: improve performance when proving fewer bits than a proof can hold
- ZK-POK: add versioning
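To illustrate why carry propagation sits on the critical path of add, sub, mul, div and comparisons, here is a toy plaintext model of a radix integer split into small blocks. It assumes 2 message bits per block and a simple sequential propagation; both are assumptions made for illustration, and the actual TFHE-rs algorithms operate on ciphertexts and are considerably more elaborate. All names below are hypothetical.

```rust
const MESSAGE_BITS: u32 = 2; // assumed block size, for illustration only
const MESSAGE_MODULUS: u8 = 1 << MESSAGE_BITS; // each clean block holds 0..=3

/// Split a value into little-endian blocks of MESSAGE_BITS bits each.
fn to_blocks(mut x: u16, num_blocks: usize) -> Vec<u8> {
    let mut blocks = Vec::with_capacity(num_blocks);
    for _ in 0..num_blocks {
        blocks.push((x % MESSAGE_MODULUS as u16) as u8);
        x /= MESSAGE_MODULUS as u16;
    }
    blocks
}

/// Recombine clean blocks (every block < MESSAGE_MODULUS) into a value.
fn from_blocks(blocks: &[u8]) -> u16 {
    blocks
        .iter()
        .rev()
        .fold(0u16, |acc, &b| acc * MESSAGE_MODULUS as u16 + b as u16)
}

/// Block-wise addition with no carry handling: a block may temporarily
/// exceed the message modulus, using its spare "carry" space.
fn add_blocks_no_propagation(a: &[u8], b: &[u8]) -> Vec<u8> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Sequential carry propagation: each block depends on the carry of the
/// previous one. The carry out of the last block is dropped (wrapping).
fn propagate_carries(blocks: &mut [u8]) {
    let mut carry = 0u8;
    for block in blocks.iter_mut() {
        let total = *block + carry;
        *block = total % MESSAGE_MODULUS;
        carry = total / MESSAGE_MODULUS;
    }
}

fn main() {
    let a = to_blocks(255, 8);
    let b = to_blocks(1, 8);
    let mut sum = add_blocks_no_propagation(&a, &b);
    println!("before propagation: {:?}", sum);
    propagate_carries(&mut sum);
    println!("after propagation:  {:?}", sum);
    println!("value: {}", from_blocks(&sum));
}
```

In this model, adding 1 to 255 makes the carry ripple through four blocks in a row, which is the kind of dependency chain that makes propagation expensive and worth optimizing.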
GPU
- Configure GPU parameters automatically to GPU multi-bit dedicated parameters
- Optimize integer scalar multiplication memory use on the GPU
- Optimize multiplication memory usage
- Speedup twiddles reads
- Pin bootstrap key host memory to speedup its copy to multiple GPUs
- Multi GPU: dispatch/gather inputs and outputs to the ks/pbs on all GPUs
- Implement FFT with reduced shared memory read/write
Fixes
CPU
- Fix wrong `Named` implementation for `CompressedCiphertextList`
- Fix Client/Server Key versioning
- Fix `CompactCiphertextList`'s `expand_with_key`, which could fail to expand lists in certain circumstances
- Remove double carry propagation in sub
- Versioning: fix the bounds added in the derived traits for the `Versionize` macro, which were sometimes unsatisfiable
GPU
- Fix add with 1 block
- Fix a memory error in multiplication
- Fix a memory error in scalar shifts
- Fix full propagation with 1 block
- Fix a memory error in bitnot
Resources
- Documentation: