AES-GCM AArch64: Store swapped Htable values (#1403)
AArch64 assembly implementations of AES-GCM in AWS-LC use an "H-Table" to precompute and cache computations that are common to multiple invocations of AES-GCM under the same key, thereby improving performance. The main example of such precomputation is the powers of the H value used in the GHASH algorithm, which give the H-Table its name.

Despite the name, the structure of the H-Table is opaque to the code invoking AES-GCM, and implementations are free to populate it with arbitrary data. This freedom is already being used: currently, the AArch64 implementation of AES-GCM stores not only the powers of H in the H-Table (H1-H8 in the code), but also their "Karatsuba preprocessing" values, which are the EORs of the low and high halves of each power. These values arise naturally when Karatsuba's algorithm is used to reduce a 128-bit polynomial multiplication over GF(2) to three 64-bit polynomial multiplications.

This commit changes the structure of the H-Table for AArch64 implementations of AES-GCM slightly to obtain a small performance gain. Every time a power of H (H1-H8) is loaded from the H-Table, the first operation applied to it in both aesv8-gcm-armv8.pl and aesv8-gcm-armv8-unroll8.pl is a swap of its low and high halves via `ext arg.16b, arg.16b, arg.16b, #8`. Those swaps can be precomputed and the H{1-8} values stored in swapped form in the H-Table, thereby eliminating the swaps from the critical loop of AES-GCM.

This gives a small performance gain for AES-GCM on Graviton3, at the cost of slightly slower one-off initialization. On Graviton2, the AES-GCM AArch64 assembly loads the H-Table only once, outside of the critical loop; hence, there is no performance benefit there.
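
The idea of storing each power of H in swapped form, next to its Karatsuba term, can be modeled in a few lines of Python. This is only an illustrative sketch, not the AWS-LC code: the function names and the `(swapped H, lo ^ hi)` entry layout are assumptions made for the example, integers stand in for 128-bit vector registers, and GHASH's bit ordering and the reduction modulo the GCM polynomial are left out entirely.

```python
MASK64 = (1 << 64) - 1


def clmul(a, b, bits):
    """Carry-less (GF(2)[x]) product of a and b, scanning `bits` bits of b."""
    r = 0
    for i in range(bits):
        if (b >> i) & 1:
            r ^= a << i
    return r


def swap_halves(x):
    """Model `ext v.16b, v.16b, v.16b, #8`: exchange the two 64-bit halves."""
    return ((x & MASK64) << 64) | (x >> 64)


def precompute_entry(h):
    """Hypothetical table entry: H in swapped form plus its Karatsuba term.

    The swap is paid once here, at initialization, instead of on every
    load inside the main loop.
    """
    return swap_halves(h), (h & MASK64) ^ (h >> 64)


def karatsuba_mul(x, entry):
    """Unreduced 128x128 -> 256-bit carry-less product using a table entry."""
    h_swapped, h_k = entry
    # After the precomputed swap, the low lane holds the original high half.
    h_hi = h_swapped & MASK64
    h_lo = h_swapped >> 64
    x_lo, x_hi = x & MASK64, x >> 64

    lo = clmul(x_lo, h_lo, 64)
    hi = clmul(x_hi, h_hi, 64)
    # Third multiplication uses the precomputed lo ^ hi of H.
    mid = clmul(x_lo ^ x_hi, h_k, 64) ^ lo ^ hi
    return (hi << 128) ^ (mid << 64) ^ lo
```

In this model, `karatsuba_mul` never calls `swap_halves`; the only swap happens in `precompute_entry`, which mirrors how the commit moves the `ext` out of the critical loop and into one-off initialization.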
1 parent a0e8da9, commit 90315e2
Showing 14 changed files with 188 additions and 742 deletions.