Releases: intel/intel-ipsec-mb
Bertha
NIST CAVP for v2.0: Cryptographic Algorithm Validation Program CAVP Intel® Multi-Buffer Crypto for IPSec
Full Changelog: v1.5...v2.0
General
- OpenSSF scorecard badge added.
- YASM support removed.
- CMake library only build option added.
- CET support added to CMake build.
- Replaced Makefiles with CMake as default build system.
- Man pages installation path fixed.
- Improved CMake project definitions and installation paths.
- Added FreeBSD CMake builds to workflows.
- Updated style check to clang-format version 18.
- Marked direct API for wireless algorithms (KASUMI, SNOW3G and ZUC) as deprecated, to be removed in the next release.
Library
- AES-GCM changes
- Reduced binary size of AVX512 type 2 and AVX2 type 1 code by re-using internal GHASH functions.
- Optimized small packets for AVX512 type 2 (1 to 256 bytes).
- Removed specialized AVX512 type 1 implementation; AVX2 type 1 is used instead.
- Implemented multiply reduce optimization for GHASH AVX2 type 1.
- Slightly improved large buffer performance for AVX2 type 1.
- Added new AVX2 type 2 implementation.
- DES, 3DES/TDES and DES-DOCSIS binary size reduction.
- reduced stack frame size for DES and DES-DOCSIS.
- re-used common transpose macros in the implementation.
- Fixed LFSR update in single buffer ZUC API implementation.
- SM4 changes:
- Added SM4-CTR and SM4-GCM SSE implementations.
- Added AVX2-SM4-NI implementation for SM4-GCM, SM4-CTR, SM4-CBC and SM4-ECB.
- SHA2-512/384 changes:
- Added SHA2-512/384 update AVX2-SHA512-NI single-buffer implementation.
- Added SHA2-512/384 and HMAC-SHA2-512/384 AVX2-SHA512-NI x2 multi-buffer implementations.
- Added SM3 and SM3-HMAC SM3-NI implementations.
- Added AES-CFB SSE type 1 and AVX512 type 2 implementations.
- Removed features:
- Removed AESNI emulation support.
- Removed AVX Type 2 implementation.
- Removed AES-CMAC, AES-CCM, AES-CBC and AES-ECB x4 and by4 implementations from SSE type 1.
- They are replaced with x8 and by8 implementations from SSE type 3.
- Removed AVX type 1 implementations: SHA/MD5, CHACHA20-POLY1305, SNOW3G and KASUMI.
- Moved remaining AVX type 1 implementations into AVX2 type 1.
- Removed AVX architecture type.
- Changed SHA1 on AVX2 type 4 architecture to use multi-buffer implementation.
- Added check for XSAVE and OSXSAVE CPUID features for any AVX architecture type.
- Extended cipher burst API support with: AES-ECB, AES-CFB.
- Extended hash burst API support with: SHA1, SHA2-384/512, AES-CMAC.
- Added AEAD burst API with AES-CCM support.
- Added new API to retrieve optimal minimum burst size for hash, cipher and AEAD APIs.
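The XSAVE and OSXSAVE check noted above reflects the usual rule for AVX detection: the AVX feature bit alone is not sufficient, because the operating system must also have enabled extended state saving. A minimal sketch of that bit test on the CPUID leaf 1 ECX value follows. It assumes the standard bit positions (26 for XSAVE, 27 for OSXSAVE, 28 for AVX); the function name `avx_usable` is hypothetical and this is not the library's code. A complete check would also read XCR0 through XGETBV, which is omitted here.

```python
# Illustrative only: bit positions as documented for CPUID leaf 1, register ECX.
XSAVE_BIT = 1 << 26    # processor supports XSAVE/XRSTOR
OSXSAVE_BIT = 1 << 27  # OS has enabled XSAVE (CR4.OSXSAVE)
AVX_BIT = 1 << 28      # processor supports AVX


def avx_usable(cpuid1_ecx: int) -> bool:
    """Return True only if AVX, XSAVE and OSXSAVE are all reported."""
    required = XSAVE_BIT | OSXSAVE_BIT | AVX_BIT
    return (cpuid1_ecx & required) == required
```

With this rule, a CPU that reports AVX but runs under an OS that has not enabled XSAVE is treated as not AVX-capable.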
Test Applications
- Reduced false positive hit ratio in the cross validation safe check mode.
- Improved performance of safe check pattern search in the cross validation tool.
- Added new test vectors to KAT application for AES-CFB, SM4-CTR and SM4-GCM.
- Added new multi-process test to exercise active-passive scenarios.
- Removed Makefile support.
- Removed AVX architecture type.
- Added tests for AES-CFB.
- Added burst API tests for SHA1, SHA2, AES-ECB, AES-CFB, AES-CMAC and AES-CCM.
- Added AES-CFB to ACVP application.
- Extended ctest infrastructure with improved test granularity.
Performance Applications
- Removed Makefile support.
- Removed AVX architecture type.
- Added display of time-box and measurement mode details at start.
- Added burst API tests for SHA1, SHA2, AES-ECB, AES-CFB, AES-CMAC and AES-CCM.
- Added new throughput test mode option to imb-perf.
- It works together with the new set time box option to report throughput for a selected period of time.
- Added imb-speed.py tool that mimics openssl speed.
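The idea of a time-boxed throughput measurement can be sketched as follows: run the operation repeatedly until a chosen period has elapsed, then divide the bytes processed by the elapsed time. This is a generic illustration, not how imb-perf is implemented; the function name `timed_throughput` and its parameters are hypothetical.

```python
import time


def timed_throughput(op, buf: bytes, time_box_s: float, clock=time.perf_counter):
    """Call op(buf) repeatedly for about time_box_s seconds; return bytes/second."""
    start = clock()
    deadline = start + time_box_s
    processed = 0
    while True:
        op(buf)
        processed += len(buf)
        now = clock()
        if now >= deadline:
            # Use the measured elapsed time, which may slightly exceed the box.
            return processed / (now - start)
```

The `clock` parameter is injectable so the loop can be exercised deterministically with a fake clock.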
Example Applications
- Removed Makefile support.
Resolved Issues
- Version 1.5 fails to build on FreeBSD 13.2 (amd64) using CMake (issue #136)
- Make CMake builds behave more "normal" (issue #141)
- printf in lib code prevents using ipsec-mb in SGX environment (issue #142)
- EEA3(ZUC) 1 Buffer implementation LFSR update can result in invalid LFSR state, causing incorrect keystream generation (issue #144)
- Possible regression: init_mb_mgr_avx() corrupts state on Windows (issue #147)
- Crash seen on VMware with dpdk crypto using ipsec-mb library (issue #153)
Gotthard-Basistunnel
NIST CAVP for v1.5: Cryptographic Algorithm Validation Program CAVP Intel® Multi-Buffer Crypto for IPSec
Full Changelog: v1.4...v1.5
General
- CMake MinGW support added.
Library
- QUIC CHACHA20-POLY1305 and CHACHA20 HP API added.
- AVX2-VAES AES-CTR implementation added.
- SM4-ECB SSE implementation added.
- SM4-CBC SSE implementation added.
- x86-64 SM3 and SM3-HMAC implementation added.
- Self-Test callback functionality added with message corrupt option.
- Implemented AES-GCM with VAES AVX2.
- Implemented AES-CTR with VAES AVX2.
- Implemented a workaround for false load-block condition in SSE AES-CBC implementations.
- Optimized CRC32 algorithms.
- Optimized AES-GCM AVX2 and AVX512 implementations.
Test Applications
- QUIC CHACHA20-POLY1305 and CHACHA20 HP tests added.
- SM4-ECB and SM4-CBC tests added.
- SM3 and SM3-HMAC tests added.
- Self-Test callback support added to the KAT-APP.
- Updated ACVP app (imb-acvp) to support libacvp v2.0+.
- Test vectors standardized for various algorithms (CBC/CFB/CTR/ECB/DES/GCM/CCM/CHACHA20-POLY/SNOW3G/ZUC/KASUMI/SNOW-V).
- Extended xvalid app to test burst API.
Performance Application
- New parameter added to benchmark QUIC (--quic-api).
- Burst API is benchmarked by default now.
- SM4-ECB and SM4-CBC support added.
- SM3 and SM3-HMAC support added.
Resolved Issues
- CMake files ignore LIB_INSTALL_DIR and incorrectly put the shared libraries in /usr/lib (issue #125)
- the CMakefile does not install the headers (normal Makefile does) (issue #126)
- File ./test/acvp-app/utils.o is not removed after "make clean" (issue #130)
- nasm can not find .inc .asm files when building with CMake (issue #131)
Derinkuyu
NIST CAVP for v1.4: Cryptographic Algorithm Validation Program CAVP Intel® Multi-Buffer Crypto for IPSec
Full Changelog: v1.3...v1.4
General
- Experimental CMake support for Linux, FreeBSD and Windows added
Library
- POLY1305 AVX2 with AVX-IFMA instructions added.
- Optimized GHASH component in AVX512 VAES (type2) AES-GCM implementation.
- Implemented a workaround for false load-block condition in SSE and AVX2 AES-GCM implementations.
- Removed AVX AES-GCM implementation; its API symbols now map to the SSE implementation.
- QUIC header protection API added.
- QUIC AES-GCM-128/256 AEAD API added.
- Removed v0.53 (and older) compatibility symbol mapping (NO_COMPAT_IMB_API_053 not defined).
- ZUC AVX2-GFNI implementation added.
- SHA-NI instructions enabled for use in SHA1/224/256 direct API.
- New API (imb_set_session) added to be used with burst API, helping to speed up crypto scheduling.
- New API added to calculate IPAD/OPAD for SHAx-HMAC.
- New direct API added to calculate DES-CFB and AES-CFB-256 on a single block.
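The IPAD/OPAD entry above refers to the standard HMAC optimization: the key XOR 0x36 and key XOR 0x5C blocks are hashed once per key, and the resulting hash states are reused for every message. A minimal sketch of that precomputation using Python's `hashlib` follows. The helper names are hypothetical, and this illustrates the technique only, not the library's API or its output format.

```python
import hashlib
import hmac


def hmac_precompute(key: bytes, hash_name: str = "sha256"):
    """Return (inner, outer) hash contexts primed with key^ipad and key^opad."""
    block_size = hashlib.new(hash_name).block_size
    if len(key) > block_size:
        key = hashlib.new(hash_name, key).digest()  # long keys are hashed first
    key = key.ljust(block_size, b"\x00")
    inner = hashlib.new(hash_name, bytes(b ^ 0x36 for b in key))
    outer = hashlib.new(hash_name, bytes(b ^ 0x5C for b in key))
    return inner, outer


def hmac_from_precomputed(inner, outer, message: bytes) -> bytes:
    """Finish HMAC for one message without re-hashing the key blocks."""
    i = inner.copy()
    i.update(message)
    o = outer.copy()
    o.update(i.digest())
    return o.digest()
```

The result can be compared against `hmac.new(key, message, hash_name).digest()` to confirm that the precomputed states yield the same tag.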
Test Applications
- ACVP test application extended to support: AES-ECB and 3DES-CBC.
- Added sample applications showcasing how to use the new burst API.
- CMake support added, including ability to run tests with it.
- Extended fuzzing app to cover remaining direct APIs.
- Test vectors standardized for various algorithms (SHA/XCBC/POLY1305/CMAC/GMAC/GHASH/HMAC-SHAx/MD5).
- Changed test directory structure and test application names. Each test application has its own subdirectory.
Performance Application
- New parameter added to benchmark crypto on unaligned buffers
- Renamed performance application to imb-perf
Resolved Issues
Drammen Spiral
NIST CAVP for v1.3: Cryptographic Algorithm Validation Program CAVP Intel® Multi-Buffer Crypto for IPSec
Full Changelog: v1.2...v1.3
Library
- ZUC-EIA3-256 8-byte and 16-byte tag support added for SSE, AVX, AVX2 and AVX512
- AES-ECB AVX512-VAES implementation added
- AES-ECB optimizations for AVX and SSE
- AES-ECB AVX2-VAES implementation added
- JOB API GHASH support added
- SHA1/224/256/384/512 multi-buffer implementation added
- Multi-buffer SHA1, SHA224 and SHA256 use SHANI if available
- Synchronous cipher and hash burst API added
- cipher API only supports AES-CBC and AES-CTR
- hash API only supports HMAC-SHA1, HMAC-SHA224, HMAC-SHA256, HMAC-SHA384 and HMAC-SHA512
- Asynchronous burst API added that supports all cipher and hash modes
- SNOW3G-UEA2 SSE multi-buffer implementation added
- SNOW3G-UIA2 SSE multi-buffer initialization and key-stream generation added
- SNOW3G-UEA2 and SNOW3G-UIA2 SSE implementation used in JOB API for AVX and AVX2 architectures
- API documentation added (doxygen generated)
- New SGL job API (AES-GCM and CHACHA20-POLY1305 only)
- Enforced EVEX PMADD52 encoding in AVX512 code
- Restructured reset flow of architecture managers
- SSE, AVX, AVX2 and AVX512 managers were split to better cover different types
- Added library self-test functionality
- endbranch64 not emitted on Windows builds (CET related)
- use SHANI extensions in AVX2 type-2 and AVX type-2 for SHA224, HMAC-SHA224, SHA256 and HMAC-SHA256
- use SHANI extensions in AVX type-2 for SHA1, HMAC-SHA1
- no-GFNI option added to help with testing
Test Applications
- GHASH JOB API support added in the test application, fuzzing and xvalid tools
- Burst API support added for supported algorithms
- ACVP test application extended to support: AES-GCM, AES-GMAC, AES-CCM, AES-CBC, AES-CTR, AES-CMAC, SHA1, SHA224, SHA256, SHA384, SHA512, HMAC-SHA1, HMAC-SHA224, HMAC-SHA256, HMAC-SHA384, HMAC-SHA512
- Cross validation (xvalid) tool improvements in pattern search functionality
- FreeBSD added to github CI
- Added AVX-SSE transition check to the cross validation tool (xvalid)
- Wycheproof AES-GCM, AES-CCM, CHACHA20-POLY1305, AES-CMAC, AES-GMAC, HMAC-SHA1, HMAC-SHA224, HMAC-SHA256, HMAC-SHA384 and HMAC-SHA512 test vectors added to a new test tool
- no-GFNI option added
Performance Application
- GHASH support added (through JOB and direct API)
- CHACHA20-POLY1305 support through direct API
- Support added for SHA1/224/256/384/512
- Burst API support added for supported algorithms
- SGL support added (AES-GCM and CHACHA20-POLY1305 only)
- no-GFNI option added
Resolved Issues
- Fixed 23-byte IV expansion for ZUC-256 (issue #102)
- Fixed incorrect 8-buffer SNOW3G key-stream generation (issue #104)
- Numerous AVX-SSE transition fixes with SAFE_OPTIONS=n
- [ZUC-EIA3] allow unaligned digest load/stores
- AES-CCM authentication flush may load out of scope data (issue #107)
- AES-CMAC authentication flush may load out of scope data (similar to issue #107)
Moll's Gap Tunnel
Full Changelog: v1.1...v1.2
General
- Windows CET support
- Disable build of AESNI emulation support by default and make it optional
Performance Application
- SGL API support for GCM added
Library
- Generation of PDB in release build on Windows added
- SAFE_OPTIONS option added to unify control of SAFE_DATA, SAFE_PARAM, SAFE_LOOKUP options
- Improved performance of SAFE_DATA=y
Test Applications
- Extended invalid IV length tests
- Test application improvements
- Fuzz testing tool improvements
- Auto-generation of direct API invalid parameters tests added
- ACVP test application added
Resolved Issues
- Fixed incorrect job length calculation in CBCS encryption
- Fixed FreeBSD build (#94)
- Added missing checks for HMAC IPAD and OPAD
- Added missing checks for XCBC K1, K2 and K3
Known Issues
Azna Snow Tunnel
Full Changelog: v1.0...v1.1
General
- Added support to build with Mingw-w64 on Windows
Library
- PON algorithm AVX512-VAES implementation added
- SNOW3G-UIA2 AVX512 and AVX512-VPCLMULQDQ implementations added
- SNOW3G-UEA2 AVX512 and AVX512-VAES implementations added
- SNOW-V AVX implementation added
- ZUC optimizations for AVX512
- GCM optimizations for AVX512
- Poly1305 optimizations for AVX512-VAES
- Improved error code handling
- ZUC-256 23-byte IV support added
Test Applications
- Error handling tests added (for job and direct API)
- Fuzz testing added
Resolved Issues
The Eysturoy Tunnel
General
- Top level lib directory tidy up
- build scripts and header file left at the top level
- lib/x86_64 directory created
- files requiring compilation moved from lib/include
- Symbols not stripped from static library at installation
- API name changes and unification
- mapping provided for backwards compatibility
- NASM version check in the build script
- CET enabling in the build scripts
Library
- CET enabling (endbranch opcodes added)
- ZUC-EIA3-256 support for SSE, AVX, AVX2 and AVX512 (VAES)
- 4 byte tag length only
- Chacha20 optimizations for SSE, AVX and AVX2
- ZUC-EEA3-256 support for SSE, AVX, AVX2 and AVX512 (VAES)
- SNOW-V and SNOW-V-AEAD support for SSE
- Poly1305 AVX512 and AVX512-IFMA implementations added
- Chacha20-Poly1305 AEAD implementations extended to AVX512 and AVX512-IFMA
- CBCS AVX512 optimizations
- Extended CBCS to return last cipher block to maintain context between calls
- AVX/SSE transition fixes
- Added SGL support for AEAD Chacha20-Poly1305
- Poly1305 minor optimization in the scalar code
- GHASH API change
- IFMA CPU feature detection
- SGL support added for AES-GCM through job API
- Added CRC functions through job API
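When validating CRC results such as those produced by the CRC functions above, a small software reference is useful. The sketch below shows the IEEE 802.3 (Ethernet) CRC-32 via `zlib` and a bitwise CRC-16 with the CCITT polynomial 0x1021. The exact initial values and bit ordering used by the library for each CRC type are not stated in these notes, so the parameters here (initial value 0xFFFF, MSB first, no final XOR) are assumptions, and the function names are hypothetical.

```python
import zlib


def crc16_ccitt(data: bytes, init: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, no final XOR."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def crc32_ethernet(data: bytes) -> int:
    """IEEE 802.3 CRC-32, as computed by zlib."""
    return zlib.crc32(data) & 0xFFFFFFFF
```

Both functions can be checked against the conventional "123456789" check string before being used to cross-check other implementations.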
Test Applications
- ZUC-EEA3-256 tests added to test and xvalidation applications
- SNOW-V and SNOW-V-AEAD tests added to test and xvalidation applications
- IMIX support added to the xvalidation application
- AEAD Chacha20-Poly1305 tests added
Performance Application
- ZUC-EEA3-256 support added
- ZUC-EIA3-256 support added
- SNOW-V and SNOW-V-AEAD support added
- AEAD Chacha20-Poly1305 support added
- Created ipsec_perf_tool.py to run multiple ipsec_perf instances at the same time
- DOCSIS cipher combined with CRC32 treated as AEAD algorithm
- CRC functions added
API Changes
- #71 IMB_GHASH API now takes an input digest from the fifth argument and outputs the new digest on that argument too.
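Passing the digest as both input and output is what allows GHASH to be computed incrementally over several buffers. The following sketch models that in/out behaviour with a plain Python GHASH, using the GF(2^128) multiplication defined for GCM. The names `gf128_mul` and `ghash_update` are hypothetical and do not reflect the IMB_GHASH signature; the sketch only illustrates why carrying the digest between calls gives the same result as a single call over the concatenated data.

```python
R = 0xE1 << 120  # GCM reduction constant for x^128 + x^7 + x^2 + x + 1


def gf128_mul(x: int, y: int) -> int:
    """Multiply two GF(2^128) elements using GCM's bit ordering."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:  # bit 0 is the most significant bit
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def ghash_update(h: bytes, data: bytes, digest: bytes = bytes(16)) -> bytes:
    """Fold `data` into `digest` and return the new digest (in/out style)."""
    hk = int.from_bytes(h, "big")
    y = int.from_bytes(digest, "big")
    for off in range(0, len(data), 16):
        block = data[off:off + 16].ljust(16, b"\x00")  # zero-pad last block
        y = gf128_mul(y ^ int.from_bytes(block, "big"), hk)
    return y.to_bytes(16, "big")
```

Provided every segment except the last is a multiple of 16 bytes, `ghash_update(h, b, ghash_update(h, a))` equals `ghash_update(h, a + b)`.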
Resolved Issues
- #73 static analysis warnings on today's tip
- #75 build errors when building on Ubuntu 21.04 Hirsute Hippo
- #76 potential uninitialized value error found by static analysis
- #77 dead code warnings from static analysis on test/api_test.c
- #78 deadcode warnings in perf/ipsec_perf.c
- #79 potential logical error in if statements
- #80 potential infinite for-loop
- #81 potential null pointer dereference if malloc fails
Lunar Lava Tube
General
- Restructured project to move all library code into new lib directory
- Renamed LibPerfApp directory to perf
- Renamed LibTestApp directory to test
- Improved FreeBSD support (port)
- Faster Windows rebuild time (dependencies)
Library
- AES-CCM-256 implementation for SSE, AVX and AVX512 (VAES)
- AES-CMAC-256 implementation for SSE, AVX and AVX512 (VAES)
- 32bit and 64bit HEC compute API added
- AES-GMAC direct API added to support Scatter-Gather list (SGL)
- CALC_AAD_HASH macro improved for AVX512 (VAES), boosting performance for AES-GMAC, GHASH and hash calculation for AAD in AES-GCM
- ZUC-EEA3 and ZUC-EIA3 Multi-buffer implemented for SSE using GFNI instructions
- AES-XCBC-128 implementation for AVX512 (VAES)
- AES-CBCS-128 implementation for SSE, AVX and AVX512 VAES (1:9 crypt:skip pattern)
- Chacha20 SSE, AVX and AVX512 implementations
- Automatic multi-buffer manager initialization API added (auto feature detection)
- Error handling API added
- Improved input parameter checking
- Build with SAFE_DATA and SAFE_PARAM options by default (can be turned off if required)
- Poly1305 scalar implementation
- AEAD Chacha20-Poly1305 implementation
- CRC implementation for RNC, LTE, WiMAX, SCTP, Ethernet and CRC16 CCITT
- Optimizations for ZUC, AVX512 VAES AES-CBC encryption and SNOW3G authentication
- Optimizations of DOCSIS + CRC32 for AVX512 VAES
- Faster PCLMULQDQ emulation for platforms that don't support it
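The 1:9 crypt:skip pattern mentioned for AES-CBCS-128 above means that, out of every ten 16-byte blocks, one is encrypted and the next nine are left in the clear. A short sketch that lists the byte ranges selected by such a pattern follows. The function name is hypothetical, and the treatment of a trailing partial block (left unencrypted) is an assumption rather than something stated in these notes.

```python
BLOCK = 16  # AES block size in bytes


def cbcs_encrypted_ranges(length: int, crypt: int = 1, skip: int = 9):
    """Return (offset, size) byte ranges that a crypt:skip pattern encrypts."""
    ranges = []
    period = (crypt + skip) * BLOCK
    for start in range(0, length, period):
        size = min(crypt * BLOCK, length - start)
        size -= size % BLOCK  # assumption: a partial block stays in the clear
        if size:
            ranges.append((start, size))
    return ranges
```

For a 320-byte buffer this selects the blocks at offsets 0 and 160, which is two of the twenty blocks.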
Test Applications
- CCM tests extended to test AES-CCM-256
- CMAC tests extended to test AES-CMAC-256
- HEC tests added to test app
- AES-GMAC SGL tests added to test app
- AES-XCBC-128 tests added to test app
- AES-CBCS-128 tests added to test app
- AES-CBCS-128 support added to cross validation app
- Chacha20 tests added to test and cross validation app
- Poly1305 tests added to test and cross validation app
- AEAD Chacha20-Poly1305 tests added to test and cross validation app
- CRC tests added
- Automatic architecture detection done by default
- Unified test result reporting
- Extended negative tests
Performance Application
- AES-CCM-256 support added
- AES-CMAC-256 support added
- AES-CBCS-128 support added
- Chacha20 support added
- Poly1305 support added
- AEAD Chacha20-Poly1305 support added
- Packet mix option added
Resolved Issues
- #47 Wrong tag calculation on GCM/GMAC when AAD size >= 512 MB
- #48 Potentially unsupported MOVBE instruction used in HEC compute API
- #49 redundant assignment of bytes_left
- #51 Many errors when building with gcc-10
- #54 Uninitialized pointer reads on arrays pSrcData and pDstData
- #55 Memory leak of allocations pointed to by variant_list[i].avg_times
- #56 Suggestion: maybe memset'ing job_template in do_test in /perf/ipsec_perf.c is a good idea
- #57 Possible cut-n-paste typo in snow3g_test.c
- #58 Windows compilation broken with DEBUG=y
- #59 IMB_SNOW3G_F8_4_BUFFER() Direct API failing on Windows with VS2017+
- #63 Found a bunch of spelling mistakes.
- #69 freebsd: error: duplicate symbol: imb_errno
- #70 VAES DOCSIS encryption combined with CRC32 flush problem (13 or more jobs)
Wuhan Yangtze River Tunnel
General
- ZUC-EEA3 and ZUC-EIA3
- ZUC-EEA3 and ZUC-EIA3 algorithms added in job API (using cipher mode IMB_CIPHER_ZUC_EEA3 and hash_alg IMB_AUTH_ZUC_EIA3_BITLEN)
- ZUC-EIA3 Multi-buffer API added and implemented for SSE and AVX.
- ZUC-EEA3 and ZUC-EIA3 Multi-buffer implemented for AVX2 and AVX512
- For AVX512, using the newer GFNI and VAES instructions where these are present in the CPU
- ZUC-EEA3 and ZUC-EIA3 Multi-buffer implemented with AESNI emulation instructions.
- SNOW3G-UEA2 and SNOW3G-UIA2
- SNOW3G-UEA2 and SNOW3G-UIA2 algorithms added in job API (using cipher type IMB_CIPHER_SNOW3G_UEA2_BITLEN and hash type IMB_AUTH_SNOW3G_UIA2_BITLEN)
- SNOW3G-UIA2 and SNOW3G-UEA2 reimplemented for increased security and performance.
- KASUMI-UEA1 and KASUMI-UIA1 algorithms added in job API (using cipher type IMB_CIPHER_KASUMI_UEA1_BITLEN and hash type IMB_AUTH_KASUMI_UIA1)
- DOCSIS
- AVX512 implementation of stitched DOCSIS cipher with CRC32 calculations
- AES256-DOCSIS algorithm added.
- New GHASH API added
- Added support for any IV size in AES-GCM, through the job API and new direct API
- VAES related
- AES-CMAC implementation for VAES added
- AES-CBC improvement for VAES
- AES-CCM implementation for VAES added
- SSE AES by8/x8 implementations added
- AES128-CTR, AES192-CTR, AES256-CTR and AES128-CCM (by8)
- AES128-CBC, AES192-CBC, AES256-CBC, DOCSIS SEC BPI and AES-CMAC (x8)
- AES-CCM (by8 and x8)
- Build
- Check for new flag NO_COMPAT_IMB_API_053, which exposes only new API, removing backwards compatibility with version v0.53
- Minimum required version for NASM is now 2.14.
- Removed NO_GCM compile flag
- Removed GCM_BIG_DATA compile flag
LibTestApp
- Extended ZUC tests to validate ZUC-EEA3 and ZUC-EIA3 algorithms through job API
- Extended SNOW3G tests to validate SNOW3G-UEA2 and SNOW3G-UIA2 algorithms through job API
- Extended DOCSIS tests with combined CRC32 calculation cases
- Extended KASUMI tests to validate KASUMI-UEA1 and KASUMI-UIA1 algorithms through job API
- Extended ZUC tests to validate ZUC-EIA3 multi-buffer implementation through direct and job API
- Extended AES-DOCSIS tests with 256-bit keys
LibPerfApp
- Added support for ZUC-EEA3 and ZUC-EIA3 algorithms
- Added support for SNOW3G-UEA2 and SNOW3G-UIA2 algorithms
- Added support for DOCSIS combined with CRC32
- Added support for KASUMI-UEA1 and KASUMI-UIA1 algorithms
Performance
- ZUC performance improvements
- AES-CCM, AES-CMAC implemented for VAES
- AES-CBC improvement for VAES
- SSE by8/x8 implementations of AES-CBC, AES-CTR, AES-CCM, AES-CMAC and AES-DOCSIS
Resolved Issues
- #40 CentOS 7 & gcc4.8 compilation problem
- #41 uint128_t definition in /usr/include/intel-ipsec-mb.h clashes with /usr/include/bluetooth/bluetooth.h
- #43 Block count may be incremented incorrectly in AES-CTR bug
Tunèl de Vielha
General
- AES-CCM performance optimizations done
- full assembly implementation
- authentication decoupled from cipher
- CCM chain order expected to be HASH_CIPHER for encryption and CIPHER_HASH for decryption
- AES-CTR implementation for VAES added
- AES-CBC implementation for VAES added
- Single buffer AES-GCM performance improvements added for VPCLMULQDQ + VAES
- Multi-buffer AES-GCM implementation removed
- Data transposition optimizations and unification across the library implemented
- Generation of make dependency files for Linux added
- AES-ECB implementation added
- PON specific stitched algorithm implementation added
- stitched AES-CTR-128 (optional) with CRC32 and BIP (running 32-bit XOR)
- AES-CMAC-128 implementation for bit length messages added
- ZUC-EEA3 and ZUC-EIA3 implementation added
- FreeBSD experimental support added
- KASUMI-F8 and KASUMI-F9 implementation added
- SNOW3G-UEA2 and SNOW3G-UIA2 implementation added
- AES-CTR implementation for bit length (128-NEA2/192-NEA2/256-NEA2) messages added
- SAFE_PARAM, SAFE_DATA and SAFE_LOOKUP compile time options added. Find more about these options at https://github.com/intel/intel-ipsec-mb/blob/master/README.
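The BIP value named in the PON entry above is described as a running 32-bit XOR. A minimal sketch of such a computation over a byte buffer follows. The big-endian word order and zero-padding of a trailing partial word are assumptions made for illustration; the PON frame layout and the library's actual handling are not modeled, and `bip32` is a hypothetical name.

```python
import struct


def bip32(data: bytes) -> int:
    """Running XOR of consecutive 32-bit big-endian words (zero-padded tail)."""
    padded = data + b"\x00" * (-len(data) % 4)
    acc = 0
    for (word,) in struct.iter_unpack(">I", padded):
        acc ^= word
    return acc
```

Because XOR is its own inverse, a buffer made of the same word-aligned data repeated twice yields a BIP of zero, which is a convenient sanity check.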
LibTestApp
- CMAC test vectors extended
- New chained operation tests added
- Out-of-place chained operation tests added
- AES-ECB tests added
- PON algorithm tests added
- Extra AES-CTR test vectors added
- Extra AES-CBC test vectors added
- AES-CMAC-128 bit length message tests added
- CPU capability detection used to disable tests if instruction not present
- ZUC-EEA3 and ZUC-EIA3 tests added
- New cross architecture test application (ipsec_xvalid) added, which mixes different implementations (based on different architectures), to double check their correctness
- SNOW3G-UEA2 and SNOW3G-UIA2 tests added
- AES-CTR-128 bit length message tests added
- Negative tests extended to cover all APIs
LibPerfApp
- Job size and number of iterations options added
- Single architecture test option added
- AAD size option added
- Allow zero length source buffer option added
- Custom performance test combination added: cipher-algo, hash-algo and aead-algo arguments.
- Cipher direction option added
- The maximum buffer size extended from 2K to 16K
- Support for user defined range of job sizes added
Performance
- AES-CCM optimized
- New AES-GCM, AES-CBC and AES-CTR implementations for VAES and VPCLMULQDQ extensions