Suggestion: Add CHACHA20-POLY1305 and TLS_CHACHA20_POLY1305_SHA256 for speed on Raspberry Pi etc #1244

Open
brozkeff opened this issue Aug 17, 2024 · 1 comment

Comments

brozkeff commented Aug 17, 2024

Modern versions of OpenSSL and OpenVPN support new ciphers based on ChaCha20-Poly1305.

Using them instead of AES-GCM significantly improves performance, especially on Raspberry Pis or other low-end CPUs that lack hardware AES acceleration.

These ciphers are supported from OpenVPN 2.5 onward and need a compatible SSL library, so an option that offers them should first check that the running system supports them.
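
A minimal sketch of such a check, assuming the installer can read the cipher listings printed by `openvpn --show-ciphers` and `openvpn --show-tls`. The helper names `list_has` and `choose_ciphers` are hypothetical and not part of openvpn-install.sh; the `CIPHER` and `CC_CIPHER` values are the ones used in the patch in this issue.

```shell
#!/usr/bin/env bash

# list_has LISTING NAME: succeed if NAME occurs in LISTING (literal, case-insensitive).
list_has() {
	printf '%s\n' "$1" | grep -qiF -- "$2"
}

# choose_ciphers DATA_LISTING TLS_LISTING: set CIPHER and CC_CIPHER.
# ChaCha20-Poly1305 is used only if both listings advertise it;
# otherwise the script's current AES-128-GCM defaults are kept.
choose_ciphers() {
	local want_data="CHACHA20-POLY1305"
	local want_tls="TLS-ECDHE-ECDSA-WITH-CHACHA20-POLY1305-SHA256"
	if list_has "$1" "$want_data" && list_has "$2" "$want_tls"; then
		CIPHER="$want_data"
		CC_CIPHER="$want_tls"
	else
		CIPHER="AES-128-GCM"
		CC_CIPHER="TLS-ECDHE-ECDSA-WITH-AES-128-GCM-SHA256"
	fi
}

# On a real system the listings would come from OpenVPN itself, e.g.:
#   choose_ciphers "$(openvpn --show-ciphers 2>/dev/null)" "$(openvpn --show-tls 2>/dev/null)"
```

Passing the listings in as arguments keeps the decision logic separate from the `openvpn` calls, so it can be exercised on a machine without OpenVPN installed. A real option would probably also let the user override the automatic choice.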

Below is a quick and dirty patch I used to test it and deploy on an RPi (it just overwrites the default parameters). The proper way would of course be a new selectable option, checks that the algorithms are supported on the particular system, etc.

diff ./original/openvpn-install.sh ./patched/openvpn-install.sh 
393,394c393,394
< 		# Use default, sane and fast parameters
< 		CIPHER="AES-128-GCM"
---
> 		# Use modern ChaCha20-Poly1305 for faster operation on Raspberry Pi or other low-end HW
> 		CIPHER="CHACHA20-POLY1305"
397c397
< 		CC_CIPHER="TLS-ECDHE-ECDSA-WITH-AES-128-GCM-SHA256"
---
> 		CC_CIPHER="TLS-ECDHE-ECDSA-WITH-CHACHA20-POLY1305-SHA256"

A test comparison on an RPi 4 with 1 GB RAM shows ChaCha20-Poly1305 is about 4.5× faster than AES-128-GCM at block sizes of 1024 bytes and above (about 1.8× faster at 16 bytes):

$ openssl speed -elapsed -evp aes-128-gcm
You have chosen to measure elapsed time instead of user CPU time.
Doing AES-128-GCM for 3s on 16 size blocks: 8399910 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 64 size blocks: 2530334 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 256 size blocks: 762014 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 1024 size blocks: 200959 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 8192 size blocks: 25868 AES-128-GCM's in 3.00s
Doing AES-128-GCM for 3s on 16384 size blocks: 12978 AES-128-GCM's in 3.00s
version: 3.0.13
built on: Mon Apr 29 14:39:02 2024 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -fzero-call-used-regs=used-gpr -DOPENSSL_TLS_SECURITY_LEVEL=2 -Wa,--noexecstack -g -O2 -ffile-prefix-map=/build/openssl-928mA1/openssl-3.0.13=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
CPUINFO: OPENSSL_armcap=0x81
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
AES-128-GCM      44799.52k    53980.46k    65025.19k    68594.01k    70636.89k    70877.18k

$ openssl speed -elapsed -evp chacha20-poly1305
You have chosen to measure elapsed time instead of user CPU time.
Doing ChaCha20-Poly1305 for 3s on 16 size blocks: 15080966 ChaCha20-Poly1305's in 3.00s
Doing ChaCha20-Poly1305 for 3s on 64 size blocks: 6477209 ChaCha20-Poly1305's in 3.00s
Doing ChaCha20-Poly1305 for 3s on 256 size blocks: 2940868 ChaCha20-Poly1305's in 3.00s
Doing ChaCha20-Poly1305 for 3s on 1024 size blocks: 909247 ChaCha20-Poly1305's in 3.00s
Doing ChaCha20-Poly1305 for 3s on 8192 size blocks: 118162 ChaCha20-Poly1305's in 3.00s
Doing ChaCha20-Poly1305 for 3s on 16384 size blocks: 58899 ChaCha20-Poly1305's in 3.00s
version: 3.0.13
built on: Mon Apr 29 14:39:02 2024 UTC
options: bn(64,64)
compiler: gcc -fPIC -pthread -Wa,--noexecstack -Wall -fzero-call-used-regs=used-gpr -DOPENSSL_TLS_SECURITY_LEVEL=2 -Wa,--noexecstack -g -O2 -ffile-prefix-map=/build/openssl-928mA1/openssl-3.0.13=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_BUILDING_OPENSSL -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
CPUINFO: OPENSSL_armcap=0x81
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
ChaCha20-Poly1305    80431.82k   138180.46k   250954.07k   310356.31k   322661.03k   321667.07k
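
The speedup per block size can be derived from the two summary rows above. The following is a small sketch that divides the ChaCha20-Poly1305 throughput by the AES-128-GCM throughput column by column; the function name `speedup` is hypothetical, and the two rows are copied verbatim from the benchmark output in this issue.

```shell
#!/usr/bin/env bash

# Summary rows copied from the `openssl speed` output above.
aes='AES-128-GCM      44799.52k    53980.46k    65025.19k    68594.01k    70636.89k    70877.18k'
chacha='ChaCha20-Poly1305    80431.82k   138180.46k   250954.07k   310356.31k   322661.03k   321667.07k'

# speedup BASELINE_ROW CANDIDATE_ROW: print candidate/baseline for each column.
# awk reads the leading number of fields like "44799.52k", so the "k" suffix is ignored.
speedup() {
	printf '%s\n%s\n' "$1" "$2" | LC_ALL=C awk '
		NR == 1 { for (i = 2; i <= NF; i++) base[i] = $i + 0 }
		NR == 2 {
			for (i = 2; i <= NF; i++)
				printf "%s%.2f", (i > 2 ? " " : ""), ($i + 0) / base[i]
			print ""
		}'
}

# One ratio per block size: 16, 64, 256, 1024, 8192, 16384 bytes.
speedup "$aes" "$chacha"
```

The ratios grow with block size and level off at roughly 4.5× from 1024 bytes upward, which is the range most relevant for typical VPN packet sizes.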
