Benchmarks (#84)
Add benchmarks for the TLS handshake and transport.
franziskuskiefer authored Jan 10, 2024
1 parent 45ac992 commit 32ff418
Showing 31 changed files with 7,035 additions and 6,137 deletions.
24 changes: 14 additions & 10 deletions .github/workflows/ci.yml
@@ -28,16 +28,6 @@ jobs:
          repository: google/boringssl
          path: boringssl

      - name: Setup Ubuntu
        if: matrix.os == 'ubuntu-latest'
        run: |
          sudo apt-get -y update
          sudo apt-get -y install ninja-build
      - name: Setup macOS
        if: matrix.os == 'macos-latest'
        run: brew install ninja

      - name: Build code
        if: matrix.os != 'windows-latest'
        run: cargo build --workspace
@@ -82,6 +72,20 @@ jobs:
      - name: Check for common mistakes and missed improvements
        run: cargo clippy -- -D warnings

  benchmark:
    needs: test
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    runs-on: ${{ matrix.os }}

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Benchmark
        run: cargo bench

# interop:
# needs: test
# runs-on: ubuntu-latest
134 changes: 134 additions & 0 deletions Benchmarks.md
@@ -0,0 +1,134 @@
# Bertie

Raw benchmark numbers for Bertie, with instructions for reproducing them.

```bash
cargo bench
```

## M1 Pro

```
Client
- TLS_Chacha20Poly1305_SHA256 w/ EcdsaSecp256r1Sha256 | Secp256r1:
Handshake: 1124 μs | 889.5279890515473 /s
Application: 57 μs | 271.88892941363173 MB/s
- TLS_Chacha20Poly1305_SHA256 w/ EcdsaSecp256r1Sha256 | X25519:
Handshake: 662 μs | 1508.360146446083 /s
Application: 52 μs | 295.2477303244078 MB/s
- TLS_Chacha20Poly1305_SHA256 w/ RsaPssRsaSha256 | Secp256r1:
Handshake: 3325 μs | 300.72603444463067 /s
Application: 55 μs | 282.3881598304334 MB/s
- TLS_Chacha20Poly1305_SHA256 w/ RsaPssRsaSha256 | X25519:
Handshake: 2858 μs | 349.87798320226005 /s
Application: 55 μs | 280.6845217351315 MB/s
Server
- TLS_Chacha20Poly1305_SHA256 w/ EcdsaSecp256r1Sha256 | Secp256r1:
Handshake: 848 μs | 1178.1233150627613 /s
Application: 50 μs | 309.3497063306116 MB/s
- TLS_Chacha20Poly1305_SHA256 w/ EcdsaSecp256r1Sha256 | X25519:
Handshake: 373 μs | 2676.785056943784 /s
Application: 50 μs | 311.51325062731297 MB/s
- TLS_Chacha20Poly1305_SHA256 w/ RsaPssRsaSha256 | Secp256r1:
Handshake: 62523 μs | 15.994086881158827 /s
Application: 53 μs | 291.4021356932462 MB/s
- TLS_Chacha20Poly1305_SHA256 w/ RsaPssRsaSha256 | X25519:
Handshake: 62086 μs | 16.106602504765714 /s
Application: 52 μs | 295.57390142038156 MB/s
```
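The two figures on each line are related by simple arithmetic: the rate is the number of iterations divided by the total time spent on them. The following is a minimal sketch of that conversion, assuming the 500 iterations and the 0x4000-byte payload used by `benches/client.rs`; the function names `handshakes_per_second` and `mib_per_second` are made up for illustration. Note that the throughput unit is 2^20 bytes per second.

```rust
use std::time::Duration;

/// Number of benchmark iterations (value used by `benches/client.rs`).
const ITERATIONS: u32 = 500;
/// Payload size per iteration in bytes: 0x4000 = 16 KiB.
const NUM_PAYLOAD_BYTES: usize = 0x4000;

/// Handshakes per second, given the total time spent on all iterations.
fn handshakes_per_second(total: Duration) -> f64 {
    f64::from(ITERATIONS) / total.as_secs_f64()
}

/// Throughput in MiB/s (2^20 bytes), given the total time spent on all iterations.
fn mib_per_second(total: Duration) -> f64 {
    let mib_per_iteration = NUM_PAYLOAD_BYTES as f64 / (1024.0 * 1024.0);
    mib_per_iteration * f64::from(ITERATIONS) / total.as_secs_f64()
}

fn main() {
    // Illustrative totals: the printed per-iteration means (1124 µs and 57 µs)
    // multiplied back up to 500 iterations.
    let handshake_total = Duration::from_micros(1124) * ITERATIONS;
    let application_total = Duration::from_micros(57) * ITERATIONS;

    println!("{:.2} handshakes/s", handshakes_per_second(handshake_total));
    println!("{:.2} MiB/s", mib_per_second(application_total));
}
```

The rates computed this way differ slightly from the ones listed above, because the printed mean is truncated to whole microseconds while the printed rate is derived from the untruncated total.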

### Analysis

The following profiles show that performance is dominated by the cryptographic primitives.
The protocol code in Bertie has no measurable impact on performance.

#### TLS_Chacha20Poly1305_SHA256 w/ EcdsaSecp256r1Sha256 | Secp256r1

| Weight | Self weight | Symbol name |
| ------ | ----------- | -------------------------------------- |
| 20.8% | 4.81 Gc | FStar_UInt64_gte_mask |
| 14.1% | 3.27 Gc | FStar_UInt64_eq_mask |
| 14.1% | 3.25 Gc | bn_mul4 |
| 12.5% | 2.88 Gc | mont_reduction |
| 10.1% | 2.33 Gc | bn_add_mod4 |
| 5.4% | 1.27 Gc | sha256_update |
| 4.0% | 934.49 Mc | fsub0 |
| 3.3% | 768.91 Mc | chacha20_encrypt_block |
| 3.2% | 754.40 Mc | Hacl_Bignum_Addition_bn_add_eq_len_u64 |
| 1.5% | 361.09 Mc | poly1305_padded_32 |
| 0.8% | 203.45 Mc | bn_sqr4 |

#### TLS_Chacha20Poly1305_SHA256 w/ RsaPssRsaSha256 | X25519

| Weight | Self weight | Symbol name |
| ------ | ----------- | --------------------------------------------------------- |
| 38.6% | 27.83 Gc | Hacl_Bignum_AlmostMontgomery_bn_almost_mont_reduction_u64 |
| 15.3% | 11.03 Gc | FStar_UInt64_gte_mask |
| 13.2% | 9.50 Gc | FStar_UInt64_eq_mask |
| 9.7% | 7.04 Gc | Hacl_Bignum_Addition_bn_add_eq_len_u64 |
| 8.4% | 6.05 Gc | Hacl_Bignum_Multiplication_bn_sqr_u64 |
| 4.7% | 3.40 Gc | Hacl_Bignum_Addition_bn_sub_eq_len_u64 |
| 3.5% | 2.52 Gc | Hacl_Bignum_Karatsuba_bn_karatsuba_mul_uint64 |
| 2.7% | 1.97 Gc | Hacl_Bignum_bn_add_mod_n_u64 |
| 0.9% | 702.17 Mc | Hacl_Bignum_Karatsuba_bn_karatsuba_sqr_uint64 |
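
One way to check the claim is to add up the weight column of each profile: every symbol listed is a bignum, hash, or AEAD routine rather than protocol code, so the sum is a lower bound on the share spent in cryptographic primitives. The sketch below only repeats that arithmetic; `total_weight` is a hypothetical helper name.

```rust
/// Sum of the "Weight" column of a profile table, in percent.
fn total_weight(weights: &[f64]) -> f64 {
    weights.iter().sum()
}

fn main() {
    // Weight column of the EcdsaSecp256r1Sha256 | Secp256r1 profile.
    let ecdsa_p256 = [20.8, 14.1, 14.1, 12.5, 10.1, 5.4, 4.0, 3.3, 3.2, 1.5, 0.8];
    // Weight column of the RsaPssRsaSha256 | X25519 profile.
    let rsa_x25519 = [38.6, 15.3, 13.2, 9.7, 8.4, 4.7, 3.5, 2.7, 0.9];

    println!("ECDSA / Secp256r1: {:.1}% of the profile", total_weight(&ecdsa_p256));
    println!("RSA-PSS / X25519:  {:.1}% of the profile", total_weight(&rsa_x25519));
}
```

The listed symbols account for roughly 90% of the first profile and 97% of the second.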

## Comparison

We compare with [Rustls](https://github.com/rustls/rustls) as it is the most popular
TLS implementation in Rust and claims to be [almost as fast as OpenSSL](https://www.memorysafety.org/blog/rustls-performance/).

- [ ] Note that SIMD is currently disabled on Arm in libcrux.

### M1 Pro

#### Client

| | Bertie hs/s | Rustls hs/s |
| ---------------------------------------- | ----------- | ----------- |
| P-256 ECDSA TLS_Chacha20Poly1305_SHA256 | 889.52 | 3856.48 |
| X25519 ECDSA TLS_Chacha20Poly1305_SHA256 | 1508.36 | 4064.29 |
| P-256 RSA TLS_Chacha20Poly1305_SHA256 | 300.72 | 4059.82 |
| X25519 RSA TLS_Chacha20Poly1305_SHA256 | 349.87 | 4197.59 |

#### Server

| | Bertie hs/s | Rustls hs/s |
| ---------------------------------------- | ----------- | ----------- |
| P-256 ECDSA TLS_Chacha20Poly1305_SHA256 | 1178.12 | 7941.33 |
| X25519 ECDSA TLS_Chacha20Poly1305_SHA256 | 2676.78 | 8662.90 |
| P-256 RSA TLS_Chacha20Poly1305_SHA256 | 15.99 | 1260.10 |
| X25519 RSA TLS_Chacha20Poly1305_SHA256 | 16.10 | 1261.51 |

#### Send (client)

| | Bertie MB/s | Rustls MB/s |
| --------------------------- | ----------- | ----------- |
| TLS_Chacha20Poly1305_SHA256 | 271.88 | 1075.67 |
| TLS_AESGCM128_SHA256 | | |

#### Receive (server)

| | Bertie MB/s | Rustls MB/s |
| --------------------------- | ----------- | ----------- |
| TLS_Chacha20Poly1305_SHA256 | 309.34 | 1011.69 |
| TLS_AESGCM128_SHA256 | | |
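
The tables are easier to read as ratios. The sketch below divides the Rustls figure by the Bertie figure for a few rows copied from the M1 Pro tables; `speedup` is a hypothetical helper, and the ratios are only as comparable as the underlying rows.

```rust
/// How many times faster the Rustls figure is than the Bertie figure.
fn speedup(bertie: f64, rustls: f64) -> f64 {
    rustls / bertie
}

fn main() {
    // (label, Bertie, Rustls) rows copied from the M1 Pro tables.
    let rows = [
        ("client hs/s, P-256 ECDSA", 889.52, 3856.48),
        ("client hs/s, X25519 RSA", 349.87, 4197.59),
        ("server hs/s, X25519 ECDSA", 2676.78, 8662.90),
        ("server hs/s, P-256 RSA", 15.99, 1260.10),
        ("send MB/s, ChaCha20-Poly1305", 271.88, 1075.67),
        ("receive MB/s, ChaCha20-Poly1305", 309.34, 1011.69),
    ];

    for (label, bertie, rustls) in rows {
        println!("{}: {:.1}x", label, speedup(bertie, rustls));
    }
}
```

On these numbers the gap is smallest for ECDSA handshakes and record throughput (roughly 3x to 4x) and largest for RSA handshakes on the server (close to 79x).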

# Rustls

Raw benchmark numbers for Rustls, with instructions for reproducing them.

Clone Rustls and run its benchmarks:

```bash
git clone git@github.com:rustls/rustls.git
cd rustls
cargo run -p rustls --release --example bench
```

Note that by default, ChaCha20-Poly1305 is only benchmarked with RSA.
The ECDSA variant can be added in `bench_impl.rs`.

Further, it does not seem possible to select the key exchange algorithm.
The crypto providers in Rustls define the preference order `x25519, P256, P384`, so
only `x25519` is used by default.
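
To illustrate why a fixed preference order always ends up on `x25519`, here is a simplified model of preference-ordered selection. It is not Rustls's API: the `Group` enum and `select_group` are hypothetical, and real TLS 1.3 negotiation involves key shares and possible retries that this sketch ignores.

```rust
/// Key exchange groups in this simplified model (names are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Group {
    X25519,
    P256,
    P384,
}

/// Returns the first group in `preferred` that the peer also supports.
fn select_group(preferred: &[Group], peer: &[Group]) -> Option<Group> {
    preferred.iter().copied().find(|g| peer.contains(g))
}

fn main() {
    let default_order = [Group::X25519, Group::P256, Group::P384];

    // Both sides use the default order: the first entry always wins.
    println!("{:?}", select_group(&default_order, &default_order));

    // In this model, P-256 is only reached if the preference list is restricted.
    println!("{:?}", select_group(&[Group::P256], &default_order));
}
```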
10 changes: 9 additions & 1 deletion Cargo.toml
@@ -33,6 +33,14 @@ serde = { version = "1.0", features = ["derive"] }
rayon = "1.3.0"
criterion = "0.5"

[[bench]]
name = "client"
harness = false

[[bench]]
name = "server"
harness = false

[workspace]
members = [
".",
@@ -52,5 +60,5 @@ default-members = [
"integration_tests",
]

# [patch.'https://github.com/cryspen/libcrux']
# [patch.crates-io]
# libcrux = { path = "../libcrux" }
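
Setting `harness = false` on a `[[bench]]` target tells Cargo not to use the built-in libtest harness, so each file under `benches/` has to provide its own `main`. A minimal sketch of such a hand-rolled benchmark follows; `work` is a hypothetical stand-in for the measured operation and `time_iterations` is an illustrative helper, neither is part of this commit.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

const ITERATIONS: u32 = 500;

/// Hypothetical stand-in for the operation being measured.
fn work(n: u64) -> u64 {
    (0..n).fold(0, |acc, x| acc ^ x.wrapping_mul(0x9E37_79B9_7F4A_7C15))
}

/// Runs `f` ITERATIONS times and returns the accumulated wall-clock time.
fn time_iterations<F: FnMut()>(mut f: F) -> Duration {
    let mut total = Duration::ZERO;
    for _ in 0..ITERATIONS {
        let start = Instant::now();
        f();
        total += start.elapsed();
    }
    total
}

fn main() {
    // black_box keeps the compiler from optimizing the measured call away.
    let total = time_iterations(|| {
        black_box(work(black_box(10_000)));
    });
    println!("mean: {} µs", total.as_micros() / u128::from(ITERATIONS));
}
```

With a file like this in `benches/`, `cargo bench` simply builds it in release mode and runs its `main`.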
123 changes: 123 additions & 0 deletions benches/client.rs
@@ -0,0 +1,123 @@
use std::time::{Duration, Instant};

use bertie::{
    stream::init_db,
    tls13crypto::{
        Algorithms,
        // SHA256_Aes128Gcm_EcdsaSecp256r1Sha256_P256,
        // SHA256_Aes128Gcm_EcdsaSecp256r1Sha256_X25519,
        // SHA256_Aes128Gcm_RsaPssRsaSha256_P256,
        // SHA256_Aes128Gcm_RsaPssRsaSha256_X25519,
        // SHA384_Aes256Gcm_EcdsaSecp256r1Sha256_P256,
        // SHA384_Aes256Gcm_EcdsaSecp256r1Sha256_X25519,
        // SHA384_Aes256Gcm_RsaPssRsaSha256_P256,
        // SHA384_Aes256Gcm_RsaPssRsaSha256_X25519,
        SHA256_Chacha20Poly1305_EcdsaSecp256r1Sha256_P256,
        SHA256_Chacha20Poly1305_EcdsaSecp256r1Sha256_X25519,
        SHA256_Chacha20Poly1305_RsaPssRsaSha256_P256,
        SHA256_Chacha20Poly1305_RsaPssRsaSha256_X25519,
        SignatureScheme,
    },
    tls13utils::Bytes,
    Client, Server,
};
use libcrux::{digest, drbg::Drbg};

fn hs_per_second(d: Duration) -> f64 {
    // ITERATIONS per d
    let d = d.as_nanos() as f64 / 1_000_000_000.0;
    ITERATIONS as f64 / d
}

fn mb_per_second(d: Duration) -> f64 {
    // ITERATIONS per d
    // NUM_PAYLOAD_BYTES / 1024 / 1024 per iteration
    let d = d.as_nanos() as f64 / 1_000_000_000.0;
    let iteration = d / ITERATIONS as f64;
    (NUM_PAYLOAD_BYTES as f64 / 1024.0 / 1024.0) / iteration
}

const ITERATIONS: usize = 500;
const NUM_PAYLOAD_BYTES: usize = 0x4000;
const CIPHERSUITES: [Algorithms; 4] = [
    // SHA256_Aes128Gcm_EcdsaSecp256r1Sha256_P256,
    // SHA256_Aes128Gcm_EcdsaSecp256r1Sha256_X25519,
    // SHA256_Aes128Gcm_RsaPssRsaSha256_P256,
    // SHA256_Aes128Gcm_RsaPssRsaSha256_X25519,
    SHA256_Chacha20Poly1305_EcdsaSecp256r1Sha256_P256,
    SHA256_Chacha20Poly1305_EcdsaSecp256r1Sha256_X25519,
    SHA256_Chacha20Poly1305_RsaPssRsaSha256_P256,
    SHA256_Chacha20Poly1305_RsaPssRsaSha256_X25519,
    // SHA384_Aes256Gcm_EcdsaSecp256r1Sha256_P256,
    // SHA384_Aes256Gcm_EcdsaSecp256r1Sha256_X25519,
    // SHA384_Aes256Gcm_RsaPssRsaSha256_P256,
    // SHA384_Aes256Gcm_RsaPssRsaSha256_X25519,
];

fn main() {
    println!("Client");

    for ciphersuite in CIPHERSUITES {
        let mut rng = Drbg::new(digest::Algorithm::Sha256).unwrap();

        // Server
        let server_name_str = "localhost";
        let server_name: Bytes = server_name_str.as_bytes().into();

        let (cert_file, key_file) = match ciphersuite.signature() {
            SignatureScheme::EcdsaSecp256r1Sha256 => {
                ("tests/assets/p256_cert.der", "tests/assets/p256_key.der")
            }
            SignatureScheme::RsaPssRsaSha256 => {
                ("tests/assets/rsa_cert.der", "tests/assets/rsa_key.der")
            }
            _ => unreachable!("Unknown ciphersuite {:?}", ciphersuite),
        };
        let db = init_db(server_name_str, key_file, cert_file).unwrap();

        let mut handshake_time = Duration::ZERO;
        let mut application_time = Duration::ZERO;
        let payload = rng.generate_vec(NUM_PAYLOAD_BYTES).unwrap();

        for _ in 0..ITERATIONS {
            let start_time = Instant::now();
            let (client_hello, client) =
                Client::connect(ciphersuite, &server_name, None, None, &mut rng).unwrap();
            let end_time = Instant::now();
            handshake_time += end_time.duration_since(start_time);

            let (server_hello, server_finished, server) =
                Server::accept(ciphersuite, db.clone(), &client_hello, &mut rng).unwrap();

            let start_time = Instant::now();
            let (_client_msg, client) = client.read_handshake(&Bytes::from(server_hello)).unwrap();
            let (client_msg, client) = client
                .read_handshake(&Bytes::from(server_finished))
                .unwrap();
            let end_time = Instant::now();
            handshake_time += end_time.duration_since(start_time);

            let server = server.read_handshake(&client_msg.unwrap()).unwrap();

            let application_data = payload.clone().into();

            let start_time = Instant::now();
            let (c_msg_bytes, _client) = client.write(application_data).unwrap();
            let end_time = Instant::now();
            application_time += end_time.duration_since(start_time);

            let (msg, _server) = server.read(&c_msg_bytes).unwrap();

            assert_eq!(msg.unwrap().as_raw().declassify(), payload);
        }

        println!(
            " - {}:\n\tHandshake: {} μs | {} /s \n\tApplication: {} μs | {} MB/s",
            ciphersuite,
            handshake_time.as_micros() / (ITERATIONS as u128),
            hs_per_second(handshake_time),
            application_time.as_micros() / (ITERATIONS as u128),
            mb_per_second(application_time),
        );
    }
}