Add a script to find regression test threshold for harnesses #4724

Open: wants to merge 5 commits into base: main
13 changes: 13 additions & 0 deletions tests/regression/README.md
@@ -138,3 +138,16 @@ Performs an RSA handshake in s2n-tls and validates the handshake process utilizi
### test_session_resumption

Performs an RSA handshake with server authentication. Then, performs a resumption handshake using the session ticket obtained from the previous handshake.

## Contributing Test Harnesses

To contribute a test harness, define a test name, wrap the test code in the `valgrind_test()` function, and find the threshold for the test. To find the threshold, follow this workflow:

1. Navigate to the find_threshold directory: `cd find_threshold`
2. Set the `COMMIT_ID` environment variable to your current commit ID. This lets the script find the profile generated when `cargo test` is run.
`export COMMIT_ID=<commit_id>`
3. Run `cargo run "your_test_name"`
4. The script prints the percentage variance (the threshold) for your test and writes a .csv file with the instruction count from each run to the parent directory.

### Threshold Setting Philosophy
The find_threshold script runs the test 100 times and records the instruction count from each run. Because a test's instruction count can vary between runs due to non-determinism, the script measures how large that variation is. The ratio `range/minimum_value`, where `range` is the maximum count minus the minimum count, expressed as a percentage, gives an empirical upper bound on the difference that can be attributed to non-determinism. When a change produces a difference that exceeds this threshold, we can be confident it is a regression and not the result of non-determinism.
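
As a rough illustration of the calculation described above, the following sketch computes the percentage variance from a set of instruction counts and then checks a new count against it. The helper names `percentage_variance` and `is_regression` and the sample counts are hypothetical and are not part of this change; the actual check lives in `valgrind_test()`.

```rust
/// Returns the spread of `counts` as a percentage of the smallest value,
/// or `None` if the slice is empty or the minimum is not positive.
/// (Hypothetical helper, for illustration only.)
fn percentage_variance(counts: &[i64]) -> Option<f64> {
    let min = *counts.iter().min()?;
    let max = *counts.iter().max()?;
    if min <= 0 {
        return None;
    }
    Some((max - min) as f64 / min as f64 * 100.0)
}

/// Returns true when `new` exceeds `prev` by more than `threshold` percent.
/// (Hypothetical helper, for illustration only.)
fn is_regression(prev: i64, new: i64, threshold: f64) -> bool {
    let diff_percentage = (new - prev) as f64 / prev as f64 * 100.0;
    diff_percentage > threshold
}

fn main() {
    // Made-up instruction counts from repeated runs of one test.
    let counts = [1_000_000, 1_000_020, 1_000_005];
    let threshold = percentage_variance(&counts).expect("need at least one positive count");
    println!("Threshold: {threshold:.6}%");

    // A later run that adds 500 instructions is far outside the observed noise.
    println!("Regression: {}", is_regression(1_000_000, 1_000_500, threshold));
}
```

Because the threshold is expressed in percent, it can be compared directly with a percentage difference computed as `diff / previous_count * 100`.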
6 changes: 6 additions & 0 deletions tests/regression/find_threshold/Cargo.toml
@@ -0,0 +1,6 @@
[package]
name = "find_threshold"
version = "0.1.0"
edition = "2021"

[dependencies]
94 changes: 94 additions & 0 deletions tests/regression/find_threshold/src/main.rs
@@ -0,0 +1,94 @@
// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0

use std::env;
use std::fs::File;
use std::io::{self, BufRead, BufReader, Write};
use std::process::{Command, Stdio};

fn find_instruction_count(output: &str) -> Result<i64, io::Error> {
    let reader = BufReader::new(output.as_bytes());
    for line in reader.lines() {
        let line = line?;
        // The total is the first whitespace-separated field on the "PROGRAM TOTALS" line
        if line.contains("PROGRAM TOTALS") {
            if let Some(instructions) = line.split_whitespace().next() {
                return instructions
                    .replace(',', "")
                    .parse::<i64>()
                    .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e));
            }
        }
    }
    // Return an error instead of panicking so the caller can handle a missing total
    Err(io::Error::new(
        io::ErrorKind::InvalidData,
        "Failed to find instruction count in annotated file",
    ))
}

fn main() -> Result<(), io::Error> {
    // Get the test name from the command line arguments
    let args: Vec<String> = env::args().collect();
    if args.len() != 2 {
        eprintln!("Usage: cargo run <test_name>");
        std::process::exit(1);
    }
    let test_name = &args[1];

    // Get the commit ID from the environment
    let commit_id = env::var("COMMIT_ID").expect("COMMIT_ID environment variable not set");

    // Define the path to the annotated file
    let file_path = format!("target/regression_artifacts/{commit_id}/{test_name}.annotated");

    // Change the working directory to the parent directory
    let current_dir = env::current_dir()?;
    let parent_dir = current_dir
        .parent()
        .expect("Failed to find parent directory");
    env::set_current_dir(parent_dir)?;

    let output_file_path = format!("{test_name}_instruction_counts.csv");
    let mut output_file = File::create(output_file_path)?;
    writeln!(output_file, "Run,Instruction Count")?;

    let mut instruction_counts = Vec::new();

    for run_number in 1..=100 {
        // Run the tests with PERF_MODE set to valgrind
        Command::new("cargo")
            .arg("test")
            .env("PERF_MODE", "valgrind")
            .stderr(Stdio::null())
            .output()
            .expect("Failed to run cargo test");

        // Read the file contents
        let file_content = std::fs::read_to_string(&file_path)?;
        // Find the instruction count
        match find_instruction_count(&file_content) {
            Ok(instruction_count) => {
                instruction_counts.push(instruction_count);
                writeln!(output_file, "{run_number},{instruction_count}")?;
                println!(
                    "Run {run_number}: Instruction Count = {instruction_count}"
                );
            }
            Err(e) => {
                // Skip failed runs so a placeholder value cannot distort the minimum and range
                eprintln!("Failed to find instruction count in {file_path}: {e}");
            }
        }
    }

    // Calculate the range, minimum, and percentage variance
    if let (Some(&min), Some(&max)) = (
        instruction_counts.iter().min(),
        instruction_counts.iter().max(),
    ) {
        let range = max - min;
        let percentage_variance = (range as f64 / min as f64) * 100.0;
        println!("Instruction Count Range: {range}");
        println!("Percentage Variance: {:.6}%", percentage_variance);
    } else {
        eprintln!("Could not calculate range and percentage variance due to insufficient data.");
    }

    Ok(())
}
8 changes: 4 additions & 4 deletions tests/regression/src/lib.rs
@@ -301,7 +301,7 @@ mod tests {
let diff = find_instruction_count(&diff_content)
.expect("Failed to parse cg_annotate --diff output");
// percentage difference is the overall difference divided by the previous instruction count
- let diff_percentage = diff as f64 / self.prev_profile_count as f64;
+ let diff_percentage = (diff as f64 / self.prev_profile_count as f64) * 100.0;
assert!(
diff_percentage <= max_diff,
"Instruction count difference exceeds the threshold, regression of {diff_percentage}% ({diff} instructions).
@@ -364,7 +364,7 @@ mod tests {
/// Test to create new config, set security policy, host_callback information, load/trust certs, and build config.
#[test]
fn test_set_config() {
- valgrind_test("test_set_config", 0.01, |ctrl| {
+ valgrind_test("test_set_config", 0.237, |ctrl| {
ctrl.stop_instrumentation();
ctrl.start_instrumentation();
let keypair_rsa = CertKeyPair::default();
@@ -378,7 +378,7 @@ mod tests {
/// Test which creates a TestPair from config using `rsa_4096_sha512`. Only measures a pair handshake.
#[test]
fn test_rsa_handshake() {
- valgrind_test("test_rsa_handshake", 0.01, |ctrl| {
+ valgrind_test("test_rsa_handshake", 0.0039, |ctrl| {
ctrl.stop_instrumentation();
let keypair_rsa = CertKeyPair::default();
let config = set_config(&security::DEFAULT_TLS13, keypair_rsa)?;
@@ -396,7 +396,7 @@ mod tests {
fn test_session_resumption() {
const KEY_NAME: &str = "InsecureTestKey";
const KEY_VALUE: [u8; 16] = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3];
- valgrind_test("test_session_resumption", 0.01, |ctrl| {
+ valgrind_test("test_session_resumption", 0.0043, |ctrl| {
ctrl.stop_instrumentation();
let keypair_rsa = CertKeyPair::default();
