Generate Kani Metrics (#235)
Add a new "Kani Metrics" workflow (that runs every week) that calls
`./scripts/run-kani.sh --run metrics`, then creates a pull request to this
repository with the computed metrics. See
[here](carolynzech#38) for an
example of what the pull request will look like.

Callouts:
- This is a separate workflow from the Kani workflow because it is a
cronjob instead of running on every pull request. I thought the plots
would be too noisy if we ran on every PR.
- See the "Notes" section of `kani_std_analysis.py` for other callouts.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.
carolynzech authored Jan 14, 2025
1 parent 2b2baa8 commit b4fa4fd
Showing 6 changed files with 384 additions and 4 deletions.
42 changes: 42 additions & 0 deletions .github/workflows/kani-metrics.yml
@@ -0,0 +1,42 @@
name: Kani Metrics Update

on:
  schedule:
    - cron: '0 0 * * 0' # Run at 00:00 UTC every Sunday
  workflow_dispatch:

defaults:
  run:
    shell: bash

jobs:
  update-kani-metrics:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          submodules: true

      # The Kani metrics collection uses a Python script (kani_std_analysis.py), so make sure Python is installed
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'

      - name: Compute Kani Metrics
        run: ./scripts/run-kani.sh --run metrics --path ${{github.workspace}}

      - name: Create Pull Request
        uses: peter-evans/create-pull-request@v7
        with:
          commit-message: Update Kani metrics
          title: 'Update Kani Metrics'
          body: |
            This is an automated PR to update Kani metrics.
            The metrics have been updated by running `./scripts/run-kani.sh --run metrics`.
          branch: update-kani-metrics
          delete-branch: true
          base: main
1 change: 0 additions & 1 deletion .gitignore
@@ -46,7 +46,6 @@ package-lock.json
## Kani
*.out


# Added by cargo
#
# already existing elements were commented out
307 changes: 307 additions & 0 deletions scripts/kani-std-analysis/kani_std_analysis.py
@@ -0,0 +1,307 @@
#!/usr/bin/env python3

import argparse
from collections import defaultdict
import csv
import json
import sys
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from matplotlib.ticker import FixedLocator

# Output metrics about Kani's application to the standard library by:
# 1. Postprocessing results from running `kani list`, which collects data about Kani harnesses and contracts.
# 2. Postprocessing results from std-analysis.sh, which outputs metrics about the standard library unrelated to Kani (e.g., a list of the functions in each crate).
# 3. Comparing the output of steps #1 and #2 to compare the Kani-verified portion of the standard library to the overall library,
# e.g., what fraction of unsafe functions have Kani contracts.

# Notes:
# - This script assumes that std-analysis.sh and `kani list` have already run, since we expect to invoke this script from `run-kani.sh`.
# - The results are architecture-dependent: the standard library has functions that are only present for certain architectures,
# and https://github.com/model-checking/verify-rust-std/pull/122 has Kani harnesses that only run on x86-64.
# - The total functions under contract are not necessarily equal to the sum of unsafe and safe functions under contract.
# We've added a few functions (three, as of writing) with contracts to this fork, e.g. ffi::c_str::is_null_terminated.
# Since std-analysis.sh runs on the standard library given a toolchain (not this fork), it doesn't know about this function and therefore can't classify it as safe or unsafe.
# But `kani list` runs on this fork, so it can still see it and add it to the total functions under contract.
# - See #TODOs for known limitations.

# Process the results from Kani's std-analysis.sh script for each crate.
# TODO For now, we just handle "core", but we should process all crates in the library.
class GenericSTDMetrics():
    def __init__(self, results_dir):
        self.results_directory = results_dir
        self.unsafe_fns_count = 0
        self.safe_abstractions_count = 0
        self.safe_fns_count = 0
        self.unsafe_fns = []
        self.safe_abstractions = []
        self.safe_fns = []

        self.read_std_analysis()

    # Read {crate}_overall_counts.csv
    # and return the number of unsafe functions and safe abstractions
    def read_overall_counts(self):
        file_path = f"{self.results_directory}/core_scan_overall.csv"
        with open(file_path, 'r') as f:
            csv_reader = csv.reader(f, delimiter=';')
            counts = {row[0]: int(row[1]) for row in csv_reader if len(row) >= 2}
        self.unsafe_fns_count = counts.get('unsafe_fns', 0)
        self.safe_abstractions_count = counts.get('safe_abstractions', 0)
        self.safe_fns_count = counts.get('safe_fns', 0)

    # Read {crate}_scan_functions.csv
    # and return an array of the unsafe functions and the safe abstractions
    def read_scan_functions(self):
        expected_header_start = "name;is_unsafe;has_unsafe_ops"
        file_path = f"{self.results_directory}/core_scan_functions.csv"

        with open(file_path, 'r') as f:
            csv_reader = csv.reader(f, delimiter=';', quotechar='"')

            # The row parsing logic below assumes the column structure in expected_header_start,
            # so assert that is how the header begins before continuing
            header = next(csv_reader)
            header_str = ';'.join(header[:3])
            if not header_str.startswith(expected_header_start):
                print(f"Error: Unexpected CSV header in {file_path}")
                print(f"Expected header to start with: {expected_header_start}")
                print(f"Actual header: {header_str}")
                sys.exit(1)

            for row in csv_reader:
                if len(row) >= 3:
                    name, is_unsafe, has_unsafe_ops = row[0], row[1], row[2]
                    # An unsafe function is a function for which is_unsafe=true
                    if is_unsafe.strip() == "true":
                        self.unsafe_fns.append(name)
                    else:
                        assert is_unsafe.strip() == "false" # sanity check against malformed data
                        self.safe_fns.append(name)
                        # A safe abstraction is a safe function with unsafe ops
                        if has_unsafe_ops.strip() == "true":
                            self.safe_abstractions.append(name)

    def read_std_analysis(self):
        self.read_overall_counts()
        self.read_scan_functions()

        # Sanity checks
        if len(self.unsafe_fns) != self.unsafe_fns_count:
            print("Number of unsafe functions does not match core_scan_functions.csv")
            print(f"UNSAFE_FNS_COUNT: {self.unsafe_fns_count}")
            print(f"UNSAFE_FNS length: {len(self.unsafe_fns)}")
            sys.exit(1)

        if len(self.safe_abstractions) != self.safe_abstractions_count:
            print("Number of safe abstractions does not match core_scan_functions.csv")
            print(f"SAFE_ABSTRACTIONS_COUNT: {self.safe_abstractions_count}")
            print(f"SAFE_ABSTRACTIONS length: {len(self.safe_abstractions)}")
            sys.exit(1)

        if len(self.safe_fns) != self.safe_fns_count:
            print("Number of safe functions does not match core_scan_functions.csv")
            print(f"SAFE_FNS_COUNT: {self.safe_fns_count}")
            print(f"SAFE_FNS length: {len(self.safe_fns)}")
            sys.exit(1)

# Process the results of running `kani list` against the standard library,
# i.e., the Kani STD metrics for a single date (whichever day this script is running).
class KaniListSTDMetrics():
    def __init__(self, kani_list_filepath):
        self.total_fns_under_contract = 0
        # List of (fn, has_harnesses) tuples, where fn is a function under contract and has_harnesses=true iff the contract has at least one harness
        self.fns_under_contract = []
        self.expected_kani_list_version = "0.1"

        self.read_kani_list_data(kani_list_filepath)

    def read_kani_list_data(self, kani_list_filepath):
        try:
            with open(kani_list_filepath, 'r') as f:
                kani_list_data = json.load(f)
        except FileNotFoundError:
            print("Kani list JSON file not found.")
            sys.exit(1)

        if kani_list_data.get("file-version") != self.expected_kani_list_version:
            print(f"Kani list JSON file version does not match expected version {self.expected_kani_list_version}")
            sys.exit(1)

        for contract in kani_list_data.get('contracts', []):
            func_under_contract = contract.get('function', '')
            has_harnesses = len(contract.get('harnesses', [])) > 0
            self.fns_under_contract.append((func_under_contract, has_harnesses))

        self.total_fns_under_contract = kani_list_data.get('totals', {}).get('functions-under-contract', 0)

# Generate metrics about Kani's application to the standard library over time
# by reading past metrics from metrics_file, then computing today's metrics.
class KaniSTDMetricsOverTime():
    def __init__(self, metrics_file):
        self.dates = []
        self.unsafe_metrics = ['total_unsafe_fns', 'unsafe_fns_under_contract', 'verified_unsafe_fns_under_contract']
        self.safe_abstr_metrics = ['total_safe_abstractions', 'safe_abstractions_under_contract', 'verified_safe_abstractions_under_contract']
        self.safe_metrics = ['total_safe_fns', 'safe_fns_under_contract', 'verified_safe_fns_under_contract']
        # The keys in these dictionaries are unsafe_metrics, safe_abstr_metrics, and safe_metrics, respectively; see update_plot_metrics()
        self.unsafe_plot_data = defaultdict(list)
        self.safe_abstr_plot_data = defaultdict(list)
        self.safe_plot_data = defaultdict(list)

        self.date = datetime.today().date()
        self.metrics_file = metrics_file

        self.read_historical_data()

    # Read historical data from self.metrics_file and initialize the date range.
    def read_historical_data(self):
        try:
            with open(self.metrics_file, 'r') as f:
                all_data = json.load(f)["results"]
                self.update_plot_metrics(all_data)
        except FileNotFoundError:
            all_data = {}

        self.dates = [datetime.strptime(data["date"], '%Y-%m-%d').date() for data in all_data]
        self.dates.append(self.date)

        print(f"Dates are {self.dates}\n")

    # TODO For now, we don't plot how many of the contracts have been verified, since the line overlaps with how many are under contract.
    # If we want to plot this data, we should probably generate separate plots.
    # Update the plot data with the provided data.
    def update_plot_metrics(self, all_data):
        for data in all_data:
            for metric in self.unsafe_metrics:
                if not metric.startswith("verified"):
                    self.unsafe_plot_data[metric].append(data[metric])

            for metric in self.safe_abstr_metrics:
                if not metric.startswith("verified"):
                    self.safe_abstr_plot_data[metric].append(data[metric])

            for metric in self.safe_metrics:
                if not metric.startswith("verified"):
                    self.safe_plot_data[metric].append(data[metric])

    # Read output from `kani list` and std-analysis.sh, then compare their outputs to compute Kani-specific metrics
    # and write the results to {self.metrics_file}
    def compute_metrics(self, kani_list_filepath, analysis_results_dir):
        # Process the `kani list` and `std-analysis.sh` data
        kani_data = KaniListSTDMetrics(kani_list_filepath)
        generic_metrics = GenericSTDMetrics(analysis_results_dir)

        print("Comparing kani-list output to std-analysis.sh output and computing metrics...")

        (unsafe_fns_under_contract, verified_unsafe_fns_under_contract) = (0, 0)
        (safe_abstractions_under_contract, verified_safe_abstractions_under_contract) = (0, 0)
        (safe_fns_under_contract, verified_safe_fns_under_contract) = (0, 0)

        for (func_under_contract, has_harnesses) in kani_data.fns_under_contract:
            if func_under_contract in generic_metrics.unsafe_fns:
                unsafe_fns_under_contract += 1
                if has_harnesses:
                    verified_unsafe_fns_under_contract += 1
            if func_under_contract in generic_metrics.safe_abstractions:
                safe_abstractions_under_contract += 1
                if has_harnesses:
                    verified_safe_abstractions_under_contract += 1
            if func_under_contract in generic_metrics.safe_fns:
                safe_fns_under_contract += 1
                if has_harnesses:
                    verified_safe_fns_under_contract += 1

        # Keep the keys here in sync with unsafe_metrics, safe_metrics, and safe_abstr_metrics
        data = {
            "date": self.date,
            "total_unsafe_fns": generic_metrics.unsafe_fns_count,
            "total_safe_abstractions": generic_metrics.safe_abstractions_count,
            "total_safe_fns": generic_metrics.safe_fns_count,
            "unsafe_fns_under_contract": unsafe_fns_under_contract,
            "verified_unsafe_fns_under_contract": verified_unsafe_fns_under_contract,
            "safe_abstractions_under_contract": safe_abstractions_under_contract,
            "verified_safe_abstractions_under_contract": verified_safe_abstractions_under_contract,
            "safe_fns_under_contract": safe_fns_under_contract,
            "verified_safe_fns_under_contract": verified_safe_fns_under_contract,
            "total_functions_under_contract": kani_data.total_fns_under_contract,
        }

        self.update_plot_metrics([data])

        # Update self.metrics_file so that these results are included in our historical data for next time
        with open(self.metrics_file, 'r') as f:
            content = json.load(f)
            content["results"].append(data)

        with open(self.metrics_file, 'w') as f:
            json.dump(content, f, indent=2, default=str)

        print(f"Wrote metrics data for date {self.date} to {self.metrics_file}")

    # Make a single plot with specified data, title, and filename
    def plot_single(self, data, title, outfile):
        plt.figure(figsize=(14, 8))

        colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']

        for i, (metric, values) in enumerate(data.items()):
            color = colors[i % len(colors)]
            plt.plot(self.dates, values, 'o-', color=color, label=metric, markersize=6)

            # Mark each point on the line with the y value
            for x, y in zip(self.dates, values):
                plt.annotate(str(y),
                             (mdates.date2num(x), y),
                             xytext=(0, 5),
                             textcoords='offset points',
                             ha='center',
                             va='bottom',
                             color=color,
                             fontweight='bold')

        plt.title(title)
        plt.xlabel("Date")
        plt.ylabel("Count")

        # Set x-axis to only show ticks for the dates we have
        plt.gca().xaxis.set_major_locator(FixedLocator(mdates.date2num(self.dates)))
        plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))

        plt.gcf().autofmt_xdate()
        plt.tight_layout()
        plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

        plt.savefig(outfile, bbox_inches='tight', dpi=300)
        plt.close()

        print(f"PNG graph generated: {outfile}")

    def plot(self):
        self.plot_single(self.unsafe_plot_data, title="Contracts on Unsafe Functions in core", outfile="core_unsafe_metrics.png")
        self.plot_single(self.safe_abstr_plot_data, title="Contracts on Safe Abstractions in core", outfile="core_safe_abstractions_metrics.png")
        self.plot_single(self.safe_plot_data, title="Contracts on Safe Functions in core", outfile="core_safe_metrics.png")

def main():
    parser = argparse.ArgumentParser(description="Generate metrics about Kani's application to the standard library.")
    parser.add_argument('--metrics-file',
                        type=str,
                        default="metrics-data.json",
                        help="Path to the JSON file containing metrics data (default: metrics-data.json)")
    parser.add_argument('--kani-list-file',
                        type=str,
                        default="kani-list.json",
                        help="Path to the JSON file containing the Kani list data (default: kani-list.json)")
    # The default is /tmp/std_lib_analysis/results because, as of writing, that's always where std-analysis.sh outputs its results.
    parser.add_argument('--analysis-results-dir',
                        type=str,
                        default="/tmp/std_lib_analysis/results",
                        help="Path to the directory containing the std-analysis.sh results (default: /tmp/std_lib_analysis/results)")
    args = parser.parse_args()

    metrics = KaniSTDMetricsOverTime(args.metrics_file)
    metrics.compute_metrics(args.kani_list_file, args.analysis_results_dir)
    metrics.plot()

if __name__ == "__main__":
    main()
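The classification rule at the heart of `read_scan_functions()` can be sketched in isolation: a row is an unsafe function when `is_unsafe` is true, a safe function otherwise, and a safe function with unsafe operations is additionally a safe abstraction. The sample rows below are hypothetical stand-ins for std-analysis.sh output, not real scan results:

```python
import csv
import io

# Hypothetical rows in the core_scan_functions.csv format (name;is_unsafe;has_unsafe_ops)
sample = """name;is_unsafe;has_unsafe_ops
core::ptr::read;true;false
core::mem::swap;false;true
core::cmp::min;false;false
"""

unsafe_fns, safe_fns, safe_abstractions = [], [], []
reader = csv.reader(io.StringIO(sample), delimiter=';')
next(reader)  # skip the header row

for name, is_unsafe, has_unsafe_ops in reader:
    if is_unsafe == "true":
        unsafe_fns.append(name)
    else:
        safe_fns.append(name)
        # A safe abstraction is a safe function with unsafe ops
        if has_unsafe_ops == "true":
            safe_abstractions.append(name)

print(unsafe_fns)         # ['core::ptr::read']
print(safe_abstractions)  # ['core::mem::swap']
```

Note that safe abstractions are a subset of safe functions, which is why the script's sanity checks compare each list against its own count column rather than expecting the three counts to sum to a total.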
17 changes: 17 additions & 0 deletions scripts/kani-std-analysis/metrics-data.json
@@ -0,0 +1,17 @@
{
  "results": [
    {
      "date": "2025-01-01",
      "total_unsafe_fns": 6987,
      "total_safe_abstractions": 1704,
      "total_safe_fns": 14666,
      "unsafe_fns_under_contract": 144,
      "verified_unsafe_fns_under_contract": 132,
      "safe_abstractions_under_contract": 41,
      "verified_safe_abstractions_under_contract": 41,
      "safe_fns_under_contract": 77,
      "verified_safe_fns_under_contract": 77,
      "total_functions_under_contract": 224
    }
  ]
}
```
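Each weekly run appends one more entry to the `results` array, and `read_historical_data()` in `kani_std_analysis.py` parses the `date` field of every entry when rebuilding the plot's x-axis. A minimal sketch of that parsing, using an inline stand-in for the file contents rather than the file itself:

```python
import json
from datetime import datetime

# Inline stand-in for metrics-data.json, trimmed to the fields this sketch uses
sample = '{"results": [{"date": "2025-01-01", "total_functions_under_contract": 224}]}'
all_data = json.loads(sample)["results"]

# Same date parsing as read_historical_data()
dates = [datetime.strptime(entry["date"], "%Y-%m-%d").date() for entry in all_data]
print(dates[0].isoformat())  # 2025-01-01
```

Because `compute_metrics()` writes dates via `json.dump(..., default=str)`, a `datetime.date` serializes to exactly this `%Y-%m-%d` form, so the round trip is stable.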
1 change: 1 addition & 0 deletions scripts/kani-std-analysis/requirements.txt
@@ -0,0 +1 @@
matplotlib
