
Commit 2e03129

chore: try cachegrind for reproducibility
1 parent d1831d7 commit 2e03129

File tree

3 files changed: +161 −3 lines changed


Diff for: .github/workflows/test.yml

+6-3
@@ -24,8 +24,11 @@ jobs:
         yarn
         yarn test

+      - name: install valgrind
+        run: sudo apt-get install -y valgrind
+
       - name: benchmark
-        run: yarn benchmark | tee output.txt
+        run: python cachegrind.py node test/benchmark2.js > output.txt

       - name: Download previous benchmark result
         uses: actions/cache@v1
@@ -36,10 +39,10 @@ jobs:
       - name: Store benchmark result
         uses: benchmark-action/github-action-benchmark@v1
         with:
-          tool: 'benchmarkjs'
+          tool: 'customSmallerIsBetter'
          output-file-path: output.txt
          external-data-json-path: ./cache/benchmark-data.json
-          alert-threshold: '200%'
+          alert-threshold: '110%'
          fail-on-alert: true
        env:
          CI: true
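Switching the tool from 'benchmarkjs' to 'customSmallerIsBetter' changes the format the benchmark step must write to output.txt: github-action-benchmark's custom tools consume a JSON array of {name, unit, value} measurements, where lower values are better. A minimal sketch of that shape (the score value here is invented for illustration):

```python
import json

# customSmallerIsBetter expects a JSON array of measurement objects;
# the value below is a made-up example score, not a real result.
output = json.dumps([{"name": "score", "unit": "", "value": 1453000}])
print(output)
```

This is exactly the shape that cachegrind.py's github_action_benchmark_json() emits as its last line of output.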

Diff for: cachegrind.py

+137
@@ -0,0 +1,137 @@
"""
Proof-of-concept: run a program under Cachegrind, combining all the various
metrics into one single performance metric.

Requires Python 3.

License: https://opensource.org/licenses/MIT

## Features

* Disables ASLR.
* Sets consistent cache sizes.
* Calculates a combined performance metric.

For more information see the detailed write-up at:

https://pythonspeed.com/articles/consistent-benchmarking-in-ci/

## Usage

This script has no compatibility guarantees; I recommend copying it into your
repository. To use:

$ python3 cachegrind.py ./yourprogram --yourparam=yourvalues

If you're benchmarking Python, make sure to set PYTHONHASHSEED to a fixed value
(e.g. `export PYTHONHASHSEED=1234`). Other languages may have similar
requirements to reduce variability.

The last line printed will be a combined performance metric, but you can tweak
the script to extract more info, or use it as a library.

Copyright © 2020, Hyphenated Enterprises LLC.
"""

import json
import sys
from subprocess import check_call, check_output
from tempfile import NamedTemporaryFile
from typing import Dict, List

ARCH = check_output(["uname", "-m"]).strip().decode("utf-8")


def run_with_cachegrind(args_list: List[str]) -> Dict[str, int]:
    """
    Run the given program and arguments under Cachegrind, and parse the
    Cachegrind output file.

    For now we just ignore program output, and in general this is not robust.
    """
    temp_file = NamedTemporaryFile("r+")
    check_call([
        # Disable ASLR:
        "setarch",
        ARCH,
        "-R",
        "valgrind",
        "--tool=cachegrind",
        # Set some reasonable L1 and LL values, based on Haswell. You can set
        # your own; the important part is that they are consistent across runs,
        # instead of the default of copying from the current machine.
        "--I1=32768,8,64",
        "--D1=32768,8,64",
        "--LL=8388608,16,64",
        "--cachegrind-out-file=" + temp_file.name,
    ] + args_list)
    return parse_cachegrind_output(temp_file)


def parse_cachegrind_output(temp_file):
    # Parse the output file:
    lines = iter(temp_file)
    for line in lines:
        if line.startswith("events: "):
            header = line[len("events: "):].strip()
            break
    for line in lines:
        last_line = line
    assert last_line.startswith("summary: ")
    last_line = last_line[len("summary:"):].strip()
    return dict(zip(header.split(), [int(i) for i in last_line.split()]))


def get_counts(cg_results: Dict[str, int]) -> Dict[str, int]:
    """
    Given the result of run_with_cachegrind(), figure out the parameters we
    will use for the final estimate.

    We pretend there's no L2 since Cachegrind doesn't currently support it.

    Caveats: we're not including time to process instructions, only time to
    access instruction cache(s), so we're assuming time to fetch and run an
    instruction is the same as time to retrieve data if they're both in L1
    cache.
    """
    result = {}
    d = cg_results

    ram_hits = d["DLmr"] + d["DLmw"] + d["ILmr"]

    l3_hits = d["I1mr"] + d["D1mw"] + d["D1mr"] - ram_hits

    total_memory_rw = d["Ir"] + d["Dr"] + d["Dw"]
    l1_hits = total_memory_rw - l3_hits - ram_hits
    assert total_memory_rw == l1_hits + l3_hits + ram_hits

    result["l1"] = l1_hits
    result["l3"] = l3_hits
    result["ram"] = ram_hits

    return result


def combined_instruction_estimate(counts: Dict[str, int]) -> int:
    """
    Given the result of get_counts(), return an estimate of the total time
    to run.

    Multipliers were determined empirically, but some research suggests they're
    a reasonable approximation for cache time ratios. L3 is probably too low,
    but then we're not simulating L2...
    """
    return counts["l1"] + (5 * counts["l3"]) + (35 * counts["ram"])


def github_action_benchmark_json(value):
    return json.dumps([
        {
            "name": "score",
            "unit": "",
            "value": value,
        }
    ])


if __name__ == "__main__":
    print(github_action_benchmark_json(
        combined_instruction_estimate(get_counts(run_with_cachegrind(sys.argv[1:])))))
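To see how the combined metric falls out of the Cachegrind counters, here is the same arithmetic as get_counts() and combined_instruction_estimate(), applied to a hypothetical summary (all counts invented for illustration):

```python
# Hypothetical Cachegrind summary counters, invented for illustration.
sample = {
    "Ir": 1_000_000, "Dr": 300_000, "Dw": 100_000,  # total accesses
    "I1mr": 2_000, "D1mr": 5_000, "D1mw": 1_000,    # L1 misses
    "ILmr": 100, "DLmr": 400, "DLmw": 200,          # last-level misses
}

# Last-level misses went all the way to RAM; L1 misses that did not
# go to RAM were served by the (simulated) last-level cache.
ram_hits = sample["DLmr"] + sample["DLmw"] + sample["ILmr"]            # 700
l3_hits = sample["I1mr"] + sample["D1mw"] + sample["D1mr"] - ram_hits  # 7_300
total = sample["Ir"] + sample["Dr"] + sample["Dw"]                     # 1_400_000
l1_hits = total - l3_hits - ram_hits                                   # 1_392_000

# Same weighting as combined_instruction_estimate(): L1=1, L3=5, RAM=35.
score = l1_hits + 5 * l3_hits + 35 * ram_hits
print(score)  # 1453000
```

Because the weighted score is dominated by RAM and L3 misses, the 110% alert threshold in the workflow should now flag real cache-behavior regressions rather than runner noise.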

Diff for: test/benchmark2.js

+18
@@ -0,0 +1,18 @@
const parser = require("./main");

const code = `
<?php
class foo extends bar {
  const A = 1, B = 2, C = 3;
  protected $foo = null;
  public function __construct($a, $b, $c, array $d = []) {
    echo $a . $b . $c . implode(";", $d);
  }
  static public function bar(): foo {
    return new self(1, 2, 3);
  }
}
`;

const reader = new parser();
reader.parseCode(code);
