[CI] Tune nightly benchmarking job for better reliability (#17122)
This PR tunes the nightly benchmarking job to produce more consistent
results:
- Lowers the accepted tolerance threshold for benchmarking results from
50% to 8%
  - Nightly was flaking before, even with a 50% tolerance threshold
- Raises the iteration count to 5000
  - Using 10,000 iterations did not result in significantly more stable
  performance, although this may change as we obtain more data
  - However, the PVC benchmarking job in the overall nightly workflow now
  takes ~47 minutes, whereas before it took ~14 minutes
  - This should not have a major impact on overall execution time,
  because the E2E tests take ~42 minutes. Since both jobs run in
  parallel on different machines, the theoretical effect on the overall
  workflow should only be about 5 minutes, although this depends on
  whether machines can be scheduled in time.
- Changes the benchmarking workflows in sycl-nightly.yml to use the
tuned PERF_PVC runner
  - Untuned machines exhibit large variations when running
  compute-benchmarks (20-25%, up to 50% in the worst case). Variations
  this large are unacceptable and make the results not particularly
  useful.
- Disables nightly benchmarking on gen12:
  - Gen12 machines are currently untuned. As with untuned PVC machines,
  their results are not accurate and not worth serious nightly
  benchmarking.
- Adds guards for benchmarking jobs to prevent benchmark runs in forks
(see #14454 (comment))
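
The tolerance check and the wall-clock estimate described above can be
sketched as follows. This is only an illustration under stated
assumptions: the function names `within_tolerance` and
`workflow_minutes` are hypothetical, the check is assumed to be a
relative deviation from a baseline, and none of this is taken from the
actual benchmarking scripts.

```python
def within_tolerance(baseline: float, measured: float,
                     tolerance: float = 0.08) -> bool:
    """Return True if `measured` deviates from `baseline` by at most
    `tolerance`, expressed as a relative fraction (0.08 == 8%).

    Hypothetical helper: assumes a relative-deviation check.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return abs(measured - baseline) / baseline <= tolerance


def workflow_minutes(*parallel_jobs: float) -> float:
    """Jobs that run in parallel on different machines finish when the
    slowest one does, so wall-clock time is the maximum job duration
    (ignoring scheduling delays)."""
    return max(parallel_jobs)


# Durations quoted in this PR description, in minutes.
before = workflow_minutes(14, 42)  # PVC benchmarks (old) vs. E2E tests
after = workflow_minutes(47, 42)   # PVC benchmarks (new) vs. E2E tests
added = after - before             # theoretical extra wall-clock time
```

With the figures above, `added` comes out to 5 minutes, matching the
estimate in the description. A 7% deviation passes the 8% threshold,
while a 9% deviation is rejected.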
---------
Co-authored-by: Nick Sarnie <[email protected]>