Update Iris benchmarking to align with templating #6421


Merged: 2 commits, Apr 25, 2025
2 changes: 1 addition & 1 deletion .github/workflows/benchmarks_report.yml
@@ -80,4 +80,4 @@ jobs:
- name: Post reports
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: python benchmarks/bm_runner.py _gh_post
run: benchmarks/bm_runner.py _gh_post
17 changes: 16 additions & 1 deletion .github/workflows/benchmarks_run.yml
@@ -21,6 +21,8 @@ on:

jobs:
pre-checks:
# This workflow supports two different scenarios (overnight and branch).
# The pre-checks job determines which scenario is being run.
runs-on: ubuntu-latest
if: github.repository == 'SciTools/iris'
outputs:
@@ -36,9 +38,11 @@ jobs:
# SEE ALSO .github/labeler.yml .
paths: requirements/locks/*.lock setup.py
- id: overnight
name: Check overnight scenario
if: github.event_name != 'pull_request'
run: echo "check=true" >> "$GITHUB_OUTPUT"
- id: branch
name: Check branch scenario
if: >
github.event_name == 'pull_request'
&&
@@ -67,7 +71,8 @@

steps:
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
- uses: actions/checkout@v4
- name: Checkout repo
uses: actions/checkout@v4
with:
fetch-depth: 0

@@ -107,6 +112,8 @@ jobs:
echo "OVERRIDE_TEST_DATA_REPOSITORY=${GITHUB_WORKSPACE}/${IRIS_TEST_DATA_PATH}/test_data" >> $GITHUB_ENV

- name: Benchmark this pull request
# If the 'branch' condition(s) are met: use the bm_runner to compare
# the proposed merge with the base branch.
if: needs.pre-checks.outputs.branch == 'true'
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -115,10 +122,14 @@ jobs:
nox -s benchmarks -- branch origin/${{ github.base_ref }}

- name: Run overnight benchmarks
# If the 'overnight' condition(s) are met: use the bm_runner to compare
# each of the last 24 hours' commits to their parents.
id: overnight
if: needs.pre-checks.outputs.overnight == 'true'
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
# The first_commit argument allows a custom starting point - useful
# for manual re-running.
run: |
first_commit=${{ inputs.first_commit }}
if [ "$first_commit" == "" ]
@@ -132,6 +143,8 @@
fi

- name: Warn of failure
# The overnight run is not on a pull request, so a failure could go
# unnoticed without being actively advertised.
if: >
failure() &&
steps.overnight.outcome == 'failure'
@@ -143,13 +156,15 @@
gh issue create --title "$title" --body "$body" --label "Bot" --label "Type: Performance" --repo $GITHUB_REPOSITORY

- name: Upload any benchmark reports
# Uploading enables more downstream processing e.g. posting a PR comment.
if: success() || steps.overnight.outcome == 'failure'
uses: actions/upload-artifact@v4
with:
name: benchmark_reports
path: .github/workflows/benchmark_reports

- name: Archive asv results
# Store the raw ASV database(s) to help manual investigations.
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
70 changes: 41 additions & 29 deletions benchmarks/README.md
@@ -1,6 +1,6 @@
# Iris Performance Benchmarking
# SciTools Performance Benchmarking

Iris uses an [Airspeed Velocity](https://github.com/airspeed-velocity/asv)
SciTools uses an [Airspeed Velocity](https://github.com/airspeed-velocity/asv)
(ASV) setup to benchmark performance. This is primarily designed to check for
performance shifts between commits using statistical analysis, but can also
be easily repurposed for manual comparative and scalability analyses.
@@ -21,25 +21,30 @@ by the PR. (This run is managed by
[the aforementioned GitHub Action](../.github/workflows/benchmark.yml)).

To run locally: the **benchmark runner** provides conveniences for
common benchmark setup and run tasks, including replicating the automated
overnight run locally. This is accessed via the Nox `benchmarks` session - see
`nox -s benchmarks -- --help` for detail (_see also:
[bm_runner.py](./bm_runner.py)_). Alternatively you can directly run `asv ...`
commands from this directory (you will still need Nox installed - see
[Benchmark environments](#benchmark-environments)).
common benchmark setup and run tasks, including replicating the benchmarking
performed by GitHub Actions workflows. This can be accessed by:

- The Nox `benchmarks` session (use
`nox -s benchmarks -- --help` for details).
- `benchmarks/bm_runner.py` (use the `--help` argument for details).
- Directly running `asv` commands from the `benchmarks/` directory (check
whether environment setup has any extra dependencies - see
[Benchmark environments](#benchmark-environments)).

### Reducing run time

A significant portion of benchmark run time is environment management. Run-time
can be reduced by placing the benchmark environment on the same file system as
your
[Conda package cache](https://conda.io/projects/conda/en/latest/user-guide/configuration/use-condarc.html#specify-pkg-directories),
if it is not already. You can achieve this by either:

- Temporarily reconfiguring `ENV_PARENT` in `delegated_env_commands`
in [asv.conf.json](asv.conf.json) to reference a location on the same file
system as the Conda package cache.
can be reduced by co-locating the benchmark environment and your
[Conda package cache](https://docs.conda.io/projects/conda/en/latest/user-guide/configuration/custom-env-and-pkg-locations.html)
on the same [file system](https://en.wikipedia.org/wiki/File_system), if they
are not already. This can be done in several ways:

- Temporarily reconfiguring `env_parent` in
[`_asv_delegated_abc`](_asv_delegated_abc.py) to reference a location on the same
file system as the Conda package cache.
- Using an alternative Conda package cache location during the benchmark run,
e.g. via the `$CONDA_PKGS_DIRS` environment variable.
- Moving your Iris repo to the same file system as the Conda package cache.
- Moving your repo checkout to the same file system as the Conda package cache.

### Environment variables

@@ -73,7 +78,8 @@ requirements will not be delayed by repeated environment setup - especially
relevant given the [benchmark runner](bm_runner.py)'s use of
[--interleave-rounds](https://asv.readthedocs.io/en/stable/commands.html?highlight=interleave-rounds#asv-run),
or any time you know you will repeatedly benchmark the same commit. **NOTE:**
Iris environments are large so this option can consume a lot of disk space.
SciTools environments tend to be large so this option can consume a lot of
disk space.

## Writing benchmarks

@@ -97,6 +103,7 @@ for manual investigations; and consider committing any useful benchmarks as
[on-demand benchmarks](#on-demand-benchmarks) for future developers to use.

### Data generation

**Important:** be sure not to use the benchmarking environment to generate any
test objects/files, as this environment changes with each commit being
benchmarked, creating inconsistent benchmark 'conditions'. The
@@ -106,7 +113,7 @@ solution; read more detail there.
### ASV re-run behaviour

Note that ASV re-runs a benchmark multiple times between calls to its `setup()` routine.
This is a problem for benchmarking certain Iris operations such as data
This is a problem for benchmarking certain SciTools operations such as data
realisation, since the data will no longer be lazy after the first run.
Consider writing extra steps to restore objects' original state _within_ the
benchmark itself.
@@ -117,10 +124,13 @@ maintain result accuracy this should be accompanied by increasing the number of
repeats _between_ `setup()` calls using the `repeat` attribute.
`warmup_time = 0` is also advisable since ASV performs independent re-runs to
estimate run-time, and these will still be subject to the original problem.
The `@disable_repeat_between_setup` decorator in
[`benchmarks/__init__.py`](benchmarks/__init__.py) offers a convenience for
all this.
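
For illustration, here is a minimal sketch of a hypothetical benchmark class
applying these attributes by hand (the lazy-data stand-in is not taken from
this PR):

```python
import numpy as np


class LazyRealisation:
    """Hypothetical benchmark timing an operation that must stay 'fresh'."""

    # Run the timed routine only once per setup() call, so repeated runs
    # cannot reuse already-realised data ...
    number = 1
    # ... and recover statistical accuracy with more setup()/run cycles.
    repeat = 10
    # Skip ASV's warm-up runs, which would hit the same staleness problem.
    warmup_time = 0

    def setup(self):
        # Stand-in for building a lazy object (e.g. a lazy array or cube).
        self.data = np.arange(1_000_000, dtype=np.float64)

    def time_realise(self):
        # The operation under test.
        self.data.sum()
```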

### Custom benchmarks

Iris benchmarking implements custom benchmark types, such as a `tracemalloc`
SciTools benchmarking implements custom benchmark types, such as a `tracemalloc`
benchmark to measure memory growth. See [custom_bms/](./custom_bms) for more
detail.
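
As a rough, generic illustration of the idea behind the `tracemalloc`
benchmark type (a plain-Python sketch, not the repo's custom plugin API):

```python
import tracemalloc

import numpy as np


def peak_memory_growth(func):
    """Return the peak traced memory (bytes) while running ``func``."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Example: memory growth of allocating and doubling a 1000x1000 array.
print(peak_memory_growth(lambda: np.ones((1000, 1000)) * 2))
```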

@@ -131,10 +141,10 @@ limited available runtime and risk of false-positives. It remains useful for
manual investigations).**

When comparing performance between commits/file-type/whatever it can be helpful
to know if the differences exist in scaling or non-scaling parts of the Iris
functionality in question. This can be done using a size parameter, setting
one value to be as small as possible (e.g. a scalar `Cube`), and the other to
be significantly larger (e.g. a 1000x1000 `Cube`). Performance differences
to know if the differences exist in scaling or non-scaling parts of the
operation under test. This can be done using a size parameter, setting
one value to be as small as possible (e.g. a scalar value), and the other to
be significantly larger (e.g. a 1000x1000 array). Performance differences
might only be seen for the larger value, or the smaller, or both, getting you
closer to the root cause.
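
A minimal sketch of such a size parameter using ASV's `params` mechanism
(the class and workload here are hypothetical):

```python
import numpy as np


class ScalingSuite:
    """Hypothetical benchmark separating scaling from non-scaling costs."""

    # One as-small-as-possible case and one significantly larger case.
    params = [1, 1000]
    param_names = ["side_length"]

    def setup(self, side_length):
        self.data = np.zeros((side_length, side_length))

    def time_operation(self, side_length):
        # Replace with the real operation under test.
        self.data + 1
```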

@@ -151,13 +161,15 @@ suites for the UK Met Office NG-VAT project.
## Benchmark environments

We have disabled ASV's standard environment management, instead using an
environment built using the same Nox scripts as Iris' test environments. This
is done using ASV's plugin architecture - see
[asv_delegated_conda.py](asv_delegated_conda.py) and the extra config items in
[asv.conf.json](asv.conf.json).
environment built using the same scripts that set up the package test
environments.
This is done using ASV's plugin architecture - see
[`asv_delegated.py`](asv_delegated.py) and associated
references in [`asv.conf.json`](asv.conf.json) (`environment_type` and
`plugins`).

(ASV is written to control the environment(s) that benchmarks are run in -
minimising external factors and also allowing it to compare between a matrix
of dependencies (each in a separate environment). We have chosen to sacrifice
these features in favour of testing each commit with its intended dependencies,
controlled by Nox + lock-files).
controlled by the test environment setup script(s)).