[CI][E2E] Split gen12 pre-commit testing into build and run stages #16321

ayylol · 2024-12-10T17:05:01Z

This patch enables E2E build and run splitting in pre-commit CI. Currently, one of the main bottlenecks in pre-commit CI is the Linux Gen12 testing, which takes 30 to 40 minutes to complete. By building these tests on our fast build systems and then transferring the binaries to the Gen12 machine, we can cut this time roughly in half.
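
As a rough illustration of the hand-off between the two stages, the sketch below packs a directory of built test binaries into an archive and unpacks it elsewhere. It is not the workflow code from this PR: the function names, the archive name, and the directory layout are all hypothetical, and the real transfer in CI is driven by the workflow files rather than by a Python helper.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_build_dir(build_dir, archive):
    """Build stage: bundle everything under build_dir into a .tar.gz archive."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores paths relative to build_dir instead of absolute paths.
        tar.add(build_dir, arcname=".")
    return Path(archive)


def unpack_build_dir(archive, run_dir):
    """Run stage: restore the prebuilt binaries on the test machine."""
    run_dir = Path(run_dir)
    run_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(run_dir)
    return run_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build = Path(tmp, "build", "Basic")  # hypothetical test layout
        build.mkdir(parents=True)
        (build / "example.out").write_bytes(b"fake binary")
        archive = pack_build_dir(Path(tmp, "build"), Path(tmp, "e2e_binaries.tar.gz"))
        restored = unpack_build_dir(archive, Path(tmp, "run"))
        print(sorted(p.name for p in restored.rglob("*.out")))
```

The archive is deliberately written outside the directory being packed so that it does not end up containing itself.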

Sadly, the Windows Gen12 testing will still be the main bottleneck, so the total pre-commit time won't improve with this PR, but we will at least get faster feedback for Gen12 on Linux.

These are mostly AOT tests

ayylol · 2024-12-11T19:04:15Z

sycl/test-e2e/format.py

The changes here work around a hang that occurs when one of our tests tries to run a binary that does not exist inside one of the containers we use for CI.

This hang is not a recent regression. It has existed for a while but is very rarely encountered with the current full testing mode. With split testing, if a test fails to build a binary in the build-only stage, it will also fail in the run-only stage because the binary does not exist, which makes this hang much more common.
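
To illustrate the general idea of such a workaround, here is a minimal sketch that refuses to launch a command whose executable cannot be found and reports a failure immediately instead. This is an assumption about the shape of a fix, not the actual change in format.py: `run_if_present` is a hypothetical helper, and the exit code 127 is borrowed from the shell convention for "command not found".

```python
import os
import shutil
import subprocess
import sys


def run_if_present(argv, timeout=60):
    """Run argv, but fail fast when the executable cannot be found.

    Returns (exit_code, combined_output).
    """
    exe = argv[0]
    if os.path.isfile(exe) and os.access(exe, os.X_OK):
        resolved = exe
    else:
        # Fall back to a PATH lookup for bare command names.
        resolved = shutil.which(exe)
    if resolved is None:
        # Never hand a missing binary to the process launcher.
        return 127, f"{exe}: binary not found, not launching"
    proc = subprocess.run(
        [resolved, *argv[1:]], capture_output=True, text=True, timeout=timeout
    )
    return proc.returncode, proc.stdout + proc.stderr


if __name__ == "__main__":
    print(run_if_present(["./missing_test.out"]))  # hypothetical missing binary
    print(run_if_present([sys.executable, "-c", "print('ok')"]))
```

Checking up front turns a missing binary into an ordinary, immediate test failure, which is the behaviour the run-only stage needs when the build-only stage did not produce the file.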

sarnex · 2024-12-11

Can we at least make a GH issue for the hang, with repro steps?

ayylol · 2024-12-12

Yep, here it is: #16351

.github/workflows/sycl-linux-precommit.yml

.github/workflows/sycl-linux-run-tests.yml

sycl/test-e2e/format.py

ayylol · 2024-12-20T16:13:36Z

@sarnex FYI, I had to change the container for the build-only stage from the Ubuntu 22 alldeps container to the Ubuntu 24 latest Intel drivers container because some tests started to fail on the former. I was originally using the alldeps container because it has the dependencies from the build container, so that we can build the CUDA/HIP tests as well, but since those aren't enabled yet for this PR, it's not really needed yet.

sarnex · 2024-12-20T16:17:46Z

OK, yeah, it looks like the problem is that some tests link against libze_loader, which is part of the GPU driver. They recently dropped support for 22.04 (or at least don't provide packages for it), so I switched us to 24.04. As a result, libze_loader now uses a newer glibc and won't work on 22.04. Eventually we should make a 24.04 alldeps image, but for now this is fine.

ayylol · 2024-12-20T21:24:58Z

@intel/llvm-gatekeepers This is ready to merge.
The failures in post-commit Gen12 for the tests OnlineCompiler/online_compiler_OpenCL.cpp and USM/fill_any_size.cpp are unrelated to this PR and are currently appearing in post-commit on the sycl branch (https://github.com/intel/llvm/actions/runs/12437990459/job/34729407798).

sarnex · 2024-12-20T21:51:08Z

@ayylol Can we/are we planning to do the same for postcommit?

sarnex · 2024-12-20T22:14:12Z

@ayylol
Also, I'm seeing an XPASS while building the E2E tests. I don't know what an XPASS even means there, but yeah:

Unexpectedly Passed Tests (1):
  SYCL :: Matrix/SG32/get_coordinate_ops.cpp

https://github.com/intel/llvm/actions/runs/12438967580/job/34732118174

ayylol · 2024-12-23T02:30:47Z

> @ayylol Also, seeing an XPASS during building the E2E tests, I don't know what an XPASS even means there but yeah
>
> Unexpectedly Passed Tests (1):
>   SYCL :: Matrix/SG32/get_coordinate_ops.cpp
>
> https://github.com/intel/llvm/actions/runs/12438967580/job/34732118174

An XPASS in the build-only stage usually means the test compiles but fails when run, and because of the way it is marked, the XFAIL is incorrectly applied in the build-only stage as well. Opened #16455 to fix this.
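
A small model of that behaviour, as a sketch only. It assumes that `run-mode` is a feature available whenever the run steps execute (the `XFAIL: run-mode` spelling comes from one of the commits in this PR), and `classify` is a hypothetical helper, not lit's real implementation.

```python
def classify(xfail_on, features, steps_ok):
    """Return a lit-style result code for one test.

    xfail_on: features named on the test's XFAIL line ("*" means always).
    features: features available in the current stage.
    steps_ok: success flags of the steps that actually executed.
    """
    expected_to_fail = "*" in xfail_on or bool(set(xfail_on) & set(features))
    passed = all(steps_ok)
    if expected_to_fail:
        return "XPASS" if passed else "XFAIL"
    return "PASS" if passed else "FAIL"


# A test that compiles fine but fails when executed.
BUILD_OK, RUN_OK = True, False

# Unconditional XFAIL: correct in full mode, XPASS when only the build runs.
full = classify({"*"}, {"run-mode"}, [BUILD_OK, RUN_OK])
build_only = classify({"*"}, set(), [BUILD_OK])

# XFAIL restricted to run-mode: the build-only stage simply passes.
build_only_scoped = classify({"run-mode"}, set(), [BUILD_OK])
full_scoped = classify({"run-mode"}, {"run-mode"}, [BUILD_OK, RUN_OK])
```

In this model the unconditional marking yields an XPASS in the build-only stage because the only step that executes succeeds, while scoping the expectation to the run steps avoids it.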

> @ayylol Can we/are we planning to do the same for postcommit?

Yep, adding this to post-commit should be quite similar and likely won't require all the extra test markup I had to do in this PR. I'm thinking we should hold off on it and see whether the pre-commit changes cause too much hassle before proceeding with that.

sarnex · 2024-12-23T15:37:49Z

> Yep, adding this to the post-commit should be quite similar, likely wont require all the extra test markup I had to do in this pr. I'm thinking we should hold on it and see if the pre-commit changes aren't causing too much hassle before proceeding with that.

Sure, sounds good, thx!

aelovikov-intel · 2024-12-23T17:12:10Z

IMO, post-commit should happen after:

1. We have moved all of pre-commit (except maybe PVC E2E).
2. We have gone back and cleaned up the logic for unsupported/xfail/etc. so that it works nicely with the feature and doesn't require too much explicit markup.

To clarify: "REQUIRES: pvc, build-and-run-mode" is bad; "REQUIRES-RUN: pvc" or "XFAIL-RUN: pvc" is much better. I'm not pushing for these specific spellings, but the whole "build-and-run-mode" was a temporary solution to make fast progress and reap the immediate benefits in pre-commit.
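
To make the suggestion concrete, here is one way stage-specific markup could be interpreted. This is purely a sketch of the idea: `REQUIRES-RUN` is only a spelling floated in this comment, not an existing lit directive, and `parse_directives` and `plan` are hypothetical helpers.

```python
import re

# Longer name first so "REQUIRES-RUN" is not cut short at "REQUIRES".
DIRECTIVE = re.compile(r"^\s*//\s*(REQUIRES-RUN|REQUIRES):\s*(.*)$")


def parse_directives(source):
    """Collect the feature names listed after each directive in a test file."""
    found = {}
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if match:
            names = {f.strip() for f in match.group(2).split(",") if f.strip()}
            found.setdefault(match.group(1), set()).update(names)
    return found


def plan(source, mode, features):
    """Decide what a stage does: 'build', 'run', 'build+run', or 'unsupported'."""
    directives = parse_directives(source)
    features = set(features)
    if not directives.get("REQUIRES", set()) <= features:
        return "unsupported"
    if mode == "build-only":
        # Run-only requirements do not block compilation.
        return "build"
    if not directives.get("REQUIRES-RUN", set()) <= features:
        return "unsupported"
    return "run" if mode == "run-only" else "build+run"


EXAMPLE = "// REQUIRES-RUN: pvc\n// RUN: %{build} %s -o %t.out\n"
```

With markup like this, the build stage can compile the test without knowing anything about the run machine, and only the run stage evaluates the device requirement. An `XFAIL-RUN` directive could be handled along the same lines.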

ayylol changed the title ~~[do not merge] Testing pre-commit ci splitting~~ [CI][E2E] Split gen12 pre-commit testing into build and run stages Dec 11, 2024

ayylol mentioned this pull request Dec 11, 2024

[SYCL][E2E] Add option to build tests on run-only mode if marked as REQUIRES: build-and-run-mode #16306

Merged

ayylol added 5 commits December 11, 2024 10:49

Add split testing functionality to CI and enable in precommit

ad165c5

Add workaround for lit hang inside our containers

d604088

Mark as split exceptions tests newly failing on build stage

0f6313b

Mark as XFAIL: run-mode test that can pass build stage

142e661

Add as split exceptions tests that fail on run-only stage

ba3dd1a

These are mostly AOT tests

ayylol force-pushed the e2e-split-ci-testing branch from dcf422e to ba3dd1a Compare December 11, 2024 18:57

ayylol commented Dec 11, 2024

aelovikov-intel reviewed Dec 11, 2024

ayylol requested review from sarnex and uditagarwal97 December 11, 2024 20:02

AllanZyne approved these changes Dec 20, 2024

Merge branch 'sycl' into e2e-split-ci-testing

1d12eac

ayylol added 2 commits December 20, 2024 06:54

Merge branch 'sycl' into e2e-split-ci-testing

6ee8620

Use ubuntu 24 latest drivers container for build

776e5f0

Merge branch 'sycl' into e2e-split-ci-testing

35f8e08

Merge branch 'sycl' into e2e-split-ci-testing

c9eb5de

sarnex merged commit cc4dee4 into sycl Dec 20, 2024
23 of 24 checks passed

ayylol deleted the e2e-split-ci-testing branch December 20, 2024 21:28
