syz-cluster: initial code #5620

Draft: wants to merge 4 commits into base: master

@a-nogikh a-nogikh commented Dec 17, 2024

The basic code of a k8s-based cluster that:

  • Aggregates new LKML patch series.
  • Determines the kernel trees to apply them to.
  • Builds the basic and the patched kernel.
  • Displays the results on a web dashboard.

This is a very rudimentary version with a lot of TODOs that provides a skeleton for further work.

The project makes use of Argo workflows and Spanner DB.
Bootstrap is used for the web interface.

Overall structure:

  • syz-cluster/dashboard: a web dashboard listing patch series and their test results.
  • syz-cluster/series-tracker: polls Lore archives and submits the new patch series to the DB.
  • syz-cluster/controller: schedules workflows and provides API for them.
  • syz-cluster/kernel-disk: a cron job that keeps a kernel checkout up to date.
  • syz-cluster/workflow/*: workflow steps.

For the DB structure see syz-cluster/pkg/db/migrations/*.


TODO (for this PR):

  • Go through the installation section, hopefully get rid of minio use.
  • Figure out why go.mod was bumped to 1.23.1.
  • Figure out how to run syz-cluster tests on GitHub CI (we'll need at least some local Spanner emulator binary).

Questions/thoughts out loud:
1. What to do with the vendor folder
Our vendor/ folder is quite big, and this PR adds even more modules on top of that. Do we still want to keep that code in our repository? Is it possible to keep only the modules needed by the other components, but not by syz-cluster (since I run go mod download in the Docker containers anyway)?
Filed #5645

2. Some pkg/ packages are too eager to depend on prog/ and sys/
It has made some Dockerfiles more complicated than they could have been.
Should we strive to remove these dependencies?

@a-nogikh a-nogikh force-pushed the features/syz-cluster branch from dbf79ae to 0a8c2df Compare December 17, 2024 15:57
@a-nogikh a-nogikh changed the title Features/syz cluster syz-cluster: initial code Dec 17, 2024
@dvyukov dvyukov left a comment

The first patch is not questionable. Let's review/merge it separately.

var patchSubjectRe = regexp.MustCompile(`\[(?:(?:rfc|resend)\s+)*patch`)
type PatchSubject struct {
Title string
Tags []string // Sometimes there's e.g. "net" or "net-next" in the subject.
Collaborator

These tags are parsed, but lost later. What's the plan here?
Do we want to not test "RFC" patches? Then these should set Corrupted?

Collaborator Author

As mentioned in the comment, developers sometimes explicitly say that the patch is to be applied to a specific tree. This information is not used in this initial PR, but it will be important for further series triage implementations.
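
To make the discussion concrete, here is a minimal sketch of how tree tags such as "net-next" could be pulled out of a subject prefix while the RFC marker is tracked separately. It is not the code from this PR: `parsedSubject`, `parseSubject`, and both regular expressions are hypothetical names introduced only for illustration, and the rule that every unrecognized word in the brackets is a tag is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// prefixRe captures the contents of a leading "[...]" block.
	prefixRe = regexp.MustCompile(`^\s*\[([^\]]*)\]`)
	// versionOrSeqRe matches version markers ("v2") and sequence numbers ("1/3").
	versionOrSeqRe = regexp.MustCompile(`^(?:v\d+|\d+/\d+)$`)
)

// parsedSubject is a hypothetical stand-in for the PatchSubject type under review.
type parsedSubject struct {
	Title string
	Tags  []string // Assumption: any other word in the brackets, e.g. "net-next".
	RFC   bool     // Kept separately so that a later stage may decide to skip RFC series.
}

// parseSubject splits a subject into its title and bracketed tags.
// It returns false if the subject does not look like a patch.
func parseSubject(subject string) (parsedSubject, bool) {
	m := prefixRe.FindStringSubmatch(subject)
	if m == nil {
		return parsedSubject{}, false
	}
	ret := parsedSubject{Title: strings.TrimSpace(subject[len(m[0]):])}
	isPatch := false
	for _, word := range strings.Fields(strings.ToLower(m[1])) {
		switch {
		case word == "patch":
			isPatch = true
		case word == "rfc":
			ret.RFC = true
		case word == "resend", versionOrSeqRe.MatchString(word):
			// Not a tree tag, nothing to record.
		default:
			ret.Tags = append(ret.Tags, word)
		}
	}
	if !isPatch {
		return parsedSubject{}, false
	}
	return ret, true
}

func main() {
	for _, s := range []string{
		"[PATCH net-next v2 1/3] tcp: fix foo",
		"[RFC PATCH] mm: bar",
		"Re: hello",
	} {
		p, ok := parseSubject(s)
		fmt.Printf("%q -> ok=%v title=%q tags=%v rfc=%v\n", s, ok, p.Title, p.Tags, p.RFC)
	}
}
```

Keeping the RFC flag apart from the tags would let the triage step decide later whether RFC series should be tested, without dropping the tree hints.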

pkg/email/lore/parse.go (outdated, resolved)
pkg/email/lore/parse.go (outdated, resolved)
@@ -20,9 +22,29 @@ type Thread struct {
Messages []*email.Email
}

type Series struct {
Subject string
Collaborator

Does it happen that the subject changes between versions?
If so, do we need to also extract in-reply to message-id, which may be the previous version?

Collaborator Author

@a-nogikh a-nogikh Dec 23, 2024

AFAIK the series are usually not sent in-reply-to the previous ones, so I'd expect the title to be fairly stable. Otherwise there's no chance to relate them.
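
If the title is indeed the only stable link between versions, relating them could look roughly like the sketch below. It is an illustration under that assumption, not code from this PR: `seriesVersion`, `normalizeTitle`, and `groupByTitle` are hypothetical names, and a real implementation would probably also compare authors to avoid merging unrelated series with the same title.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesVersion is a hypothetical, simplified view of one posted series version.
type seriesVersion struct {
	Title   string
	Version int
}

// normalizeTitle lowercases the title and collapses whitespace so that
// cosmetic differences do not split one series into several groups.
func normalizeTitle(title string) string {
	return strings.Join(strings.Fields(strings.ToLower(title)), " ")
}

// groupByTitle groups versions by normalized title and sorts each group by version.
func groupByTitle(all []seriesVersion) map[string][]seriesVersion {
	groups := make(map[string][]seriesVersion)
	for _, s := range all {
		key := normalizeTitle(s.Title)
		groups[key] = append(groups[key], s)
	}
	for _, g := range groups {
		sort.Slice(g, func(i, j int) bool { return g[i].Version < g[j].Version })
	}
	return groups
}

func main() {
	groups := groupByTitle([]seriesVersion{
		{Title: "mm: fix foo", Version: 2},
		{Title: "MM:  fix foo", Version: 1},
		{Title: "net: add bar", Version: 1},
	})
	for title, versions := range groups {
		fmt.Println(title, len(versions))
	}
}
```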

pkg/osutil/osutil.go (outdated, resolved)
pkg/vcs/git.go (outdated, resolved)
pkg/vcs/git.go (outdated, resolved)
pkg/vcs/git.go (outdated, resolved)
Dir string
Precious bool
Sandbox bool
Env []string
Collaborator

Can we unexport some of these fields if they are used only within this package?

Collaborator Author

I've unexported Precious, the others are used outside (at least so far).

pkg/vcs/git.go (outdated, resolved)
Support filtering by the commit date.
Refactor the code to make it more reusable.
Add a method to extract specifically the list of new patch series.
@a-nogikh a-nogikh force-pushed the features/syz-cluster branch from 2a65760 to 770edae Compare December 23, 2024 11:37
It will let other parts of the system use the git-specific
functionality.
@a-nogikh a-nogikh force-pushed the features/syz-cluster branch 2 times, most recently from ae3f62b to 0129842 Compare December 23, 2024 15:05
@a-nogikh
Collaborator Author

a-nogikh commented Dec 26, 2024

FTR:
