Improve master_commit_red query performance #6174

Merged: 6 commits, Jan 22, 2025
57 changes: 38 additions & 19 deletions torchci/clickhouse_queries/master_commit_red/query.sql
@@ -1,38 +1,57 @@
 --- This query is used to show the histogram of trunk red commits on HUD metrics page
 --- during a period of time
-WITH all_jobs AS (
+-- Split up the query into multiple CTEs to make it faster.
+with commits as (
+select
+push.head_commit.'timestamp' as time,
+push.head_commit.'id' as sha
+from
+-- Not using final since push table doesn't really get updated
+push
+where
+push.ref in ('refs/heads/master', 'refs/heads/main')
+and push.repository.'owner'.'name' = 'pytorch'
+and push.repository.'name' = 'pytorch'
+and push.head_commit.'timestamp' >= {startTime: DateTime64(3)}
+and push.head_commit.'timestamp' < {stopTime: DateTime64(3)}
+),
+all_runs as (
+select
+workflow_run.id as id,
+workflow_run.head_commit.'id' as sha,
+workflow_run.name as name,
+commit.time as time
+from
+workflow_run final
Contributor (review comment):

Another optimization opportunity:

Do we really need to know when a job is pending? If not, we could drop FINAL and filter on a terminal conclusion instead.

Contributor Author (reply):

I'm going to leave it as is because I don't want to change the results from the old query.
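
For reference, the suggestion above might look roughly like the fragment below. This is only a sketch, not what was merged: it assumes `workflow_run` exposes a `conclusion` column, the list of terminal values is illustrative and may be incomplete, and it relies on the `commits` CTE from this diff. Without FINAL, duplicate rows for the same run could still appear, and runs that are still in progress would drop out, which is the change in results the author wanted to avoid.

```sql
-- Hypothetical variant of the all_runs CTE (a fragment, not a standalone query).
-- Assumption: workflow_run has a `conclusion` column that is empty until the run finishes.
-- Assumption: the terminal values listed below are illustrative, not exhaustive.
all_runs as (
    select
        workflow_run.id as id,
        workflow_run.head_commit.'id' as sha,
        workflow_run.name as name,
        commit.time as time
    from
        workflow_run -- no FINAL here; unmerged duplicate rows may still be returned
        join commits commit on workflow_run.head_commit.'id' = commit.sha
    where
        -- Keep only runs that have reached a terminal state
        workflow_run.conclusion in ('success', 'failure', 'cancelled', 'timed_out')
        and (
            workflow_run.name in ('Lint', 'pull', 'trunk')
            or workflow_run.name like 'linux-binary%'
        )
        and workflow_run.event != 'workflow_run'
),
```

Whether this is actually faster would need to be measured against the merged query; the trade-off is skipping the deduplication work FINAL does in exchange for possibly seeing more than one row per run.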

+join commits commit on workflow_run.head_commit.'id' = commit.sha
+where
+(
+-- Limit it to workflows which block viable/strict upgrades
+workflow_run.name in ('Lint', 'pull', 'trunk')
+OR workflow_run.name like 'linux-binary%'
+)
+AND workflow_run.event != 'workflow_run' -- Filter out workflow_run-triggered jobs, which have nothing to do with the SHA
+and workflow_run.id in (select id from materialized_views.workflow_run_by_head_sha where head_sha in (select sha from commits))
+),
+all_jobs AS (
 SELECT
-push.head_commit.'timestamp' AS time,
+all_runs.time AS time,
 CASE
 WHEN job.conclusion = 'failure' THEN 'red'
 WHEN job.conclusion = 'timed_out' THEN 'red'
 WHEN job.conclusion = 'cancelled' THEN 'red'
 WHEN job.conclusion = '' THEN 'pending'
 ELSE 'green'
 END as conclusion,
-push.head_commit.'id' AS sha
+all_runs.sha AS sha
 FROM
-workflow_job job FINAL
-JOIN workflow_run FINAL ON workflow_run.id = workflow_job.run_id
-JOIN push FINAL ON workflow_run.head_commit.'id' = push.head_commit.'id'
+default.workflow_job job final join all_runs all_runs on all_runs.id = workflow_job.run_id
 WHERE
 job.name != 'ciflow_should_run'
 AND job.name != 'generate-test-matrix'
-AND (
--- Limit it to workflows which block viable/strict upgrades
-workflow_run.name in ('Lint', 'pull', 'trunk')
-OR workflow_run.name like 'linux-binary%'
-)
 AND job.name NOT LIKE '%rerun_disabled_tests%'
 AND job.name NOT LIKE '%unstable%'
-AND workflow_run.event != 'workflow_run' -- Filter out workflow_run-triggered jobs, which have nothing to do with the SHA
-AND push.ref IN (
-'refs/heads/master', 'refs/heads/main'
-)
-AND push.repository.'owner'.'name' = 'pytorch'
-AND push.repository.'name' = 'pytorch'
-AND push.head_commit.'timestamp' >= {startTime: DateTime64(3)}
-AND push.head_commit.'timestamp' < {stopTime: DateTime64(3)}
+and job.id in (select id from materialized_views.workflow_job_by_head_sha where head_sha in (select sha from commits))
 ),
 commit_overall_conclusion AS (
 SELECT