-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SEAB-6097: Nightly CLI build failure should notify in Slack #284
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #284 +/- ##
=============================================
- Coverage 70.34% 70.09% -0.25%
+ Complexity 1077 1058 -19
=============================================
Files 47 47
Lines 6069 6069
Branches 801 801
=============================================
- Hits 4269 4254 -15
- Misses 1466 1469 +3
- Partials 334 346 +12
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
.circleci/config.yml
Outdated
- when: | ||
condition: << pipeline.parameters.run_against_develop_core >> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggesting moving the when
condition into notify-slack
so you don't have to specify it multiple times
.circleci/config.yml
Outdated
@@ -43,17 +44,23 @@ common_jobs: &common_jobs | |||
<<: *common_filters | |||
requires: | |||
- find-develop-snapshot | |||
context: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the CLI nightly workflow consists of multiple jobs, the PR currently proposes separate Slack notifications for the build, unit tests, and integration tests. Are there any jobs that should be added/removed?
GitHub won't let me comment on the lines I want, but is it possible to call notify-slack
within common_jobs
, after the build
job? So the slack notification will alert us if the nightly
workflow fails as a whole and we don't have to add the notify-slack
command to multiple jobs
dockstore-cli/.circleci/config.yml
Lines 38 to 43 in 9cb046a
common_jobs: &common_jobs | |
jobs: | |
# find-develop-snapshot job is run for both "everything" and "nightly" builds even though it's only needed for nightly. | |
# See https://github.com/dockstore/dockstore-cli/pull/268#discussion_r1419429151 | |
- find-develop-snapshot | |
- build: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I've seen, I don't think the slack/notify
command can assess the cumulative status of multiple jobs in a workflow (see "Is it possible to receive only the final status of the Workflow?" in https://github.com/CircleCI-Public/slack-orb/wiki/FAQ).
Seeing if we can integrate Slack via webhook instead of the orb -- it looks like CircleCI has a workflow-completed
event (https://circleci.com/docs/webhooks/#event-specifications) which could satisfy the suggested feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see, I wasn't sure if it was possible so you can ignore that feedback 🙂
Is it possible to add notify-slack
to setup_and_run_integration_tests
? It's the only command called by non-confidential-tests
, confidential-workflow-tests
, confidential-tool-tests
, and integration-tests
. That way you don't have to add it to each integration test job
dockstore-cli/.circleci/config.yml
Lines 550 to 557 in 9cb046a
setup_and_run_integration_tests: | |
steps: | |
- setup_for_integration_tests | |
- restore_bash_env | |
- run_tests | |
- save_test_results | |
- send_coverage | |
- persist_coverage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that would work, since notify-slack
would be a step within a job 👍 It would prevent repetition in the config file, but I guess we'd still have separate notifications for each job (as opposed to one workflow notification).
Will give it a try first, so we can decide how we feel about the multiple notifications
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem a bit excessive to report with separate Slack notifications for the build, unit tests, and integration tests.
Could consider notifying after sonar cloud which should only pass if all the other builds pass. (I don't feel strongly about this one, but the number of notifications we get can be somewhat overwhelming)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A possible concern with only notifying in sonar cloud (if I'm understanding CircleCI correctly): if the build fails, the unit/integration tests (which require the build) won't run at all, and then sonar cloud (which requires those tests) won't run also? So we won't be notified at all.
To be fair, if we're only notifying on failures, the notifications shouldn't be a daily occurrence anyways
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. So you would be able to notify on success, but not on failure. 💭
To be fair, if we're only notifying on failures, the notifications shouldn't be a daily occurrence anyways
Ah good point, if this PR is notifying only on failure that would be a massive improvement in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed with Kathy's feedback, nothing else to add. Please re-request review when you've responded to it, thanks.
- notifies slack for build, unit tests, and all integration tests (separately)
@@ -536,6 +553,7 @@ commands: | |||
- save_test_results | |||
- send_coverage | |||
- persist_coverage | |||
- notify-slack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For testing purposes, this is what the Slack notifications looked like when notified separately on each job's success: https://oicr.slack.com/archives/C0151UDNR9R/p1712088975956109
Things to consider: (re-requesting reviews for discussion on this)
- If we choose to notify Slack only on failed jobs, these messages probably won't occur too frequently; nightly runs will probably succeed most of the time
- I don't think it'd be possible to get failed notifications from the build, unit tests, and integration tests all at once. If the build fails/notifies, for example, the dependent jobs (i.e. all the tests) should be skipped and can't notify. Worst case scenario is if the build passes, and all the unit/integration tests fail?
- Also just noticed that some integration tests produce multiple notifications (e.g. 3 for confidential-tool/workflow-tests, 2 for non-confidential-tests); appears to be due to the parallelism
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- If we choose to notify Slack only on failed jobs, these messages probably won't occur too frequently; nightly runs will probably succeed most of the time
I think notifying on failed jobs is better since those are the actionable ones. That said, we can run into the problem that it the notifications break ...
Especially with the parallel notifications,
some integration tests produce multiple notifications
I think there are too many notifications.
Worst case scenario is if the build passes, and all the unit/integration tests fail?
If I understand correctly, we'd get messages for all the failures if we were notifying on failures only.
I think that's ok. I think numerous failure messages in a worse case scenario is better than numerous success messages (which may lead us to ignore messages)
Open to ideas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. So you would be able to notify on success, but not on failure. 💭
To be fair, if we're only notifying on failures, the notifications shouldn't be a daily occurrence anyways
Ah good point, if this PR is notifying only on failure that would be a massive improvement in the first place.
Agreed it would be very nice if we only got one failure notification, but I don't see a way to do it.
.circleci/config.yml
Outdated
@@ -6,9 +6,13 @@ parameters: | |||
run_against_develop_core: | |||
type: boolean | |||
default: false | |||
test_slack: # added temporarily for testing purposes | |||
type: boolean | |||
default: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be here still?
.circleci/config.yml
Outdated
channel: $slack_id | ||
event: fail | ||
template: basic_fail_1 | ||
- slack/notify: # added temporarily for testing purposes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment, should this be here?
Quality Gate passedIssues Measures |
Description
This PR restores the Slack notification for nightly CLI build failures, which was removed after moving the nightly cron to CircleCI (#268). Notifications are set up appear in #dockstore-dev-alerts.
Review Instructions
In config.yml, verify that the Slack notification is configured as expected. See https://oicr.slack.com/archives/C0151UDNR9R/p1711479933594839 for a successful test run of the
slack/notify
setup (without the nightly condition).Since the CLI nightly workflow consists of multiple jobs, the PR currently proposes separate Slack notifications for the build, unit tests, and integration tests. Are there any jobs that should be added/removed?
Note: I originally created a new Slack app called "Dockstore CLI Build", with its own CircleCI context called
oicr-slack-cli
, to do the notifying (test run: https://oicr.slack.com/archives/C0151UDNR9R/p1711465118013749). But it looks like the notification also works with our pre-existing "dockstore build notifications" app andoicr-slack
context. I decided to stick with the old app for consistency, but do comment if you'd prefer otherwise.Issue
SEAB-6097
Security
If there are any concerns that require extra attention from the security team, highlight them here.
Please make sure that you've checked the following before submitting your pull request. Thanks!
./mvnw clean install