Adds chained alerts #976
e8d1cea to 769f8a5
} catch (ex: Exception) {
    logger.error("Error executing workflow delegate. Error: ${ex.message}", ex)
    logger.error("Error executing workflow delegate monitor ${delegate.monitorId}", ex)
    lastErrorDelegateRun = AlertingException.wrap(ex)
    continue
This should be break not continue
ack
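To illustrate the difference raised in this thread, here is a minimal sketch. The `Delegate` type and the `runDelegates` helper are hypothetical stand-ins, not the plugin's classes. With `break`, delegates after the first failure are never run; with `continue`, the failed delegate is skipped and the loop moves on to the next one.

```kotlin
// Hypothetical stand-in for a delegate monitor in a workflow sequence.
data class Delegate(val monitorId: String, val fails: Boolean = false)

// Runs delegates in order and returns the ids of those that completed.
// stopOnError = true models `break`; stopOnError = false models `continue`.
fun runDelegates(delegates: List<Delegate>, stopOnError: Boolean): List<String> {
    val executed = mutableListOf<String>()
    for (delegate in delegates) {
        try {
            if (delegate.fails) {
                throw IllegalStateException("monitor ${delegate.monitorId} failed")
            }
            executed += delegate.monitorId
        } catch (ex: Exception) {
            // `break` leaves the loop, so later delegates never run.
            // `continue` only skips the failed delegate.
            if (stopOnError) break else continue
        }
    }
    return executed
}
```

In a chained sequence where later delegates depend on earlier ones, stopping at the first failure avoids running them against missing input.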
alerting/build.gradle (outdated)
// Only apply jacoco test coverage if we are running a local single node cluster
if (!usingRemoteCluster && !usingMultiNode) {
    apply from: '../build-tools/opensearchplugin-coverage.gradle'
}
why are we removing this?
reverted
@@ -122,7 +125,9 @@ class AlertService(
     fun composeQueryLevelAlert(
         ctx: QueryLevelTriggerExecutionContext,
         result: QueryLevelTriggerRunResult,
-        alertError: AlertError?
+        alertError: AlertError?,
+        executionId: String? = null,
Did we want the executionId to be optional? I remember conversations about this being required to help with auditing in the future.
fixed.
made it mandatory in monitors.
@@ -218,32 +224,31 @@ object MonitorRunnerService : JobRunner, CoroutineScope, AbstractLifecycleCompon
    }

    override fun runJob(job: ScheduledJob, periodStart: Instant, periodEnd: Instant) {
Can we add comments here on why we can have only one job runner and not a separate one for workflows? We can also discuss whether, if the job scheduler plugin takes care of this, we can go back to having two job runners.
// Updating the scheduled job index at the start of monitor execution runs for when there is an upgrade the the schema mapping
// has not been updated.
if (!IndexUtils.scheduledJobIndexUpdated && monitorCtx.clusterService != null && monitorCtx.client != null) {
We need this to make sure the config index is updated before monitors/workflows execute to prevent mapping problems.
reverted. thanks.
// if the script fails we need to send an alert so set triggered = true
ChainedAlertTriggerRunResult(
    triggerName = trigger.name,
    triggered = false,
The comment says to set it to true, but it's false here.
fixed
@@ -509,7 +563,8 @@ class AlertService(
         dataSources: DataSources,
         alerts: List<Alert>,
         retryPolicy: BackoffPolicy,
-        allowUpdatingAcknowledgedAlert: Boolean = false
+        allowUpdatingAcknowledgedAlert: Boolean = false,
+        routing: String? = null // routing is mandatory and set as monitor id. for workflow chained alerts we pass workflow id as routing
Why is this optional if it's mandatory?
made routing mandatory
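As a sketch of what "mandatory" means for both `executionId` and `routing` in these threads: dropping the nullable type and the default value makes the compiler reject any call site that omits the argument. The `SketchAlert` type and `composeSketchAlert` function below are hypothetical and much simpler than the plugin's `Alert` and `AlertService`.

```kotlin
// Hypothetical, simplified alert type; not the plugin's Alert class.
data class SketchAlert(val monitorId: String, val executionId: String, val routing: String)

// No nullable types and no defaults: every caller must pass both values.
// The runtime checks additionally reject blank strings.
fun composeSketchAlert(monitorId: String, executionId: String, routing: String): SketchAlert {
    require(executionId.isNotBlank()) { "executionId must not be blank" }
    require(routing.isNotBlank()) { "routing must not be blank" }
    return SketchAlert(monitorId, executionId, routing)
}
```

A parameter declared as `String? = null` silently accepts callers that forget to pass it, which is how a value described as mandatory in a comment can still end up missing.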
@@ -580,9 +599,12 @@ class AlertService(
        }
    }
    Alert.State.AUDIT -> {
        val index = if (alertIndices.isAlertHistoryEnabled()) {
            dataSources.alertsHistoryIndex
let's create an issue for investigating how to handle migration of data when alert history setting is flipped.
created issue.
a3be79c to a543356
return Alert(
    startTime = Instant.now(),
    lastNotificationTime = Instant.now(),
    state = Alert.State.ACTIVE,
If the previous alert was active for the chained alert, do we ignore that and not extend the existing alert with more info, as query-level and bucket-level monitors do? This logic is closer to doc-level monitors, which makes less sense for chained alerts.
Will discuss this more with folks and fix this logic
But the underlying motivation was that the chained alert can be created for that particular execution and be associated with the delegate monitor alerts from that execution. We will have to see if extending an alert fits this thinking.
return RestChannelConsumer { channel ->
    val dryrun = request.paramAsBoolean("dryrun", false)
    val requestEnd = request.paramAsTime("period_end", TimeValue(Instant.now().toEpochMilli()))
Is there another param for this called period_end? If so, what exactly is it for?
Lines 62 to 63 in 0d778b8
val (periodStart, periodEnd) =
    workflow.schedule.getPeriodEndingAt(Instant.ofEpochMilli(execWorkflowRequest.requestEnd.millis))
sounds good
import org.opensearch.rest.action.RestToXContentListener

/**
 * This class consists of the REST handler to retrieve alerts .
Will this include chained alerts as well?
yes. workflow alerts are only chained alerts.
updating java doc
val queryBuilder = QueryBuilders.nestedQuery(
    TransportDeleteWorkflowAction.WORKFLOW_DELEGATE_PATH,
    QueryBuilders.boolQuery().must(
        QueryBuilders.matchQuery(
            TransportDeleteWorkflowAction.WORKFLOW_MONITOR_PATH,
Ideally the WORKFLOW_DELEGATE_PATH and WORKFLOW_MONITOR_PATH constants should be in a central constants file for workflows.
moved to utils class
f384415 to 038fc94
…gger Signed-off-by: Surya Sashank Nistala <[email protected]>
038fc94 to 54b9469
Signed-off-by: Surya Sashank Nistala <[email protected]>
…pi for fetching chained alerts - workflow alerts api Signed-off-by: Surya Sashank Nistala <[email protected]>
    workflow.inputs.isNotEmpty() && workflow.inputs[0] is CompositeInput &&
    (workflow.inputs[0] as CompositeInput).sequence.delegates.isNotEmpty()
) {
    val i = 0
i is always 0 and never incremented.
    (workflow.inputs[0] as CompositeInput).sequence.delegates.isNotEmpty()
) {
    val i = 0
    val delegates = (workflow.inputs[i] as CompositeInput).sequence.delegates
should this be: val delegates = (workflow.inputs[0] as CompositeInput).sequence.delegates ?
getResponse =
    client.suspendUntil {
        client.get(
            GetRequest(ScheduledJob.SCHEDULED_JOBS_INDEX, delegates[0].monitorId),
should this be: GetRequest(ScheduledJob.SCHEDULED_JOBS_INDEX, delegates[i].monitorId), ?
            else monitor.dataSources.alertsHistoryIndex!!
        }
    }
} catch (e: java.lang.Exception) {
nitpick: you can make this } catch (e: Exception) {
@@ -162,7 +161,7 @@ class TransportGetAlertsAction @Inject constructor(
      */
     suspend fun resolveAlertsIndexName(getAlertsRequest: GetAlertsRequest): String {
         var alertIndex = AlertIndices.ALL_ALERT_INDEX_PATTERN
-        if (!getAlertsRequest.alertIndex.isNullOrEmpty()) {
+        if (getAlertsRequest.alertIndex.isNullOrEmpty() == false) {
Why is this needed? The previous way was cleaner.
alertIndex = monitor.dataSources.alertsIndex
alertHistoryIndex =
What if multiple monitors with different data sources are part of the same workflow? This won't move all the alerts, and they will get stuck defunct.
Also this is only for alerts that are not going to be in the audit state, right? Since audit alerts should be in the history index automatically?
Workflows currently assume same datasources for all monitors
Does the workflow enforce a single data source for all monitors?
We are not doing a validation, but right now there is no real-world scenario where we add monitors with different data sources, because:
- REST APIs don't support data sources, and monitors are populated with the alerting system indices
- Security Analytics detectors have one data source each and won't intermix
Please add a github issue to ensure workflows have monitors of a single data source only
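One possible shape for the validation requested here, as a minimal sketch only. `SketchDataSources`, `SketchMonitor` and `validateSingleDataSource` are hypothetical names, and the plugin's real `DataSources` class carries more fields than the two shown.

```kotlin
// Hypothetical, simplified types; not the plugin's Monitor or DataSources classes.
data class SketchDataSources(val alertsIndex: String, val alertsHistoryIndex: String)
data class SketchMonitor(val id: String, val dataSources: SketchDataSources)

// Throws IllegalArgumentException when the delegate monitors of a workflow
// do not all share the same data source configuration.
fun validateSingleDataSource(monitors: List<SketchMonitor>) {
    val distinctSources = monitors.map { it.dataSources }.distinct()
    require(distinctSources.size <= 1) {
        "Workflow delegates use ${distinctSources.size} different data sources; expected exactly one"
    }
}
```

Running such a check when a workflow is created or updated would turn the current assumption into an enforced rule, so the alert mover could rely on a single alerts index per workflow.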
Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #976 +/- ##
============================================
- Coverage 74.67% 73.63% -1.04%
- Complexity 111 113 +2
============================================
Files 144 158 +14
Lines 8362 9326 +964
Branches 1224 1365 +141
============================================
+ Hits 6244 6867 +623
- Misses 1499 1764 +265
- Partials 619 695 +76
launch {
    try {
        monitorCtx.moveAlertsRetryPolicy!!.retry(logger) {
            moveAlerts(monitorCtx.client!!, job.id, job, monitorCtx)
We need to check whether the alerts index is initialized when the monitor is not using a different data source. Move alerts only checks whether the data source's alerts index is initialized, if there is a data source.
datasource is always there
for monitors created directly by alerting plugin through rest api or dashboards we populate them with defaults
common-utils-3.0.0.0-SNAPSHOT.jar (outdated)
remove this
ack
Signed-off-by: Surya Sashank Nistala <[email protected]>
@@ -123,7 +130,9 @@ class AlertService( | |||
fun composeQueryLevelAlert( |
Should we combine these Alert compositions into a single logic for common pieces to reduce duplicate code? Not a PR blocker - we can add to the backlog in META
the logic to save an alert is common across all monitors.
this is the logic to create a new alert which IMO should be in different methods and doesn't need a change as they all use different constructors to set their corresponding set of fields in alerts
@@ -554,9 +609,12 @@ class AlertService(
        }
    }
    Alert.State.AUDIT -> {
        val index = if (alertIndices.isAlertHistoryEnabled()) {
            dataSources.alertsHistoryIndex
Can flipping the alert history index enablement hinder the ability to read the history data during upgrades?
yes that problem exists. we have an issue created to deal with it
 * 3. Delete alerts from monitor's DataSources.alertsIndex
 * 4. Schedule a retry if there were any failures
 */
suspend fun moveAlerts(client: Client, workflowId: String, workflow: Workflow?, monitorCtx: MonitorRunnerExecutionContext) {
Can we ensure we are consistent and not adding regressions during version upgrades? Two important things to consider:
- Serialization/De-Ser during node to node interaction (with old and new version of the plugin) is not breaking and new version is backward compatible.
- Data (History, Config and System/Hidden indices) are read/updated across versions without causing regressions.
Yes. There won't be regressions.
- The moveAlerts method is always called within a try-catch block and will not cause any issue with cross-version Serialization/De-Ser.
- We have added workflows as a new resource and have not edited any current resource.
- All fields related to workflows have been added as optional fields in existing resources such as monitors, alerts, and findings, and we have tests for alerts to verify that the current version parses old-version alerts.
Thanks. Can we ensure all transport interactions also adhere to this and are backward compatible? This is to ensure no regressions due to a heterogeneous fleet during upgrades.
 * Solves the Trigger Expression using the Reverse Polish Notation (RPN) based solver
 * @param polishNotation an array of expression tokens organized in the RPN order
 */
class ChainedAlertRPNResolver(
Wondering why we did not extend on the current TriggerExpressionResolver
I had started out to do that, but midway realized I had a much simpler use case which relied on RPN resolution and needed a different method too.
We can explore extracting common logic out, but the method params were a little different, and I can do a follow-up. Test classes have been added to verify the chained alerts parsing.
Thanks. Can we add this to our backlog in the meta?
Multi nodes tests are running on my local
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-976-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3911ca91881888c0a823cb684817378d0522d5c1
# Push it to GitHub
git push --set-upstream origin backport/backport-976-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x
Then, create a pull request where the
The backport to
To backport manually, run these commands in your terminal:
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.9 2.9
# Navigate to the new working tree
cd .worktrees/backport-2.9
# Create a new branch
git switch --create backport/backport-976-to-2.9
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 3911ca91881888c0a823cb684817378d0522d5c1
# Push it to GitHub
git push --set-upstream origin backport/backport-976-to-2.9
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.9
Then, create a pull request where the
* chained alert triggers
* converge all single node test cases
* add license headers to files
* fix workflow not found issue
* added audit state alerts for doc level monitors
* add audit alerts in query level monitor
* temp: upload custom built common utils jar
* fix get monitor response parsing to include associated_workflows
* add query level monitor audit alerts tests
* add audit alerts in bucket level monitor
* fix workflow tests
* alerting
* verify bucket monitor audit alerts and chained alerts in workflow
* make execution id mandatory
* revert mapping update in run job method
* minor fixes in chained alert trigger result
* fix chained alert triggers tests
* fix acknowledge chained alert bug
* revert get alerts change
* refactor and remove transport actions being invoked in other transport actions
* add license header
* scheduled job mapping schema
* fix ktlint and revert gradle dev set up chanegs
* fix post delete method and refactor alert mover to add class level logger
* fix test - pass workflow id in get alerts
* remove monitor empty filter in get alerts api as there is dedicated api for fetching chained alerts - workflow alerts api
* fix check for workflow id is empty or null in get alerts action
* fix alert mover method delegate monitor parsing logic
* remove common utils jar from repo
---------
Signed-off-by: Surya Sashank Nistala <[email protected]>
* Adds chained alerts (#976)
  * chained alert triggers
  * converge all single node test cases
  * add license headers to files
  * fix workflow not found issue
  * added audit state alerts for doc level monitors
  * add audit alerts in query level monitor
  * temp: upload custom built common utils jar
  * fix get monitor response parsing to include associated_workflows
  * add query level monitor audit alerts tests
  * add audit alerts in bucket level monitor
  * fix workflow tests
  * alerting
  * verify bucket monitor audit alerts and chained alerts in workflow
  * make execution id mandatory
  * revert mapping update in run job method
  * minor fixes in chained alert trigger result
  * fix chained alert triggers tests
  * fix acknowledge chained alert bug
  * revert get alerts change
  * refactor and remove transport actions being invoked in other transport actions
  * add license header
  * scheduled job mapping schema
  * fix ktlint and revert gradle dev set up chanegs
  * fix post delete method and refactor alert mover to add class level logger
  * fix test - pass workflow id in get alerts
  * remove monitor empty filter in get alerts api as there is dedicated api for fetching chained alerts - workflow alerts api
  * fix check for workflow id is empty or null in get alerts action
  * fix alert mover method delegate monitor parsing logic
  * remove common utils jar from repo
* fix imports
---------
Signed-off-by: Surya Sashank Nistala <[email protected]>
(cherry picked from commit d2d03c6)
Co-authored-by: Surya Sashank Nistala <[email protected]>
Adds chained alerts - alerts created when chained alert trigger conditions in workflows are satisfied.
We use the Painless scripting language to define chained alert conditions.
Examples
monitor[id=1] && (!monitor[id=2] || monitor[id=3])
monitor[id=1] && monitor[id=2]
monitor[id=1] || monitor[id=2]
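To make the boolean semantics of these conditions concrete, here is a minimal standalone sketch that converts a condition to Reverse Polish Notation and evaluates it against the set of delegate monitor ids that generated alerts. It is an illustration only, not the plugin's ChainedAlertRPNResolver and independent of Painless. The function names `toRpn` and `evaluateRpn`, and the assumption that monitor ids contain only word characters and hyphens, are hypothetical.

```kotlin
// Hypothetical tokenizer: monitor references, &&, ||, ! and parentheses.
val TOKEN = Regex("""monitor\[id=([\w-]+)\]|&&|\|\||!|\(|\)""")

fun precedence(op: String): Int = when (op) {
    "!" -> 3
    "&&" -> 2
    "||" -> 1
    else -> 0
}

// Converts an infix condition into RPN using the shunting-yard algorithm.
// Operands are emitted as bare monitor ids.
fun toRpn(expression: String): List<String> {
    val output = mutableListOf<String>()
    val operators = ArrayDeque<String>()
    for (match in TOKEN.findAll(expression)) {
        when (val token = match.value) {
            "(" -> operators.addLast(token)
            ")" -> {
                while (operators.isNotEmpty() && operators.last() != "(") {
                    output += operators.removeLast()
                }
                check(operators.isNotEmpty()) { "Unbalanced parentheses" }
                operators.removeLast() // discard the matching "("
            }
            "!" -> operators.addLast(token) // unary prefix operator
            "&&", "||" -> {
                while (operators.isNotEmpty() && operators.last() != "(" &&
                    precedence(operators.last()) >= precedence(token)
                ) {
                    output += operators.removeLast()
                }
                operators.addLast(token)
            }
            else -> output += match.groupValues[1] // monitor id
        }
    }
    while (operators.isNotEmpty()) {
        val op = operators.removeLast()
        check(op != "(") { "Unbalanced parentheses" }
        output += op
    }
    return output
}

// Evaluates the RPN tokens; a monitor id is true when that monitor generated alerts.
fun evaluateRpn(rpn: List<String>, alertingMonitorIds: Set<String>): Boolean {
    val stack = ArrayDeque<Boolean>()
    for (token in rpn) {
        when (token) {
            "!" -> stack.addLast(!stack.removeLast())
            "&&" -> {
                val right = stack.removeLast()
                val left = stack.removeLast()
                stack.addLast(left && right)
            }
            "||" -> {
                val right = stack.removeLast()
                val left = stack.removeLast()
                stack.addLast(left || right)
            }
            else -> stack.addLast(token in alertingMonitorIds)
        }
    }
    check(stack.size == 1) { "Malformed expression" }
    return stack.last()
}
```

For the first example, the condition holds when monitor 1 generated alerts and either monitor 2 did not or monitor 3 did.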
The PR also introduces the logic for AUDIT state alerts: