Exclude outliers from training in volume tests #1589

Open
bebbo203 opened this issue Jul 4, 2024 · 1 comment

bebbo203 commented Jul 4, 2024

Is your feature request related to a problem? Please describe.
When performing volume tests, a sudden spike (or drop) in a single bucket can completely change the training, increasing the likelihood of false negatives.

Describe the solution you'd like
Points that have generated a failure should be excluded from the training set
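
To make the request concrete, here is a minimal sketch of the effect. It assumes a plain z-score detector with a sensitivity of 3; the numbers, the `z_score` helper and the `SENSITIVITY` constant are made up for illustration and are not taken from Elementary's code.

```python
from statistics import mean, stdev

def z_score(training, value):
    """Z-score of `value` against the mean and sample stddev of `training`."""
    return (value - mean(training)) / stdev(training)

# Hypothetical daily row counts, followed by one spike that failed the test.
baseline = [100, 102, 98, 101, 99, 100]
spike = 500
next_point = 300   # a later bucket that is still clearly abnormal
SENSITIVITY = 3.0  # assumed detection threshold

# Training on everything, spike included: mean and stddev are inflated.
z_with_spike = z_score(baseline + [spike], next_point)

# Training with the failed point excluded, as proposed above.
z_without_spike = z_score(baseline, next_point)

missed = abs(z_with_spike) < SENSITIVITY      # the anomaly goes undetected
caught = abs(z_without_spike) >= SENSITIVITY  # the anomaly is flagged

print(f"with spike in training:    z = {z_with_spike:.2f}, flagged = {not missed}")
print(f"without spike in training: z = {z_without_spike:.2f}, flagged = {caught}")
```

With the spike left in the training set, the later abnormal bucket scores below the threshold. With the spike excluded, it is flagged.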

Describe alternatives you've considered
By setting anomaly_exclude_metrics correctly, it is possible to achieve the same effect, but the excluded point is then no longer shown in the dashboard.
Example:
anomaly_exclude_metrics: ((metric_value - AVG(metric_value) OVER (partition by metric_name, full_table_name, column_name, dimension, dimension_value order by bucket_end asc rows between unbounded preceding and current row)) / STDDEV(metric_value) OVER (partition by metric_name, full_table_name, column_name, dimension, dimension_value order by bucket_end asc rows between unbounded preceding and current row)) >= 2
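
For readers who want to see what the expression above computes, here is a rough Python equivalent for a single partition. It is only a sketch: it assumes STDDEV is the sample standard deviation and that a NULL or zero standard deviation means the point is kept. The function names are hypothetical.

```python
from statistics import mean, stdev

def running_z_scores(values):
    """For each point, its z-score against all points up to and including it.

    Mirrors the window "rows between unbounded preceding and current row".
    Returns None where the score is undefined (one point, or zero stddev).
    """
    scores = []
    for i in range(len(values)):
        window = values[: i + 1]
        if len(window) < 2:
            scores.append(None)  # stddev is undefined for a single row
            continue
        sd = stdev(window)
        scores.append(None if sd == 0 else (values[i] - mean(window)) / sd)
    return scores

def split_training(values, threshold=2.0):
    """Return (training_values, excluded_flags) using the `>= threshold` rule."""
    scores = running_z_scores(values)
    excluded = [z is not None and z >= threshold for z in scores]
    training = [v for v, ex in zip(values, excluded) if not ex]
    return training, excluded

# Values ordered by bucket_end; the last bucket is a spike.
values = [100, 102, 98, 101, 99, 100, 500]
training, excluded = split_training(values)
print(training)
print(excluded)
```

Note that the expression has no ABS, so as written it only excludes spikes. A drop gives a negative score and stays in the training set.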

Additional context
image

The top graph is the one with the modified anomaly_exclude_metrics. Ideally, it would show all the data points and exclude only the Jul 2 point from the training.

Would you be willing to contribute this feature?
I don't have much time at the moment, but it could be possible in the future.

@haritamar
Collaborator

Hi @bebbo203, thanks for opening this issue.
Totally makes sense and it's definitely something we're considering. anomaly_exclude_metrics is a workaround but definitely not the ideal one.
