-
Notifications
You must be signed in to change notification settings - Fork 36
Validation API Backend for AD everywhere #221
base: main
Are you sure you want to change the base?
Conversation
… calling randomsampling once internally
Newbranch validate 1.9
Codecov Report
@@ Coverage Diff @@
## master #221 +/- ##
============================================
- Coverage 72.41% 68.34% -4.08%
- Complexity 1290 1324 +34
============================================
Files 139 145 +6
Lines 6073 6593 +520
Branches 469 518 +49
============================================
+ Hits 4398 4506 +108
- Misses 1464 1874 +410
- Partials 211 213 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Unit test coverage drops and cause workflow to fail. Is it possible to write unit tests to cover it? Or can you let the workflow count your IT tests to the new coverage? |
more unit tests added for models
uiMetadata, | ||
schemaVersion, | ||
lastUpdateTime, | ||
true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the only place that the method is different from the parse method?
protected static final String AGG_NAME_MAX = "max_timefield"; | ||
protected static final int NUM_OF_INTERVAL_SAMPLES = 128; | ||
protected static final int MAX_NUM_OF_SAMPLES_VIEWED = 128; | ||
protected static final int NUM_OF_INTERVALS_CHECKED = 256; | ||
protected static final double SAMPLE_SUCCESS_RATE = 0.75; | ||
protected static final int FEATURE_VALIDATION_TIME_BACK_MINUTES = 10080; | ||
protected static final int NUM_OF_INTERVALS_CHECKED_FILTER = 384; | ||
protected static final long MAX_INTERVAL_LENGTH = (30L * 24 * 60 * 60 * 1000); | ||
protected static final long HISTORICAL_CHECK_IN_MS = (90L * 24 * 60 * 60 * 1000); | ||
protected static final String NAME_REGEX = "[a-zA-Z0-9._-]+"; | ||
protected static final double INTERVAL_RECOMMENDATION_MULTIPLIER = 1.2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add documentation on how the numbers are determined?
} | ||
String error = RestHandlerUtils.validateAnomalyDetector(anomalyDetector, maxAnomalyFeatures); | ||
if (StringUtils.isNotBlank(error)) { | ||
List<String> dupErrorsFeatures = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error returned by RestHandlerUtils.validateAnomalyDetector is not just about duplicate feature. Can we rename related variable and enum?
|
||
private void checkADNameExists() { | ||
BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder(); | ||
boolQueryBuilder.must(QueryBuilders.termQuery("name.keyword", anomalyDetector.getName())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have used the constant "name.keyword" in src/main/java/com/amazon/opendistroforelasticsearch/ad/rest/handler/IndexAnomalyDetectorActionHandler.java as well. Can you declare a variable in sth like com.amazon.opendistroforelasticsearch.ad.constant.CommonName and reference it?
.aggregation(AggregationBuilders.max(AGG_NAME_MAX).field(anomalyDetector.getTimeField())) | ||
.size(1) | ||
.sort(new FieldSortBuilder(anomalyDetector.getTimeField()).order(SortOrder.DESC)); | ||
SearchRequest searchRequest = new SearchRequest().indices(anomalyDetector.getIndices().get(0)).source(searchSourceBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we have multiple indices?
Map<String, List<Map<String, ?>>> suggestionsMap = (Map<String, List<Map<String, ?>>>) XContentMapValues | ||
.extractValue("suggestedChanges", responseMap); | ||
assertTrue(failuresMap.keySet().size() == 1); | ||
assertTrue(failuresMap.containsKey("others")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also check the value?
.extractValue("suggestedChanges", responseMap); | ||
assertTrue(failuresMap.keySet().size() == 0); | ||
assertTrue(suggestionsMap.keySet().size() == 1); | ||
assertTrue(suggestionsMap.containsKey("detection_interval")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also check the value?
.extractValue("suggestedChanges", responseMap); | ||
assertTrue(failuresMap.keySet().size() == 0); | ||
assertTrue(suggestionsMap.keySet().size() == 1); | ||
assertTrue(suggestionsMap.containsKey("feature_attributes")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we check the value?
XContentBuilder xContentBuilder = builder.startObject(); | ||
|
||
xContentBuilder.startObject("failures"); | ||
for (String key : failures.keySet()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if failures or suggestedChanges are null?
.xContent() | ||
.createParser(xContent, LoggingDeprecationHandler.INSTANCE, feature.getAggregation().toString()); | ||
parser.nextToken(); | ||
List<String> fieldNames = parseAggregationRequest(parser); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if fieldNames's size is 0?
src/main/java/com/amazon/opendistroforelasticsearch/ad/rest/handler/ValidateAnomalyDetectorActionHandler.java may require more unit tests. |
Description of changes:
Added a new API endpoint to AD in order to validate anomaly detector configurations and return back if any failures or suggestions occurred. The validation API takes an input of detector configurations and then validates it against both general checks and against the data source to know if the configurations will likely lead to successful anomaly detector creation.
Endpoint: Post _opendistro/_anomaly_detection/detectors/_validate
Input : Anomaly Detector configurations
image
Overall the validation checks if any fields are missing, if any data can be found with the given filter query and feature query. I also check whether the feature aggregation type is valid for the field type. The last two steps of the validation, checks if the detector interval used will produce data that is dense enough, recommending a different interval if this isn't the case in milliseconds. The API also checks for the last seen data point and returns it as the window delay recommendation.
Example input for detector interval that is too short :
Output:
Testing Done:
Integration testing added
manually tested that API works with both dense and sparse sample data and some of the tms-issue data.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.