-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CDAP-16739] Added hard limit for Avro data file, proper error messag… #406
base: develop
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
/* | ||
* Copyright © 2017-2019 Cask Data, Inc. | ||
* Copyright © 2017-2020 Cask Data, Inc. | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); you may not | ||
* use this file except in compliance with the License. You may obtain a copy of | ||
|
@@ -52,6 +52,40 @@ public void testParseAsAvroFile() throws Exception { | |
Assert.assertEquals(1495194308245L, results.get(1688).getValue("timestamp")); | ||
} | ||
|
||
@Test | ||
public void testAvroWithJsonFieldParsingWithParseJsonIgnoreError() throws Exception { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. these new tests are testing two directives. More specifically, these are not testing any change in behavior in the avro directive, they are just testing the changes in json directive. Unit tests should be testing isolated components when possible. It would be better to enhance the parse json test. For example, if we remove the parse avro file directive, we could easily remove test coverage of the parse json directive accidentally. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was a problem customer had. We have other tests for catching parse-as-json problems. This was when a json in avro file have. Using redacted user data as test. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. and this is a unit test, meaning it should be testing very specific parts of the code. With the ignore-error changed introduced, it should be testing that the json directive returns empty results when given bad data and ignore-error is true. |
||
InputStream stream = ParseAvroFileTest.class.getClassLoader().getResourceAsStream("bad.avro"); | ||
byte[] data = IOUtils.toByteArray(stream); | ||
|
||
String[] directives = new String[] { | ||
"parse-as-avro-file body", | ||
"parse-as-json :message_contents 2 true" | ||
}; | ||
|
||
List<Row> rows = new ArrayList<>(); | ||
rows.add(new Row("body", data)); | ||
|
||
List<Row> results = TestingRig.execute(directives, rows); | ||
Assert.assertEquals(6693, results.size()); // total 6670, 7 records are bad. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. these types of asserts are not specific enough. For example, how can we be sure the right rows are filtered? It would be better to just send a single row in as input. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Using customer test data to make sure we don't break it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this doesn't make sure of anything besides the size |
||
} | ||
|
||
@Test | ||
public void testAvroWithJsonFieldParsingWithParseJsonIgnoreErrorFail() throws Exception { | ||
InputStream stream = ParseAvroFileTest.class.getClassLoader().getResourceAsStream("bad.avro"); | ||
byte[] data = IOUtils.toByteArray(stream); | ||
|
||
String[] directives = new String[] { | ||
"parse-as-avro-file body", | ||
"parse-as-json :message_contents 2" | ||
}; | ||
|
||
List<Row> rows = new ArrayList<>(); | ||
rows.add(new Row("body", data)); | ||
|
||
List<Row> results = TestingRig.execute(directives, rows); | ||
Assert.assertEquals(0, results.size()); // total 6670, 7 records are bad. | ||
} | ||
|
||
@Test(expected = RecipeException.class) | ||
public void testIncorrectType() throws Exception { | ||
String[] directives = new String[] { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't rows always size 1? Otherwise throwing an ErrorRowException would incorrectly mark every Row as an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not always. If we parsing AvroFile. Parse-as-avro-file will generate records > 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, but the RecipeExecutor will break that into multiple lists of size 1. Either this is always size 1, or the entire ErrorRowException contract is wrong.