-
Notifications
You must be signed in to change notification settings - Fork 345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added logic to control which stages must be pushed down or skipped from SQL engine execution. #14061
Conversation
6faf269
to
d89961f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests
protected Iterable<SparkCollectionRelationalEngine> getRelationalEngines(SparkCollection<Object> stageData) { | ||
if (sqlEngineAdapter == null || !sqlEngineAdapter.supportsRelationalTranform() | ||
|| !(stageData instanceof SQLBackedCollection)) { | ||
protected Iterable<SparkCollectionRelationalEngine> getRelationalEngines(String stageName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there cases where stageSpec.getName() != stageName
? Please add new params to javadoc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually no, there are not. Let me update the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
|
||
// If this stage is not pushed down and it's not in the included stages, we can skip relational transformation | ||
// in the SQL engine. | ||
if (!(stageData instanceof SQLBackedCollection || shouldForcePushToSQLEngine(stageName))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I am wondering if it would be more readable as !A && !B
vs !(A || B)
. It would more directly correspond to the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
d89961f
to
8bfd279
Compare
@tivv I've added tests. |
8bfd279
to
7cef84c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall good, could you please also support "excluded" for io.cdap.cdap.etl.spark.batch.SQLBackedCollection#tryStoreDirect ? It would allow to skiip it in case of any issues
…om SQL engine execution. Added additional methods to ensure that input and output schemas for stages are supported by the SQL engine. This is useful when verifying if all input and output types in an aggregation are supported by the SQL engine.
7cef84c
to
3ee9301
Compare
I've added logic to check if the sink stage is in the "Excluded" list before attempting to write to the sink. |
Added additional methods to ensure that input and output schemas for stages are supported by the SQL engine. This is useful when verifying if all input and output types in an aggregation are supported by the SQL engine.