Updates for changes in DataFusion #9
I tried compiling the latest datafusion-cli for testing and found the following output:
> SELECT - 69 FROM tab0 WHERE + - col1 NOT IN ( - + col1, CAST ( + col0 AS INTEGER ), + + ( + + col1 ) / + - 0 * + col0 );
+------------+
| Int64(-69) |
+------------+
+------------+
0 row(s) fetched.
> explain SELECT - 69 FROM tab0 WHERE + - col1 NOT IN ( - + col1, CAST ( + col0 AS INTEGER ), + + ( + + col1 ) / + - 0 * + col0 );
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type | plan |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan | Projection: Int64(-69) |
| | Projection: |
| | Filter: __common_expr_4 != __common_expr_4 AND __common_expr_4 != CAST(CAST(tab0.col0 AS Int32) AS Int64) AND __common_expr_4 != CAST(tab0.col1 AS Int64) / Int64(0) * CAST(tab0.col0 AS Int64) |
| | Projection: CAST((- tab0.col1) AS Int64) AS __common_expr_4, tab0.col0, tab0.col1 |
| | TableScan: tab0 projection=[col0, col1] |
| physical_plan | ProjectionExec: expr=[-69 as Int64(-69)] |
| | ProjectionExec: expr=[] |
| | CoalesceBatchesExec: target_batch_size=8192 |
| | FilterExec: __common_expr_4@0 != __common_expr_4@0 AND __common_expr_4@0 != CAST(col0@1 AS Int64) AND __common_expr_4@0 != CAST(col1@2 AS Int64) / 0 * CAST(col0@1 AS Int64) |
| | ProjectionExec: expr=[CAST((- col1@1) AS Int64) as __common_expr_4, col0@0 as col0, col1@1 as col1] |
| | DataSourceExec: partitions=1, partition_sizes=[0] |
| | |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Based on the plan, this is probably because the expression involving division by zero was short-circuited, so zero rows are output. This raises a question: if the operator below the projection produces no rows, should the constant returned by the projection still be output? Or could this kind of SQL be optimized during planning so that only the projection remains? I tried the same query in DuckDB and it also returned an empty result, so this may be expected behavior.
If you are asking whether a select list containing only constants should still output a row when the WHERE clause filters out all rows, I think the answer is no. I agree this is inconsistent with the case where there is no WHERE clause or FROM clause (and so no input rows), in which the select list generates a single row 🤷
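To make the distinction concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in engine. This is an assumption for illustration only: it is not DataFusion or DuckDB, the rows inserted into `tab0` are hypothetical, and the query is simplified to leave out the division by zero. It shows that a constant select list emits one row per input row, so a filter that removes every row yields an empty result, while a SELECT with no FROM clause yields exactly one row.

```python
import sqlite3


def constant_select_demo():
    """Compare a constant projection over filtered, unfiltered, and absent input."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab0 (col0 INTEGER, col1 INTEGER)")
    # Hypothetical sample rows; the real test data is not shown in the thread.
    conn.executemany("INSERT INTO tab0 VALUES (?, ?)", [(1, 10), (2, 20)])

    # "x NOT IN (x, ...)" can never be true, so the filter removes every row
    # and the constant projection has nothing to emit.
    filtered = conn.execute(
        "SELECT -69 FROM tab0 WHERE -col1 NOT IN (-col1, col0)"
    ).fetchall()

    # Without a filter, the constant is emitted once per input row.
    unfiltered = conn.execute("SELECT -69 FROM tab0").fetchall()

    # Without a FROM clause, the select list produces a single row.
    no_from = conn.execute("SELECT -69").fetchall()

    conn.close()
    return filtered, unfiltered, no_from


if __name__ == "__main__":
    filtered, unfiltered, no_from = constant_select_demo()
    print("filtered:  ", filtered)
    print("unfiltered:", unfiltered)
    print("no FROM:   ", no_from)
```

In this sketch the first list in the filtered query already contains the tested value itself, which matches the `__common_expr_4 != __common_expr_4` conjunct in the plan above: that comparison alone is enough to reject every row.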
Thank you @Omega359 and @acking-you
DataFusion issue: apache/datafusion#15641
Related to: apache/datafusion#15462
I've run the regenerate script against the latest code. To be honest, I am unsure whether the results in the diff are actually correct: the changes appear to allow a few more queries to pass, but those queries return no results. Someone should manually investigate these to confirm they are correct and, if they are not, file a ticket in DataFusion to have them fixed.