
Updates for changes in DataFusion #9


Merged
merged 1 commit into from
Apr 9, 2025


Omega359
Contributor

@Omega359 Omega359 commented Apr 8, 2025

DataFusion issue: apache/datafusion#15641

Related to: apache/datafusion#15462

I've run the regenerate script against the latest code. To be honest, I am unsure whether the results in the diff are actually correct - it looks like the changes allowed a few queries to pass but the query has no results? I think someone should manually investigate these to confirm their correctness and, if they are incorrect, file a ticket in DataFusion to have them fixed.

@acking-you

acking-you commented Apr 9, 2025

it looks like the changes allowed a few queries to pass but the query has no results?

I tried compiling the latest datafusion-cli for testing and found the following output:

> SELECT - 69 FROM tab0 WHERE + - col1 NOT IN ( - + col1, CAST ( + col0 AS INTEGER ), + + ( + + col1 ) / + - 0 * + col0 );
+------------+
| Int64(-69) |
+------------+
+------------+
0 row(s) fetched. 

> explain SELECT - 69 FROM tab0 WHERE + - col1 NOT IN ( - + col1, CAST ( + col0 AS INTEGER ), + + ( + + col1 ) / + - 0 * + col0 );
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                                                                                                |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Projection: Int64(-69)                                                                                                                                                                              |
|               |   Projection:                                                                                                                                                                                       |
|               |     Filter: __common_expr_4 != __common_expr_4 AND __common_expr_4 != CAST(CAST(tab0.col0 AS Int32) AS Int64) AND __common_expr_4 != CAST(tab0.col1 AS Int64) / Int64(0) * CAST(tab0.col0 AS Int64) |
|               |       Projection: CAST((- tab0.col1) AS Int64) AS __common_expr_4, tab0.col0, tab0.col1                                                                                                             |
|               |         TableScan: tab0 projection=[col0, col1]                                                                                                                                                     |
| physical_plan | ProjectionExec: expr=[-69 as Int64(-69)]                                                                                                                                                            |
|               |   ProjectionExec: expr=[]                                                                                                                                                                           |
|               |     CoalesceBatchesExec: target_batch_size=8192                                                                                                                                                     |
|               |       FilterExec: __common_expr_4@0 != __common_expr_4@0 AND __common_expr_4@0 != CAST(col0@1 AS Int64) AND __common_expr_4@0 != CAST(col1@2 AS Int64) / 0 * CAST(col0@1 AS Int64)                  |
|               |         ProjectionExec: expr=[CAST((- col1@1) AS Int64) as __common_expr_4, col0@0 as col0, col1@1 as col1]                                                                                         |
|               |           DataSourceExec: partitions=1, partition_sizes=[0]                                                                                                                                         |
|               |                                                                                                                                                                                                     |
+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Based on the plan, this is probably because the expression involving division by zero was short-circuited, so zero rows are output.

However, there is a question here: If the downstream operator does not output any data, should the constant directly returned by the projection be output normally?

Or could this kind of query be optimized so that only the projection remains after the planning phase?

I tried the same query in DuckDB, and it also returned an empty result, so this might be expected behavior.
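
The empty result can also be read off the predicate itself: the first conjunct in the filter, `__common_expr_4 != __common_expr_4`, compares a value with itself, so it can never be true regardless of what the division by zero evaluates to. A minimal sketch of that reasoning follows. It assumes SQLite (through Python's standard library) as a stand-in engine, not DataFusion or DuckDB. The rows in tab0 are made up for illustration, and the query is a simplified form of the one above with the redundant unary pluses dropped.

```python
import sqlite3

# Assumption: SQLite stands in for the engine; these rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab0 (col0 INTEGER, col1 INTEGER)")
conn.executemany("INSERT INTO tab0 VALUES (?, ?)", [(1, 10), (2, 20), (3, 30)])

# Simplified form of the query from the thread.
query = """
    SELECT -69 FROM tab0
    WHERE -col1 NOT IN (-col1, CAST(col0 AS INTEGER), col1 / -0 * col0)
"""
rows = conn.execute(query).fetchall()

# -col1 always matches the first list element, so NOT IN is never true.
# Counting rows where a value differs from itself makes the same point.
not_equal_self = conn.execute(
    "SELECT count(*) FROM tab0 WHERE -col1 != -col1"
).fetchone()[0]

print(rows)            # no rows pass the filter
print(not_equal_self)  # no row differs from itself
```

Because the left-hand side also appears in the NOT IN list, no row can pass the filter whatever data tab0 holds.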

@alamb
Contributor

alamb commented Apr 9, 2025

However, there is a question here: If the downstream operator does not output any data, should the constant directly returned by the projection be output normally?

If you mean: when the WHERE clause filters out all rows, should a select list that has only constants still output a row? I think the answer is no.

I agree this is inconsistent with the case where there isn't any WHERE clause or FROM clause (and so no input rows), in which the select list generates a single row 🤷
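
The two cases contrasted here can be checked side by side. This is a minimal sketch that assumes SQLite (via Python's standard library) as a stand-in engine rather than DataFusion itself, with a hypothetical one-row tab0:

```python
import sqlite3

# Assumption: SQLite stands in for the engine; the table content is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab0 (col0 INTEGER)")
conn.execute("INSERT INTO tab0 VALUES (1)")

# No FROM clause: the constant select list yields exactly one row.
no_from = conn.execute("SELECT -69").fetchall()

# FROM plus a WHERE that rejects every row: the constant is never emitted.
all_filtered = conn.execute(
    "SELECT -69 FROM tab0 WHERE col0 != col0"
).fetchall()

print(no_from)       # [(-69,)]
print(all_filtered)  # []
```

In this sketch the select list is evaluated once per surviving input row, which is why a constant-only projection over zero rows produces nothing.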

Contributor

@alamb alamb left a comment


Thank you @Omega359 and @acking-you
