Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CTEs in sub-query reducer #1504

Open
wants to merge 3 commits into
base: p--cte--08
Choose a base branch
from
Open

Support CTEs in sub-query reducer #1504

wants to merge 3 commits into from

Conversation

plypaul
Copy link
Contributor

@plypaul plypaul commented Nov 4, 2024

This updates the sub-query reducer to support CTEs. The SELECT statement in each CTE is treated like a SELECT, and all CTEs are retained. i.e. a CTE is not reduced with another CTE. The CTEs are retained to make mapping to dataflow plan nodes easier in later PRs.

Previously, "parent nodes" was used for either nodes in the FROM clause or the
JOIN clause. With the addition of CTEs that are also considered parent nodes,
this renames a few variables / updates some conditionals for clarity.
@cla-bot cla-bot bot added the cla:yes label Nov 4, 2024
@plypaul plypaul changed the title Update sub-query reducer to support CTEs. Update sub-query reducer to support CTEs Nov 4, 2024
@plypaul plypaul changed the title Update sub-query reducer to support CTEs Support CTEs in sub-query reducer Nov 4, 2024
@plypaul plypaul marked this pull request as ready for review November 5, 2024 21:34
@@ -196,7 +199,7 @@ def _is_simple_source(node: SqlSelectStatementNode) -> bool:
if select_column.expr.lineage.contains_aggregate_exprs:
return False
return (
len(node.parent_nodes) <= 1
len(node.join_descs) == 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a behavior change - were there changes to the snapshots from this change alone?

@@ -612,10 +615,11 @@ def visit_select_statement_node(self, node: SqlSelectStatementNode) -> SqlQueryP
# JOIN dim_listings c
# ON a.listing_id = b.listing_id

assert len(node_with_reduced_parents.parent_nodes) == 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason to remove this assertion?

@@ -579,6 +578,13 @@ def _rewrite_node_with_join(node: SqlSelectStatementNode) -> SqlSelectStatementN
select_columns=tuple(clauses_to_rewrite.select_columns),
from_source=from_source,
from_source_alias=from_source_alias,
cte_sources=tuple(
SqlCteNode.create(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use cte_source.with_new_select?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants