Skip to content

Provide documentation of expose APIs to enable handling of type coercion at UNION plan construction. #12142

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 26, 2024

Conversation

wiedld
Copy link
Contributor

@wiedld wiedld commented Aug 23, 2024

Which issue does this PR close?

Closes #12105

Rationale for this change

We construct our own logical plans for a SQL-derivative languages (e.g. InfluxQL). The construction of logic plans using UNION requires that upon construction of the union plan node, we type-coerce the expressions prior to building the rest of the plan. If we don't perform this expression rewrite, then our consuming LIMIT, GROUP BY, ORDER BY, and aggregates will not be built properly.

We do not think our use case is unique, and therefore we are (a) exposing the APIs need to perform this coercion and (b) providing docs.

What changes are included in this PR?

  • Expose the expression rewriter.
  • expose helper method coerce_union_schema
  • provide docs delineating the different type-coercion interfaces:
    • TypeCoercion
    • vs TypeCoercionRewriter
    • vs helper coerce_union_schema.
  • update the union() API, which does the logical plan construction , as to the how/when/why to use type coercion.

Are these changes tested?

No code changes.

Are there any user-facing changes?

No. Only more exposed interfaces to use.

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels Aug 23, 2024
@wiedld
Copy link
Contributor Author

wiedld commented Aug 23, 2024

Note to @alamb -- exposing the rewriter api (as suggested here), enables the union type-coercion with less of the schema shenangians and overhead (vs the full analyzer). Gives the user options.

@wiedld wiedld marked this pull request as ready for review August 23, 2024 21:43
/// or alternatively, merge the union input schema using [`coerce_union_schema`] and
/// apply the expression rewrite with [`coerce_plan_expr_for_schema`].
///
/// [`TypeCoercionRewriter::coerce_union`]: https://docs.rs/datafusion-optimizer/latest/datafusion_optimizer/analyzer/type_coercion/struct.TypeCoercionRewriter.html#method.coerce_union
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @wiedld
There are couple of projects using DF without optimizer, so I'm thinking should we add the coerced union recommendation to the error message....

@wiedld
Copy link
Contributor Author

wiedld commented Aug 26, 2024

I'm thinking should we add the coerced union recommendation to the error message....

@comphead -- I'm not sure which error message you are referring to? 😅 🙏🏼

@comphead
Copy link
Contributor

I'm thinking should we add the coerced union recommendation to the error message....

@comphead -- I'm not sure which error message you are referring to? 😅 🙏🏼

Ignore it, for downstream projects not using the optimizer it would be difficult to understand why UNION crashed, but I believe the documentation is enough

@comphead comphead merged commit 533ddcb into apache:main Aug 26, 2024
29 checks passed
@alamb alamb deleted the 12105/unions-coerce-at-construction branch August 26, 2024 19:59
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @wiedld and @comphead

@@ -56,6 +56,8 @@ use datafusion_expr::{
Projection, ScalarUDF, Union, WindowFrame, WindowFrameBound, WindowFrameUnits,
};

/// Performs type coercion by determining the schema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle downstream impacts to union's behavioral changes.
4 participants