Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds WindowUDFImpl::reverse_exprtrait method + Support for IGNORE NULLS #12662

Merged
merged 5 commits into from
Sep 29, 2024

Conversation

jcsherin
Copy link
Contributor

@jcsherin jcsherin commented Sep 28, 2024

Which issue does this PR close?

Part of #8709.
Closes #12661.

Rationale for this change

  1. So that a user-defined window function can customize the implementation for evaluating results in reverse order, allowing the optimizer to avoid sorting the data.
  2. Check if IGNORE NULLS is set.
  3. Easier to review a smaller diff compared to a PR converting an existing built-in window function.

What changes are included in this PR?

  1. Extends WindowUDFExpr with two new state fields for tracking if a window expression is reversed, and if IGNORE NULLS is used in the query.
/// Implements [`BuiltInWindowFunctionExpr`] for [`WindowUDF`]
#[derive(Clone, Debug)]
struct WindowUDFExpr {
    ...,
    /// This is set to `true` only if the user-defined window function
    /// expression supports evaluation in reverse order, and the
    /// evaluation order is reversed.
    is_reversed: bool,
    /// Set to `true` if `IGNORE NULLS` is defined, `false` otherwise.
    ignore_nulls: bool,
}
  1. Adds a new trait method WindowUDFImpl::reverse_expr with a default implementation.
pub trait WindowUDFImpl: Debug + Send + Sync {

    /// Allows customizing the behavior of the user-defined window
    /// function when it is evaluated in reverse order.
    fn reverse_expr(&self) -> ReversedUDWF {
        ReversedUDWF::NotSupported
    }
}

pub enum ReversedUDWF {
    /// The result of evaluating the user-defined window function
    /// remains identical when reversed.
    Identical,
    /// A window function which does not support evaluating the result
    /// in reverse order.
    NotSupported,
    /// Customize the user-defined window function for evaluating the
    /// result in reverse order.
    Reversed(Arc<WindowUDF>),
}
  1. Thread ignore_nulls when creating user-defined window expressions.
fn create_udwf_window_expr(
    fun: &Arc<WindowUDF>,
    args: &[Arc<dyn PhysicalExpr>],
    input_schema: &Schema,
    name: String,
    ignore_nulls: bool,  // <-- here
) -> Result<Arc<dyn BuiltInWindowFunctionExpr>> {
   ...
}

Are these changes tested?

Yes, against existing tests in CI.
No.

When built-in functions like lead, lag, nth_value etc. are converted, this API will get tested by existing sqllogictests.

Are there any user-facing changes?

Yes, this is a breaking API change: WindowUDFImpl::reverse_expr
No (default implementation provided for new trait method WindowUDFImpl::reverse_expr).

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Physical Expressions labels Sep 28, 2024
@jcsherin jcsherin changed the title Implements WindowUDFImpl::reverse_expr + Support for IGNORE NULLS Adds WindowUDFImpl::reverse_expr + Support for IGNORE NULLS Sep 28, 2024
@jcsherin jcsherin changed the title Adds WindowUDFImpl::reverse_expr + Support for IGNORE NULLS Adds WindowUDFImpl::reverse_exprtrait method + Support for IGNORE NULLS Sep 28, 2024
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love it -- thank you @jcsherin

One thing I noticed is that there are no tests for this functionality (as in if we removed / broke the code no tests would fail). Is the idea that it will be test coverage added when we port some of the built in window functions?

@@ -351,6 +359,24 @@ pub trait WindowUDFImpl: Debug + Send + Sync {
fn coerce_types(&self, _arg_types: &[DataType]) -> Result<Vec<DataType>> {
not_impl_err!("Function {} does not implement coerce_types", self.name())
}

/// Allows customizing the behavior of the user-defined window
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 -- this is a very nice API

@jcsherin
Copy link
Contributor Author

One thing I noticed is that there are no tests for this functionality (as in if we removed / broke the code no tests would fail). Is the idea that it will be test coverage added when we port some of the built in window functions?

Yes, this situation is temporary. When one of the built-in window functions like lead, lag, nth_value etc. are converted this API path will get tested by existing sqllogictests.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jcsherin -- the temporary nature of this code makes sense to me

/// Allows customizing the behavior of the user-defined window
/// function when it is evaluated in reverse order.
fn reverse_expr(&self) -> ReversedUDWF {
ReversedUDWF::NotSupported
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the newly added trait method returns a default value, I don't think this is a breaking API change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it 👍

@alamb alamb merged commit a0a635a into apache:main Sep 29, 2024
36 checks passed
@alamb
Copy link
Contributor

alamb commented Sep 29, 2024

Thank you @jcsherin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add WindowUDFImpl::reverse_expr + support for IGNORE NULLS
2 participants