Skip to content

Attach Diagnostic to "wrong number of arguments" error #14432

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Tracked by #14429
eliaperantoni opened this issue Feb 3, 2025 · 15 comments · May be fixed by #15451 or #15490
Open
Tracked by #14429

Attach Diagnostic to "wrong number of arguments" error #14432

eliaperantoni opened this issue Feb 3, 2025 · 15 comments · May be fixed by #15451 or #15490
Assignees
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@eliaperantoni
Copy link
Contributor

Is your feature request related to a problem or challenge?

For a query like:

SELECT sum(1, 2)

The only message that the end user of an application built atop of DataFusion sees is:

Error during planning: Execution error: User-defined coercion failed with Execution("SUM expects exactly one argument") No function matches the given name and argument types 'sum(Int64, Int64)'. You might need to add explicit type casts.
        Candidate functions:
        sum(UserDefined)

We want to provide a richer message that references and highlights locations in the original SQL query, and contextualises and helps the user understand the error. In the end, it would be possible to display errors in a fashion akin to what was enabled by #13664 for some errors:

See #14429 for more information.

Describe the solution you'd like

Attach a well crafted Diagnostic to the DataFusionError, building on top of the foundations laid in #13664. See #14429 for more information.

Describe alternatives you've considered

No response

Additional context

No response

@eliaperantoni eliaperantoni added the enhancement New feature or request label Feb 3, 2025
@eliaperantoni eliaperantoni changed the title Attach Diagnostic to "wrong number of arguments" error Attach Diagnostic to "wrong number of arguments" error Feb 3, 2025
@Chen-Yuan-Lai
Copy link
Contributor

take

@alamb
Copy link
Contributor

alamb commented Feb 4, 2025

I think this is a good first issue as the need is clear and the tests in https://github.com/apache/datafusion/blob/85fbde2661bdb462fc498dc18f055c44f229604c/datafusion/sql/tests/cases/diagnostic.rs are well structured for extension.

@alamb alamb added the good first issue Good for newcomers label Feb 4, 2025
@eliaperantoni
Copy link
Contributor Author

Hey @Chen-Yuan-Lai how is it going with this ticket :) Can I help with anything?

@Chen-Yuan-Lai
Copy link
Contributor

Hi @eliaperantoni sorry for the long delay. I’ve been taking some time to familiarize myself with this issue. I found that the "wrong number of argument" error may occur in different places depending on what function is called. For example,

  1. select sum(1, 2); (TypeSignature::UserDefined) : error message is "Execution error: sum function requires 1 argument, got 2"
  2. select ascii('abc', 'def'); (TypeSignature::String or other) : error message is "Function 'ascii' expects 1 arguments but received 2"

Should I:

  1. Implement a function to check the number of arguments for all the TypeSignature variants, and attach corresponding Diagnostic to DataFusionError?
  2. Attach Diagnostic to DataFusionError in different places?

Another stupid question is how to test the queries above correctly because I always get "expected diagnostic" error even though I have attached Diagnostic

return plan_err!(
                "Function '{function_name}' expects {expected_length} arguments but received {length}"
            ).map_err(|err| {
                err.with_diagnostic(
                    Diagnostic::new_error(
                        format!("Function '{function_name}' expects {expected_length} arguments but received {length}"),
                        None,

                    )
                )
            });

Is something wrong with this or do I need to do something else?

These are problems I met in this issue, I am looking forward to some hints or ideas, thanks so mush!

@eliaperantoni
Copy link
Contributor Author

eliaperantoni commented Feb 14, 2025

Hey @Chen-Yuan-Lai, absolutely no problem! Was checking in to see if you needed any help 😊. I see, that's a bit unfortunate. It seems like the first message is produced by a function called take_function_args and the second by one called function_length_check.

I see that take_function_args is receiving a lot of attention lately #14525. Perhaps you should attach a Diagnostic in the body of that function, and maybe you could even replace that where function_length_check is used, so that you can remove it?

Then the problem would be to get the Span to take_function_args, but maybe you could make it take an optional function_call_site: Option<Span> argument and then pass it in wherever possible, but without requiring all callers of take_function_args to pass it, because maybe sometimes it's just being used for a quick assertion.

I always get "expected diagnostic" error even though I have attached Diagnostic

That probably has something to do with the implementation of DataFusionError::iter and DataFusionError::diagnostic: they might not be able to unwrap the Diagnostic from the layers that compose the error. Have you tried dbg!-ing the error that you get in the test, to see if it contains a DataFusionError::Diagnostic somewhere? Perhaps the place where an error is created, that you attach a Diagnostic to, is not the first one that's triggered for a bad query let's say.

@Chen-Yuan-Lai
Copy link
Contributor

@eliaperantoni thanks for the advice! I will try that as soon as possible

@Chen-Yuan-Lai
Copy link
Contributor

Chen-Yuan-Lai commented Feb 20, 2025

Hi @eliaperantoni I found that Diagnostic information may lost when the error is wrapped in multiple layers. For example, the "wrong number of argument" error of sum function is wrapped in three locations:

  1. take_function_args

    pub fn take_function_args<const N: usize, T>(
    function_name: &str,
    args: impl IntoIterator<Item = T>,
    ) -> Result<[T; N]> {
    let args = args.into_iter().collect::<Vec<_>>();
    args.try_into().map_err(|v: Vec<T>| {
    _exec_datafusion_err!(
    "{} function requires {} {}, got {}",
    function_name,
    N,
    if N == 1 { "argument" } else { "arguments" },
    v.len()
    )

  2. get_valid_types_with_scalar_udf

    fn get_valid_types_with_aggregate_udf(
    signature: &TypeSignature,
    current_types: &[DataType],
    func: &AggregateUDF,
    ) -> Result<Vec<Vec<DataType>>> {
    let valid_types = match signature {
    TypeSignature::UserDefined => match func.coerce_types(current_types) {
    Ok(coerced_types) => vec![coerced_types],
    Err(e) => {
    return exec_err!(
    "Function '{}' user-defined coercion failed with {:?}",

  3. get_type

    Expr::AggregateFunction(AggregateFunction {
    func,
    params: AggregateFunctionParams { args, .. },
    }) => {
    let data_types = args
    .iter()
    .map(|e| e.get_type(schema))
    .collect::<Result<Vec<_>>>()?;
    let new_types = data_types_with_aggregate_udf(&data_types, func)
    .map_err(|err| {
    plan_datafusion_err!(
    "{} {}",
    match err {
    DataFusionError::Plan(msg) => msg,
    err => err.to_string(),
    },
    utils::generate_signature_error_msg(
    func.name(),
    func.signature().clone(),
    &data_types
    )
    )

Once the error macro (ex. plan_datafusion_err!, exec_err!) is called to wrap the error from the inner layer, a new DatafusionError will be reproduced, that is why I can't capture the Diagnostic in the unit test.

This is the DatafusionError in these three locations (use dbg!), we can see Diagnostic lost in the first layer.

[datafusion/expr/src/type_coercion/functions.rs:304:17] &e = Diagnostic(
    Diagnostic {
        kind: Error,
        message: "Wrong number of arguments for sum function call",
        span: None,
        notes: [],
        helps: [],
    },
    Execution(
        "sum function requires 1 argument, got 2",
    ),
)
[datafusion/expr/src/expr_schema.rs:166:25] &err = Execution(
    "Function 'sum' user-defined coercion failed with \"Execution error: sum function requires 1 argument, got 2\"",
)
[datafusion/sql/tests/cases/diagnostic.rs:52:17] &err = Plan(
    "Execution error: Function 'sum' user-defined coercion failed with \"Execution error: sum function requires 1 argument, got 2\" No function matches the given name and argument types 'sum(Int64, Int64)'. You might need to add explicit type casts.\n\tCandidate functions:\n\tsum(UserDefined)",
)

I think there are two solutions:

  1. modify error macro to attach Diagnostic .
  2. reattach Diagnostic again and again in the error chain.

But 1. seems to have a large effect on the codebase, I'm not sure which one is better. I hope to get some suggestions or other best practices, thx so much!!

@eliaperantoni
Copy link
Contributor Author

Hey @Chen-Yuan-Lai, thank you so much for your contributions 🙏 That is indeed an annoying issue. It seems like some of the error macros "flatten" the inner error to a string.

I think I'm in favour of option 2, in hope that there are not too many places where this flattening happens. I wouldn't go for option 1 because, as you say, it would change all existing macro invocations.

Perhaps we want is a DataFusionError::map_diagnosticed_err method? It would:

  1. Check if &self is a DataFusionError::Diagnostic
  2. If so, apply a closure to the wrapper error.

i.e. then you would do:

let err1 = exec_err!("need to download more RAM");
let err2 = err1.with_diagnostic(...);
let err3 = err2.map_diagnosticed_err(|err| 
    plan_err!("that went wrong {}", err);
);

And now err2 would be a Diagnostic(ExecError) and err3 would be a Diagnostic(PlanError).

I'm not sure if this would work or make for a good implementation, but I hope it can help in any way :)

@prowang01
Copy link

Hi! I'm currently preparing my GSoC 2025 application and would love to contribute to this issue as a warm-up task. I understand this one involves attaching a Diagnostic to the "wrong number of arguments" error, and I'm excited to explore how diagnostics are propagated in the codebase.

I saw the previous discussion and understand there are some challenges with layered errors, but I'm happy to give it a try and ask questions as I go. Would it be okay if I start working on this?

Thanks in advance!

@Chen-Yuan-Lai
Copy link
Contributor

Chen-Yuan-Lai commented Mar 27, 2025

Sorry, @prowang01. I have already been working on it, but if you have a more comprehensive implementation, feel free to work on it :)

@Chen-Yuan-Lai Chen-Yuan-Lai linked a pull request Mar 27, 2025 that will close this issue
@Chen-Yuan-Lai
Copy link
Contributor

@eliaperantoni, sorry for the long delay. I pushed a rough implementation for the issue, I noticed that you are working on the FnCallSpans feature (#15276), which would provide a more comprehensive solution for capturing span information in function calls.

Would it make sense to wait for that implementation to be completed? :)

@Chen-Yuan-Lai
Copy link
Contributor

And now err2 would be a Diagnostic(ExecError) and err3 would be a Diagnostic(PlanError).

I'm not sure if this would work or make for a good implementation, but I hope it can help in any way :)

This approach sounds great! but in this case, I still need to get argument numbers in the take_function_args (err1), so I finally just cloned the diagnostic and reattached it to the outer error. 😆

@eliaperantoni
Copy link
Contributor Author

@Chen-Yuan-Lai I only filed the ticket, but it's @jsai28 working on that feature :) You're definitely right though: these two tickets do have some overlap and we should make sure the resulting API makes sense overall.

In my personal view, #15276 is more about giving custom functions the ability to optionally implement a diagnose function and optionally use Diagnostics in invoke_with_args to make pretty errors.

But I think Datafusion should be able to return Diagnostic-ed errors for "wrong number of arguments" even for all existing functions whose Signature already brings the necessary information; without requiring them to implement diagnose.

For example, the log function already today returns an error if you call it with three arguments:

Image

The goal of this ticket is to make similar errors Diagnostic-ed.


I think this ticket actually overlaps more with #14431 than with #15276 so perhaps you and @dentiny could align on it? 😊

The difficulty lies in Signature::OneOf I think. For example log accepts [Float] and [Float, Float]. So the error Datafusion returns today for [Int] is simply that it didn't match any of the possibilities. While that's correct, perhaps it's not the best UX. Maybe that error message should be a fallback, more than anything.

Proposal:

  • If the actual arguments don't match the length of any possibility (e.g. log(1,2,3), then we error out with "log cannot take 3 arguments".
  • If the actual arguments do match in length one of the possibility, but the types are misaligned (e.g. log('foo')) then we error out with "log cannot take argument types [utf8]"

Does that make sense? Sorry I realize that this must be very confusing. In general it's going to be a bit difficult to map the concept of "arguments don't match any of the available signatures" to a nicer user message.

My proposal is not really a complete plan but more of a spark to ignite a discussion together with @Chen-Yuan-Lai and @dentiny so we can find a good solution ❤️

@dentiny
Copy link
Contributor

dentiny commented Mar 27, 2025

Thank you for the care ❤️
Let me sync with @Chen-Yuan-Lai offline

@Chen-Yuan-Lai
Copy link
Contributor

Chen-Yuan-Lai commented Apr 21, 2025

Hi @eliaperantoni, I spent some time investigating additional locations in the codebase where argument numbers for most SQL functions are checked. I've roughly classified them into four categories:

  1. data_types_with_scalar_udf, data_types_with_aggregate_udf, data_types_with_window_udf, data_types :

    • These provide the outer layer for checking zero arguments
      if current_types.is_empty() && type_signature != &TypeSignature::UserDefined {
      if type_signature.supports_zero_argument() {
      return Ok(vec![]);
      } else if type_signature.used_to_support_zero_arguments() {
      // Special error to help during upgrade: https://github.com/apache/datafusion/issues/13763
      return plan_err!("'{}' does not support zero arguments. Use TypeSignature::Nullary for zero arguments", func.name());
      } else {
      return plan_err!("'{}' does not support zero arguments", func.name());
      }
      }
  2. get_valid_types

    • Called by the functions mentioned above, with two approaches to check argument numbers in match branches:
      1. Checking separately (TypeSignature::Uniform, TypeSignature::VariadicAny, TypeSignature::Nullary, TypeSignature::Any)
        TypeSignature::Uniform(number, valid_types) => {
        if *number == 0 {

        TypeSignature::VariadicAny => {
        if current_types.is_empty() {
        return plan_err!(
        "Function '{function_name}' expected at least one argument but received 0"
        );

        TypeSignature::Nullary => {
        if !current_types.is_empty() {
        return plan_err!(
        "The function '{function_name}' expected zero argument but received {}",
        current_types.len()
        );

        TypeSignature::Any(number) => {
        if current_types.is_empty() {
        return plan_err!(
        "The function '{function_name}' expected at least one argument but received 0"
        );
        }
        if current_types.len() != *number {
        return plan_err!(
        "The function '{function_name}' expected {number} arguments but received {}",
        current_types.len()
        );
      2. Using the function_length_check() inner function (TypeSignature::String, TypeSignature::Numeric, TypeSignature::Comparable, TypeSignature::Coercible)
  3. try_coerce_types

    • Also called by the above functions, but after valid_types(). Signature variants that don't check argument numbers in the match branch in valid_types() will return errors here (e.g., TypeSignature::Exact, TypeSignature::OneOf).
  4. Implemented methods of ScalarUDFImpl and AggregateUDFImpl trait:

    • In these methods, when take_function_args is called, it also checks the argument count automatically. However, if take_function_args is not called, they may implement the argument count checking themself
      • return_type
      • return_type_from_args
      • invoke_with_args
      • simplify
      • coerce_types
      • ...

Based on these findings, I've discovered that the same argument checking logic is redundantly implemented in many places. Furthermore, some of these checks might never be triggered. For example, pi() has TypeSignature::Nullary and checks the argument number in both categories 1 and 2, but category 1 would return a zero argument error before category 2 is ever reached.

Therefore, I propose creating a new function to integrate checking logic for all signature variants in a single place. This approach offers several benefits:

  1. This function would be used in data_types_with_scalar_udf, data_types_with_aggregate_udf, data_types_with_window_udf, and data_types, enabling early argument count checking.
  2. We can attach diagnostic information in a single location.
  3. For signature variants that don't check argument counts and only rely on try_coerce_types for validation, this function can add checking logic to return appropriate errors.

Regarding your proposal:

The difficulty lies in Signature::OneOf I think. For example log accepts [Float] and [Float, Float]. So the error Datafusion returns today for [Int] is simply that it didn't match any of the possibilities. While that's correct, perhaps it's not the best UX. Maybe that error message should be a fallback, more than anything.

Proposal:

  • If the actual arguments don't match the length of any possibility (e.g. log(1,2,3), then we error out with "log cannot take 3 arguments".
  • If the actual arguments do match in length one of the possibility, but the types are misaligned (e.g. log('foo')) then we error out with "log cannot take argument types [utf8]"

I believe this new function can also handle the case of Signature::OneOf to check argument count without knowing all the possibilities. For type mismatch errors, these might be better implemented in try_coerce_types? maybe @dentiny can do this 😆

I've nearly completed this function in my draft PR, but I'd like your feedback on whether this approach makes sense. I'm happy to discuss more details or explore alternatives. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
5 participants