Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify the returned type in TypeSignature::get_possible_types #13371

Open
jayzhan211 opened this issue Nov 12, 2024 · 5 comments
Open

Simplify the returned type in TypeSignature::get_possible_types #13371

jayzhan211 opened this issue Nov 12, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@jayzhan211
Copy link
Contributor

jayzhan211 commented Nov 12, 2024

Is your feature request related to a problem or challenge?

get_possible_types returns Vec<Vec<DataType>> and convert to string for information schema.

Can we instead return the type define in each function as part of the documentation and return that instead?

We add return type and argument type in Documentation struct.

pub struct Documentation {
    // arg name, arg type, description
    pub arguments: Option<ArgumentDescription>,
    pub return_type: String,
}

pub struct ArgumentDescription {
   name: String,
   type: String,
   description: String,
}

For example, if one function signature and return type is func(string, int) -> int.

Instead of returning bunch of possible DataType combination like (Utf8, Int64), (Utf8View, Int64), (Utf8, Int32) ....

We can return (string, int) and int for simplicity

@goldmedal @Omega359 What do you think?

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

In postgres, they have data type for argument and return type too. And in more general term like numeric, text, int ...

The overall idea is I want the simple argument type and return type defined in documentation and take this information for information schema.

Screenshot 2024-11-12 at 12 00 03 PM
@jayzhan211 jayzhan211 added the enhancement New feature or request label Nov 12, 2024
@goldmedal
Copy link
Contributor

I prefer to provide a real type name here instead of a semantic name (e.g. string, int). The similar behavior in DuckDB like

D select function_name, parameter_types from duckdb_functions() where function_name = 'range';
┌───────────────┬────────────────────────────────────────────────────────────────┐
│ function_name │                        parameter_types                         │
│    varchar    │                           varchar[]                            │
├───────────────┼────────────────────────────────────────────────────────────────┤
│ range         │ [BIGINT]                                                       │
│ range         │ [BIGINT, BIGINT]                                               │
│ range         │ [BIGINT, BIGINT, BIGINT]                                       │
│ range         │ [TIMESTAMP, TIMESTAMP, INTERVAL]                               │
│ range         │ [TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH TIME ZONE, INTERVAL] │
│ range         │ [BIGINT]                                                       │
│ range         │ [BIGINT, BIGINT]                                               │
│ range         │ [BIGINT, BIGINT, BIGINT]                                       │
│ range         │ [TIMESTAMP, TIMESTAMP, INTERVAL]                               │
│ range         │ [TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH TIME ZONE, INTERVAL] │
├───────────────┴────────────────────────────────────────────────────────────────┤
│ 10 rows                                                              2 columns │
└────────────────────────────────────────────────────────────────────────────────┘

Because we have the logic type now, maybe showing the logic type name would be a better choice. WDYT?
https://github.com/apache/datafusion/blob/main/datafusion/common/src/types/native.rs

@Omega359
Copy link
Contributor

Odd that duckdb seems to be repeating the list there but I'd love to have that list for all functions in DF

@jayzhan211
Copy link
Contributor Author

Because we have the logic type now, maybe showing the logic type name would be a better choice. WDYT?

I prefer Logical Type

@Omega359
Copy link
Contributor

I love the idea of this, I'm not sure logical type will work for all functions, especially those that are user defined. It may be worth if to go through all the functions and build up a spreadsheet of what input/output types they have. I've been wanting to have a more detailed list of functions and their details anyways so if I get a chance I may do that

@jayzhan211
Copy link
Contributor Author

For user defined signature, they should provide their argument and return type, so we can just get the info from it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants