-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unnest with single expression #9069
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is really nice initiative, thanks @jayzhan211, I'll review it this week if no one else beats me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @jayzhan211 -- this PR looks awesome -- thank you.
I think it needs a few more tests and avoiding a panic, but otherise 👨🍳 👌
Thanks again.
datafusion/sql/src/select.rs
Outdated
.build() | ||
} else { | ||
// Only support single unnest expression for now | ||
assert_eq!(array_exprs_to_unnest.len(), 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please make this return NotYetImplemented
rather than panic?
6 | ||
|
||
statement ok | ||
drop table unnest_table; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome -- can you please also add tests for:
- to unnest two columns (it is fine to error, but I think that behavior should be tested)
- trying to unnest a scalar value
1
rather than[1]
- to unnest zero columns (again should be an error)
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
Signed-off-by: jayzhan211 <[email protected]>
ee3dd80
to
eaf69fa
Compare
}) = exprs[0] | ||
{ | ||
// valid | ||
} else if let Expr::Column(_) = exprs[0] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arguments handling in Column seems not possible to handle anywhere before transforming to LogicalPlan. We should validate arguments inside UnnestExec or elsewhere downstream.
Signed-off-by: jayzhan211 <[email protected]>
@@ -124,6 +124,9 @@ fn physical_name(e: &Expr) -> Result<String> { | |||
|
|||
fn create_physical_name(e: &Expr, is_first_expr: bool) -> Result<String> { | |||
match e { | |||
Expr::Unnest(_) => { | |||
internal_err!("Expr::Unnest should have been converted to LogicalPlan::Unest") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal_err!("Expr::Unnest should have been converted to LogicalPlan::Unest") | |
internal_err!("Expr::Unnest should have been converted to LogicalPlan::Unnest") |
@@ -82,6 +82,13 @@ impl ExprSchemable for Expr { | |||
Expr::Case(case) => case.when_then_expr[0].1.get_type(schema), | |||
Expr::Cast(Cast { data_type, .. }) | |||
| Expr::TryCast(TryCast { data_type, .. }) => Ok(data_type.clone()), | |||
Expr::Unnest(Unnest { exprs }) => { | |||
let arg_data_types = exprs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need iter here as we refer for the first element only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can keep this, since there will be multiple exprs in the future
return not_impl_err!("unnest() does not support struct yet"); | ||
} else { | ||
return plan_err!( | ||
"unnest() can only be applied to array and structs and null" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line says unnest supports struct but line above says the opposite
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
first line is struct nyi error.
second line is to tell you that only those 3 are valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats the confusing part. not_impl_err!("unnest() does not support struct yet");
- here the user can get the struct is not supported by unnest.
plan_err!("unnest() can only be applied to array and structs and null")
here the message says that unnest support ONLY array, structs and nulls which confronts with first err message
Maybe I'm missing something?
([6], [11,12], 3) | ||
; | ||
|
||
query I |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add some test description
---- | ||
|
||
query ? | ||
select column1 from unnest_table; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this test?
statement ok | ||
CREATE TABLE unnest_table | ||
AS VALUES | ||
([1,2,3], [7], 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please include nulls, empty arrays into the table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added nulls in 79eb32d to keep this PR moving
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jayzhan211 I left some comments, nice job
Back in the day I tried to implement unnest but went to wrong direction through builtin function and eventually gave up. Your approach with expression is good
Signed-off-by: jayzhan211 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jayzhan211 -- I think this PR is now ready to go. It looks like @comphead has a few more test suggestions in https://github.com/apache/arrow-datafusion/pull/9069/files#r1474726470 but I think we could also potentially add those tests as a follow on PR given this PR does not seem have special handling for them
statement ok | ||
CREATE TABLE unnest_table | ||
AS VALUES | ||
([1,2,3], [7], 1), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added nulls in 79eb32d to keep this PR moving
Thanks again @jayzhan211 -- this is really nice 👌 |
Which issue does this PR close?
Revisit from #6796, only single expression included in this PR.
Closes #.
Rationale for this change
Ref #6555
What changes are included in this PR?
Two types of unnest expressions
select unnest([1,2,3])
select unnest(column)
Are these changes tested?
Are there any user-facing changes?