Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement flatten for MakeArray #7461

Closed
wants to merge 2 commits into from

Conversation

jayzhan211
Copy link
Contributor

Which issue does this PR close?

Split from #6796
The previous implementation of flatten #6995 is for Array only, we need one for Expr.

Closes #.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the logical-expr Logical plan and expressions label Sep 1, 2023
@jayzhan211 jayzhan211 marked this pull request as ready for review September 9, 2023 08:25
/// For example:
/// [[1, 2, 3], [4, 5, 6]] => [1, 2, 3, 4, 5, 6]
/// [[[1, 2], [3, 4]], [[5, 6], [7, 8]]] => [1, 2, 3, 4, 5, 6, 7, 8]
/// Panics if the expression cannot be flattened.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code doesn't seem to panic -- instead it ignores errors

/// [[1, 2, 3], [4, 5, 6]] => [1, 2, 3, 4, 5, 6]
/// [[[1, 2], [3, 4]], [[5, 6], [7, 8]]] => [1, 2, 3, 4, 5, 6, 7, 8]
/// Returns an error if the expression cannot be flattened.
pub fn try_flatten(&self) -> Result<Self> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function appears to be trying to simplify the flatten function by partially evaluating a make_array. Given that I think this kind of logic is likely better in https://github.com/apache/arrow-datafusion/blob/main/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs or something similar.

In fact, I think since we have a flatten function already: https://github.com/apache/arrow-datafusion/blob/87527c430852af82ef182e1607804f6ddd817bbb/datafusion/physical-expr/src/array_expressions.rs#L1662-L1665

Such an expression will already be automatically handled by expression simplification

Copy link
Contributor Author

@jayzhan211 jayzhan211 Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is mainly used in the unnest function currently, and I don't think it is a helpful expression optimization, it is too specialized.

Not only that, in my current design for unnest, I will process them before the optimization phase, specifically in the planning phase. If I move this function into the optimization phase, I still need to process this function prior, and since only the unnest function needs this now, we do nothing in the optimization phase.

Only If this optimization is generally used, I think moving it to the optimization phase makes sense.

You can look to the PR for unnest #6796 too, it is ready for view actually. I planned to reopen it after the flatten function merged, but since this function is tightly bind to unnest function, maybe review them together is better.

@alamb
Copy link
Contributor

alamb commented Oct 4, 2023

This PR appears to be stalled -- what do you think we should do with it @jayzhan211 ?

@jayzhan211
Copy link
Contributor Author

This PR appears to be stalled -- what do you think we should do with it @jayzhan211 ?

Well, it depends on the review of #6796, since this PR is part of it.

@alamb
Copy link
Contributor

alamb commented Oct 30, 2023

Marking this as draft to make it clear it is not waiting on review. Sorry that this effort has stalled @jayzhan211 -- I think we need to work in simplifying the code / fixing what we have before adding more features

@alamb alamb marked this pull request as draft October 30, 2023 20:39
@jayzhan211 jayzhan211 closed this Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants