Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING) #2530
@@ -38,3 +38,8 @@ exclude = ["ballista-cli", "datafusion-cli"]
[profile.release]
codegen-units = 1
lto = true

[patch.crates-io]
arrow = { git = "https://github.com/apache/arrow-rs.git", rev="5b154ea40314dc2f09babbb363bf7f1fe439d4eb" }
This rev is right after apache/arrow-rs#1682.
Likely related to #2453 - the DataFusion logic for handling column projection to parquet is currently silently broken, and is likely only working because of the schema adapter logic.
🤔 I suppose we'll have to fix datafusion then...
I don't understand how things can be broken but also be working...
I've not had time to look properly yet, but my suspicion is that the schema adapter logic knows what the expected output schema is and rearranges the columns - masking the fact that what was returned by the parquet reader did not respect the projection order.
Filed #2543 to track.
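To make the suspicion above concrete, here is a minimal sketch in plain Rust. It is not the DataFusion or arrow-rs code: `Batch`, `read_in_file_order`, `adapt_to_schema`, `column_names`, and `sample_file` are all hypothetical names invented for illustration. It assumes a reader that returns the projected columns in file order regardless of the order requested, followed by an adapter step that rearranges columns by name to match the expected output schema.

```rust
// Hypothetical model: a "batch" is an ordered list of (column name, values).
type Batch = Vec<(String, Vec<i32>)>;

/// Stand-in for a reader that returns the requested columns, but always in
/// file order, ignoring the order given in `projection`.
fn read_in_file_order(file: &Batch, projection: &[usize]) -> Batch {
    let mut sorted = projection.to_vec();
    sorted.sort_unstable();
    sorted.iter().map(|&i| file[i].clone()).collect()
}

/// Stand-in for the schema adapter step: rearranges columns by name so they
/// match the expected output schema.
fn adapt_to_schema(batch: &Batch, expected: &[&str]) -> Batch {
    expected
        .iter()
        .map(|name| {
            batch
                .iter()
                .find(|(n, _)| n.as_str() == *name)
                .cloned()
                .expect("column missing from batch")
        })
        .collect()
}

/// Column names of a batch, in order.
fn column_names(batch: &Batch) -> Vec<&str> {
    batch.iter().map(|(n, _)| n.as_str()).collect()
}

/// A three-column "file" with columns a, b, c (illustrative data only).
fn sample_file() -> Batch {
    vec![
        ("a".to_string(), vec![1, 2, 3]),
        ("b".to_string(), vec![4, 5, 6]),
        ("c".to_string(), vec![7, 8, 9]),
    ]
}

fn main() {
    let file = sample_file();

    // Ask for column 2 ("c") first, then column 0 ("a").
    let raw = read_in_file_order(&file, &[2, 0]);
    println!("raw:     {:?}", column_names(&raw)); // ["a", "c"]: order not respected

    // The adapter knows the expected schema is (c, a) and fixes the order.
    let adapted = adapt_to_schema(&raw, &["c", "a"]);
    println!("adapted: {:?}", column_names(&adapted)); // ["c", "a"]
}
```

In this model the final output looks correct even though the reader ignored the requested order, which is how something could be "broken but also working". If the reader instead rejected an out-of-order projection with an error, the adapter step would never get the chance to hide the problem.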
This PR demonstrates a test that fails with the code from apache/arrow-rs#1682 in arrow (not included in arrow 14.0.0).
This PR pins datafusion to arrow at commit apache/arrow-rs@5b154ea, right after apache/arrow-rs#1682 was merged.
To reproduce:
cargo test -p datafusion --lib
Results in: