-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
panicked at 'index out of bounds: the len is 0 but the index is 0' in datafusion::physical_plan::projection::validate_output_ordering
#7482
Comments
I wonder if we could improve this situation by encapsulating the creation of OutputOrdering so that |
BTW I found another workaround in IOx. We had configured a let base_config = FileScanConfig {
object_store_url: self.object_store_url.clone(),
file_schema: schema,
file_groups: vec![vec![PartitionedFile {
object_meta: self.object_meta.clone(),
partition_values: vec![],
range: None,
extensions: None,
}]],
statistics: Statistics::default(),
projection: None,
limit: None,
table_partition_cols: vec![],
// Parquet files ARE actually sorted but we don't care here since we just construct a `collect` plan.
output_ordering: vec![],
infinite_source: false,
}; I could stop the crashes like this: diff --git a/parquet_file/src/storage.rs b/parquet_file/src/storage.rs
index 285e272f7..c520e3bd0 100644
--- a/parquet_file/src/storage.rs
+++ b/parquet_file/src/storage.rs
@@ -137,7 +137,7 @@ impl ParquetExecInput {
limit: None,
table_partition_cols: vec![],
// Parquet files ARE actually sorted but we don't care here since we just construct a `collect` plan.
- output_ordering: vec![vec![]],
+ output_ordering: vec![],
infinite_source: false,
};
let exec = ParquetExec::new(base_config, None, None); I do think this illustrates why having a dedicated structure to encapsulate the output orderings might be nice. But definitely not necessary |
Thanks -- I don't think the fix would be high priority (given I have fixed IOx tests not to create this type of ordering it is not blocking anything in my mind) In general, I think a better use of time would be to change the API so that it is not possible to hit this So instead of One way might be to do something like this: fn output_ordering(&self) -> &OutputOrdering { ..}
struct OutputOrdering {
exprs: Vec<PhysicalSortExpr>
}
impl OutputOrdering {
fn append(&mut self, order: PhysicalSortExpr) {
if !order.is_empty() {
self.exprs.push(order)
}
} |
Note: the issues is present in I could not figure out a workaround for my case ( It's unfortunate that for a 2-month-old crash there is no patch release, but I will test against Edit: |
Closing this issue as it was fixed in 33.0.0. Thanks @sergiimk |
Describe the bug
Similarly to #7418 when I am working on an upgrade to DataFusion in IOx (https://github.com/influxdata/influxdb_iox/pull/8577) we get a failure like this:
To Reproduce
No response
Expected behavior
No response
Additional context
I believe it is failing in
validate_output_ordering
(a different function added in #7364 by @berkaysynnada (looks like it needs a check for empty).Clearly our coverage is not good in this regard (aka I am not sure what it is about IOx that it is creating plans that report output ordering with empty ordering
The text was updated successfully, but these errors were encountered: