custom_datasource example panicked during RepartitionExec planning #15493
Comments
I saw similar error messages when running tpch sqllogictest in the latest main branch
I guess #15476 is the root cause. When I checked out 9071503, the tests passed.
❌ 14635da (HEAD -> main, origin/main, origin/HEAD, goldmedal/main) perf: Reuse row converter during sort (#15302)
By the way, the failing example is
Interesting, some benchmarks also fail with the same error at the same line (datafusion/physical-plan/src/repartition/mod.rs:618:22):
cargo bench -p datafusion --bench topk_aggregate --profile release-nonlto
Finished `release-nonlto` profile [optimized] target(s) in 0.34s
Running benches/topk_aggregate.rs (target/release-nonlto/deps/topk_aggregate-cbbaaf4e04209381)
Gnuplot not found, using plotters backend
Benchmarking aggregate 10000000 time-series rows: Warming up for 3.0000 s
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
Interesting, I just noticed that the CI also failed at this point.
The tpch benchmark also fails on the main branch:
RUST_BACKTRACE=1 ./bench.sh run tpch10
***************************
DataFusion Benchmark Script
COMMAND: run
BENCHMARK: tpch10
DATAFUSION_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/..
BRANCH_NAME: main
DATA_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/data
RESULTS_DIR: /Users/zhuqi/arrow-datafusion/benchmarks/results/main
CARGO_COMMAND: cargo run --release
PREFER_HASH_JOIN: true
***************************
RESULTS_FILE: /Users/zhuqi/arrow-datafusion/benchmarks/results/main/tpch_sf10.json
Running tpch benchmark...
Finished `release` profile [optimized] target(s) in 0.20s
Running `/Users/zhuqi/arrow-datafusion/target/release/tpch benchmark datafusion --iterations 5 --path /Users/zhuqi/arrow-datafusion/benchmarks/data/tpch_sf10 --prefer_hash_join true --format parquet -o /Users/zhuqi/arrow-datafusion/benchmarks/results/main/tpch_sf10.json`
Running benchmarks with the following options: RunOpt { query: None, common: CommonOpt { iterations: 5, partitions: None, batch_size: 8192, mem_pool_type: "fair", memory_limit: None, sort_spill_reservation_bytes: None, debug: false }, path: "/Users/zhuqi/arrow-datafusion/benchmarks/data/tpch_sf10", file_format: "parquet", mem_table: false, output_path: Some("/Users/zhuqi/arrow-datafusion/benchmarks/results/main/tpch_sf10.json"), disable_statistics: false, prefer_hash_join: true }
Query 1 iteration 0 took 690.9 ms and returned 4 rows
Query 1 iteration 1 took 598.7 ms and returned 4 rows
Query 1 iteration 2 took 601.4 ms and returned 4 rows
Query 1 iteration 3 took 581.8 ms and returned 4 rows
Query 1 iteration 4 took 637.5 ms and returned 4 rows
Query 1 avg time: 622.06 ms
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
stack backtrace:
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'tokio-runtime-worker' panicked at datafusion/physical-plan/src/repartition/mod.rs:618:22:
partition not used yet
stack backtrace:
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
I'm quite curious as to why the PR that caused the panic was able to pass CI.
I think its branch https://github.com/ctsk/datafusion/tree/remove-hj-coalesce is based on an old main branch (from about 2 weeks ago). #15476 was then created 2 days ago, and some PRs were merged after its CI passed. Not sure which one conflicts with it.
Reverting #15476 also fixes the tpch benchmark; I just tested it.
Describe the bug
I have a PR that didn't change the repartition code, but it caused an assertion failure inside RepartitionExec's execute() method during the execution of the custom_datasource.rs example.
The failed CI job run: https://github.com/apache/datafusion/actions/runs/14152369014/job/39647355093
After a re-run the CI passed, so this might be a heisenbug that occurs only rarely?
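For readers unfamiliar with this kind of failure, here is a minimal sketch of how a "partition not used yet" panic can arise. It is a toy model, not the DataFusion code: `ToyRepartition`, its `channels` map, and `try_execute` are hypothetical names, and it assumes the message comes from an `expect` on a per-partition slot that can only be taken once. Under that assumption, asking for the same output partition twice (or for one that was never set up) hits the `expect`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a repartition operator's shared state:
/// one consumable output slot per partition. Not the DataFusion type.
struct ToyRepartition {
    channels: HashMap<usize, Vec<i32>>,
}

impl ToyRepartition {
    fn new(partitions: usize) -> Self {
        // Give every partition a slot holding some placeholder data.
        let channels = (0..partitions).map(|p| (p, vec![p as i32])).collect();
        Self { channels }
    }

    /// Takes the slot for `partition`, consuming it.
    /// Panics with "partition not used yet" if the slot is already gone.
    fn execute(&mut self, partition: usize) -> Vec<i32> {
        self.channels
            .remove(&partition)
            .expect("partition not used yet")
    }

    /// Non-panicking variant that reports the problem as an error instead.
    fn try_execute(&mut self, partition: usize) -> Result<Vec<i32>, String> {
        self.channels
            .remove(&partition)
            .ok_or_else(|| format!("partition {partition} already taken or never created"))
    }
}

fn main() {
    let mut plan = ToyRepartition::new(2);

    // First request for partition 0 succeeds and consumes the slot.
    assert_eq!(plan.execute(0), vec![0]);

    // A second request for the same partition finds nothing left.
    assert!(plan.try_execute(0).is_err());

    // Calling `plan.execute(0)` here instead would panic with
    // "partition not used yet", like the logs in this thread.
}
```

Whether the real panic is triggered by a duplicate `execute()` call or by something else is not established in this thread; the sketch only shows the general pattern.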
To Reproduce
No response
Expected behavior
No response
Additional context
No response