Add Optimizer Sanity Checker, improve sortedness equivalence properties #11196

Merged Jul 3, 2024 (34 commits)
Commits
8f0c30f
Initial optimizer sanity checker.
yfy- Jun 9, 2024
2538702
Add distro and pipeline friendly checks
yfy- Jun 11, 2024
aea95c6
Also check the plans we create are correct.
yfy- Jun 12, 2024
9aef599
Add distribution test cases using global limit exec.
yfy- Jun 13, 2024
49aba4b
Add test for multiple children using SortMergeJoinExec.
yfy- Jun 15, 2024
d89b1c9
Move PipelineChecker to SanityCheckPlan
yfy- Jun 16, 2024
defebb0
Fix some tests and add docs
yfy- Jun 16, 2024
b392cf6
Add some test docs and fix clippy diagnostics.
yfy- Jun 17, 2024
5bdec62
Fix some failing tests
mustafasrepo Jun 20, 2024
46bf900
Merge branch 'apache_main' into optimizer-sanity-checker
yfy- Jun 20, 2024
7148c07
Replace PipelineChecker with SanityChecker in .slt files.
yfy- Jun 21, 2024
1e888c4
Initial commit
mustafasrepo Jun 25, 2024
c08a92b
Slt tests pass
mustafasrepo Jun 25, 2024
6d79881
Resolve linter errors
mustafasrepo Jun 25, 2024
cd49e2e
Minor changes
mustafasrepo Jun 25, 2024
338f451
Minor changes
mustafasrepo Jun 25, 2024
1959488
Minor changes
mustafasrepo Jun 25, 2024
1c5c0c0
Minor changes
mustafasrepo Jun 25, 2024
24e32b6
Sort PreservingMerge clear per partition
mustafasrepo Jun 25, 2024
9b77608
Minor changes
mustafasrepo Jun 26, 2024
655a010
Merge branch 'optimizer-sanity-checker' into sanity_checker_subs
mustafasrepo Jun 26, 2024
e93cf6e
Update output_requirements.rs
berkaysynnada Jun 26, 2024
58cbf9a
Address reviews
mustafasrepo Jun 27, 2024
6f8db29
Merge branch 'apache_main' into optimizer-sanity-checker
mustafasrepo Jun 27, 2024
0cb86b9
Merge branch 'optimizer-sanity-checker' into sanity_checker_subs
mustafasrepo Jun 27, 2024
5a34ae0
Update datafusion/core/src/physical_optimizer/optimizer.rs
mustafasrepo Jul 1, 2024
f3ac546
Update datafusion/core/src/physical_optimizer/sanity_checker.rs
mustafasrepo Jul 1, 2024
0491c13
Address reviews
mustafasrepo Jul 1, 2024
aa075e2
Merge branch 'apache_main' into optimizer-sanity-checker
mustafasrepo Jul 1, 2024
dfd219c
Minor changes
mustafasrepo Jul 1, 2024
0c2eb45
Merge pull request #24 from yfy-/optimizer-sanity-checker
mustafasrepo Jul 1, 2024
d8ebc14
Apply suggestions from code review
mustafasrepo Jul 3, 2024
747b69b
Update comment
mustafasrepo Jul 3, 2024
aa1382b
Add map implementation
mustafasrepo Jul 3, 2024
2 changes: 1 addition & 1 deletion datafusion/core/src/physical_optimizer/mod.rs
@@ -30,10 +30,10 @@ pub mod join_selection;
 pub mod limited_distinct_aggregation;
 pub mod optimizer;
 pub mod output_requirements;
-pub mod pipeline_checker;
 pub mod projection_pushdown;
 pub mod pruning;
 pub mod replace_with_order_preserving_variants;
+pub mod sanity_checker;
 mod sort_pushdown;
 pub mod topk_aggregation;
 pub mod update_aggr_exprs;
16 changes: 10 additions & 6 deletions datafusion/core/src/physical_optimizer/optimizer.rs
@@ -30,7 +30,7 @@ use crate::physical_optimizer::enforce_sorting::EnforceSorting;
 use crate::physical_optimizer::join_selection::JoinSelection;
 use crate::physical_optimizer::limited_distinct_aggregation::LimitedDistinctAggregation;
 use crate::physical_optimizer::output_requirements::OutputRequirements;
-use crate::physical_optimizer::pipeline_checker::PipelineChecker;
+use crate::physical_optimizer::sanity_checker::SanityCheckPlan;
 use crate::physical_optimizer::topk_aggregation::TopKAggregation;
 use crate::{error::Result, physical_plan::ExecutionPlan};
 
@@ -124,11 +124,15 @@ impl PhysicalOptimizer {
             // are not present, the load of executors such as join or union will be
             // reduced by narrowing their input tables.
             Arc::new(ProjectionPushdown::new()),
-            // The PipelineChecker rule will reject non-runnable query plans that use
-            // pipeline-breaking operators on infinite input(s). The rule generates a
-            // diagnostic error message when this happens. It makes no changes to the
-            // given query plan; i.e. it only acts as a final gatekeeping rule.
-            Arc::new(PipelineChecker::new()),
+            // The SanityCheckPlan rule checks whether the order and
+            // distribution requirements of each node in the plan
+            // is satisfied. It will also reject non-runnable query
+            // plans that use pipeline-breaking operators on infinite
+            // input(s). The rule generates a diagnostic error
+            // message for invalid plans. It makes no changes to the
+            // given query plan; i.e. it only acts as a final
+            // gatekeeping rule.
+            Arc::new(SanityCheckPlan::new()),
         ];
 
         Self::with_rules(rules)
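The comment on the SanityCheckPlan rule describes a read-only walk over the plan that fails on the first unmet requirement. A minimal sketch of that idea follows. It is a toy model, not DataFusion's implementation: `Node`, `leaf`, and `sanity_check` are hypothetical names invented for illustration, orderings are reduced to lists of column names, and distribution requirements are left out (they could be checked the same way, child by child).

```rust
// Toy model only: these types are hypothetical and are not DataFusion's API.
struct Node {
    name: &'static str,
    // Columns this node's output is sorted by.
    output_ordering: Vec<&'static str>,
    // Ordering required from each child, one entry per child
    // (an empty list means "no requirement").
    required_input_ordering: Vec<Vec<&'static str>>,
    // True if the node produces an unbounded (infinite) stream.
    unbounded: bool,
    // True if the node must consume all of its input before emitting output.
    pipeline_breaking: bool,
    children: Vec<Node>,
}

// Convenience constructor for a node without children.
fn leaf(name: &'static str, ordering: Vec<&'static str>, unbounded: bool) -> Node {
    Node {
        name,
        output_ordering: ordering,
        required_input_ordering: vec![],
        unbounded,
        pipeline_breaking: false,
        children: vec![],
    }
}

// Walks the plan top-down and returns a diagnostic for the first violation.
// The plan is only borrowed, so it cannot be modified by the check.
fn sanity_check(node: &Node) -> Result<(), String> {
    for (child, required) in node.children.iter().zip(&node.required_input_ordering) {
        // Simplification: a requirement is met when it is a prefix of the
        // ordering the child provides.
        if !child.output_ordering.starts_with(required) {
            return Err(format!(
                "{} requires input ordering {:?}, but {} provides {:?}",
                node.name, required, child.name, child.output_ordering
            ));
        }
        // A pipeline-breaking node can never finish on infinite input.
        if node.pipeline_breaking && child.unbounded {
            return Err(format!(
                "{} is pipeline-breaking but its input {} is unbounded",
                node.name, child.name
            ));
        }
        sanity_check(child)?;
    }
    Ok(())
}
```

In this sketch, a node that requires ordering `["a", "b"]` over a child sorted only by `["a"]` is rejected, as is a pipeline-breaking node placed over an unbounded leaf. Returning `Result<(), String>` rather than a rewritten plan mirrors the "final gatekeeping rule" wording in the comment.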
@@ -248,7 +248,9 @@ fn require_top_ordering_helper(
     if children.len() != 1 {
         Ok((plan, false))
     } else if let Some(sort_exec) = plan.as_any().downcast_ref::<SortExec>() {
-        let req_ordering = sort_exec.properties().output_ordering().unwrap_or(&[]);
+        // In case of constant columns, output ordering of SortExec would give an empty set.
+        // Therefore; we check the sort expression field of the SortExec to assign the requirements.
+        let req_ordering = sort_exec.expr();
         let req_dist = sort_exec.required_input_distribution()[0].clone();
         let reqs = PhysicalSortRequirement::from_sort_exprs(req_ordering);
         Ok((
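The comment added in this hunk explains why the requirement is now taken from the sort expressions rather than from the reported output ordering. The sketch below illustrates the difference with a toy model. It assumes that the empty ordering arises because constant columns are dropped from the reported ordering; the diff itself does not show that mechanism, and `reported_output_ordering` and `requirement_from_sort_exprs` are hypothetical helpers, not DataFusion functions.

```rust
// Toy model only: hypothetical helpers, not DataFusion's API.

// Assumption: the ordering a sort reports omits columns known to be constant,
// since a constant column adds no ordering information.
fn reported_output_ordering<'a>(sort_exprs: &[&'a str], constants: &[&str]) -> Vec<&'a str> {
    sort_exprs
        .iter()
        .copied()
        .filter(|c| !constants.iter().any(|k| *k == *c))
        .collect()
}

// The approach after the change: use the sort expressions as written.
fn requirement_from_sort_exprs<'a>(sort_exprs: &[&'a str]) -> Vec<&'a str> {
    sort_exprs.to_vec()
}
```

With sort expressions `["a"]` and `a` known to be constant, the first helper returns an empty list, so a requirement derived from it would be lost, while the second still returns `["a"]`.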
334 changes: 0 additions & 334 deletions datafusion/core/src/physical_optimizer/pipeline_checker.rs

This file was deleted.