[BugFix] Fix incorrect execution plan when generated column rewrite in join relation if left table and right table has the same column name #52584

Merged

srlch
Contributor

@srlch srlch commented Nov 4, 2024

Why I'm doing:

This problem was introduced by PR #50398, which added new rules for generated column rewriting. The basic idea is as follows:

  1. Collect the rewrite relations in every SELECT scope of the query (Expr -> SlotRef).
  2. Translate these expression relations into an operator mapping: ScalarOperator -> ColumnRefOperator.
  3. Introduce a new rule, ReplaceScalarOperatorRule, which replaces each matching ScalarOperator with its ColumnRefOperator when the optimizer generates the logical plan.

The problem is that ReplaceScalarOperatorRule uses ScalarOperator.isEquivalent, rather than ScalarOperator.equals, to check whether a ScalarOperator hits the rule. ScalarOperator.isEquivalent does not check the operator id, but that id is what distinguishes columns that share a name yet come from different tables in a JOIN relation. For example, column xx in TABLE A and column xx in TABLE B have the same name but different ids, so ScalarOperator.isEquivalent returns true while ScalarOperator.equals returns false. In this case we get the wrong mapping and generate an incorrect plan for the generated column rewrite.
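
The mismatch can be reproduced in miniature. The sketch below is not StarRocks code: ColumnRef, its name-only isEquivalent, and the rewrite helper are hypothetical stand-ins, and the mapping key is simplified to a bare column rather than a full expression. It only illustrates how an id-insensitive comparison lets a column from table B hit a mapping that was built for table A.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

public class EquivalenceVsEquality {
    // Hypothetical stand-in for a column reference: an optimizer-assigned id plus a name.
    static final class ColumnRef {
        final int id;
        final String name;

        ColumnRef(int id, String name) {
            this.id = id;
            this.name = name;
        }

        // Assumed id-insensitive comparison: only the name is checked.
        static boolean isEquivalent(ColumnRef a, ColumnRef b) {
            return a.name.equals(b.name);
        }

        // Strict comparison: both id and name must match.
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ColumnRef)) {
                return false;
            }
            ColumnRef other = (ColumnRef) o;
            return id == other.id && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, name);
        }

        @Override
        public String toString() {
            return name + "#" + id;
        }
    }

    // Linear scan over the mapping; returns the replacement, or the input if nothing matches.
    static ColumnRef rewrite(Map<ColumnRef, ColumnRef> translateMap, ColumnRef input, boolean strict) {
        for (Map.Entry<ColumnRef, ColumnRef> e : translateMap.entrySet()) {
            boolean hit = strict
                    ? e.getKey().equals(input)
                    : ColumnRef.isEquivalent(e.getKey(), input);
            if (hit) {
                return e.getValue();
            }
        }
        return input;
    }

    // Rewrites table B's xx against a mapping that was built for table A's xx.
    static String demo(boolean strict) {
        ColumnRef aXx = new ColumnRef(1, "xx");         // xx from table A
        ColumnRef bXx = new ColumnRef(5, "xx");         // xx from table B
        ColumnRef genCol = new ColumnRef(3, "gen_col"); // generated column on table A
        Map<ColumnRef, ColumnRef> translateMap = new LinkedHashMap<>();
        translateMap.put(aXx, genCol);
        return rewrite(translateMap, bXx, strict).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // prints gen_col#3: B.xx wrongly replaced
        System.out.println(demo(true));  // prints xx#5: B.xx left untouched
    }
}
```

With the loose check, table B's column is silently replaced by table A's generated column, which is the kind of wrong mapping described above.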

What I'm doing:

  1. Use ScalarOperator.equals instead.
  2. Introduce the session variable disable_generated_column_rewrite, which disables the generated column rewrite when needed.
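
As a rough illustration of the second item, the sketch below gates a rewrite step on a boolean session flag. Session, maybeRewrite, and the string-based rewrite are hypothetical; only the flag name and its accessor names mirror the identifiers that appear in this PR, and the default value of false is an assumption.

```java
import java.util.function.UnaryOperator;

public class RewriteSwitchSketch {
    // Hypothetical stand-in for session state; the default value is an assumption.
    static final class Session {
        private boolean disableGeneratedColumnRewrite = false;

        boolean isDisableGeneratedColumnRewrite() {
            return disableGeneratedColumnRewrite;
        }

        void setDisableGeneratedColumnRewrite(boolean disableGeneratedColumnRewrite) {
            this.disableGeneratedColumnRewrite = disableGeneratedColumnRewrite;
        }
    }

    // Applies the rewrite only when the session has not switched it off.
    static String maybeRewrite(Session session, String expr, UnaryOperator<String> rewrite) {
        if (session.isDisableGeneratedColumnRewrite()) {
            return expr; // rewrite disabled: keep the original expression
        }
        return rewrite.apply(expr);
    }

    // Runs a toy rewrite ("xx + 1" -> "gen_col") with the flag on or off.
    static String demo(boolean disabled) {
        Session session = new Session();
        session.setDisableGeneratedColumnRewrite(disabled);
        UnaryOperator<String> toyRewrite = e -> e.equals("xx + 1") ? "gen_col" : e;
        return maybeRewrite(session, "xx + 1", toyRewrite);
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // prints gen_col
        System.out.println(demo(true));  // prints xx + 1
    }
}
```

Checking the flag at the single entry point of the rewrite keeps the switch easy to reason about: when it is set, no mapping is consulted at all.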

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

…relation if left table and right table has the same column name

Why I'm doing:
This problem was introduced by PR StarRocks#50398, which added new rules for generated column rewriting.
The basic idea is as follows:
1. Collect the rewrite relations in every SELECT scope of the query (Expr -> ColumnRef).
2. Translate these expression relations into an operator mapping: ScalarOperator -> ColumnRefOperator.
3. Introduce a new rule, ReplaceScalarOperatorRule, which replaces each matching ScalarOperator
   with its ColumnRefOperator when the optimizer generates the logical plan.

The problem is that ReplaceScalarOperatorRule uses ScalarOperator.isEquivalent, rather than ScalarOperator.equals,
to check whether a ScalarOperator hits the rule. ScalarOperator.isEquivalent does not check the operator
id, but that id is what identifies columns that share a name yet come from different tables in a JOIN
relation. For example, column xx in TABLE A and column xx in TABLE B have the same name but different ids, so
ScalarOperator.isEquivalent returns true while ScalarOperator.equals returns false. In this case we get the wrong
mapping and generate an incorrect plan for the generated column rewrite.

What I'm doing:
1. Use ScalarOperator.equals instead.
2. Introduce the session variable disable_generated_column_rewrite, which disables the generated column rewrite when needed.

Signed-off-by: srlch <[email protected]>
@srlch srlch requested review from a team as code owners November 4, 2024 09:51
@mergify mergify bot assigned srlch Nov 4, 2024
public void setDisableGeneratedColumnRewrite(boolean disableGeneratedColumnRewrite) {
    this.disableGeneratedColumnRewrite = disableGeneratedColumnRewrite;
}

public int getConnectorIncrementalScanRangeNumber() {
    return connectorIncrementalScanRangeSize;
}

The most risky bug in this code is:
The DISABLE_GENERATED_COLUMN_REWRITE variable may not be accounted for in the method that applies rewrites. If other flags interact with rewrite behavior, this oversight could cause unexpected issues.

You can modify the code like this:

// Assuming there is a method where rewrite decisions are applied.
// We need to ensure DISABLE_GENERATED_COLUMN_REWRITE is checked and used to control logic,
// particularly in any code path that performs or skips rewrites.
public void applyRewriteRules() {
    if (isDisableGeneratedColumnRewrite()) {
        // Add logic to skip generated column rewrite here
    } else {
        // Existing rewrite logic
    }
}

Ensure that wherever rewrite rules are applied, you include logic to handle the disableGeneratedColumnRewrite flag.

@@ -30,7 +30,7 @@ public ReplaceScalarOperatorRule(Map<ScalarOperator, ColumnRefOperator> translat
     @Override
     public ScalarOperator visit(ScalarOperator scalarOperator, ScalarOperatorRewriteContext context) {
         for (Map.Entry<ScalarOperator, ColumnRefOperator> m : translateMap.entrySet()) {
-            if (ScalarOperator.isEquivalent(m.getKey(), scalarOperator)) {
+            if (m.getKey().equals(scalarOperator)) {
Contributor

Why not use Map.containsKey?

Contributor Author

Updated
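
The updated code is not shown in this thread. As a hedged sketch of what the reviewer's suggestion amounts to: once matching relies on equals, the linear scan over entrySet() can be replaced by a direct map lookup, provided the key type implements equals and hashCode consistently. Col and rewrite below are hypothetical names, not StarRocks classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MapLookupSketch {
    // Hypothetical key type; equals and hashCode both cover id and name.
    static final class Col {
        final int id;
        final String name;

        Col(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Col)) {
                return false;
            }
            Col other = (Col) o;
            return id == other.id && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, name);
        }

        @Override
        public String toString() {
            return name + "#" + id;
        }
    }

    // Direct lookup: returns the mapped replacement, or the input when there is no entry.
    static Col rewrite(Map<Col, Col> translateMap, Col input) {
        return translateMap.getOrDefault(input, input);
    }

    // Looks up a column named xx with the given id against a mapping built for xx#1.
    static String demo(int inputId) {
        Map<Col, Col> translateMap = new HashMap<>();
        translateMap.put(new Col(1, "xx"), new Col(3, "gen_col"));
        return rewrite(translateMap, new Col(inputId, "xx")).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo(1)); // prints gen_col#3: same id, mapping applies
        System.out.println(demo(5)); // prints xx#5: different id, no rewrite
    }
}
```

The lookup is only as strict as the key's equals method, so this form gives the same id-sensitive behavior as the equals-based loop while avoiding the scan.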

Signed-off-by: srlch <[email protected]>
murphyatwork
murphyatwork previously approved these changes Nov 4, 2024
Seaven
Seaven previously approved these changes Nov 4, 2024
Signed-off-by: srlch <[email protected]>
@srlch srlch dismissed stale reviews from Seaven and murphyatwork via d3d513b November 4, 2024 14:21
Signed-off-by: srlch <[email protected]>
Signed-off-by: srlch <[email protected]>
sonarcloud bot commented Nov 5, 2024

@srlch srlch changed the title [BugFix] Fix incorrect execution plan when generated rewrite in join relation if left table and right table has the same column name [BugFix] Fix incorrect execution plan when generated column rewrite in join relation if left table and right table has the same column name Nov 5, 2024
github-actions bot commented Nov 5, 2024

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

github-actions bot commented Nov 5, 2024

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

github-actions bot commented Nov 5, 2024

[FE Incremental Coverage Report]

pass : 6 / 6 (100.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/sql/optimizer/rewrite/scalar/ReplaceScalarOperatorRule.java 2 2 100.00% []
🔵 com/starrocks/qe/SessionVariable.java 2 2 100.00% []
🔵 com/starrocks/sql/analyzer/QueryAnalyzer.java 2 2 100.00% []

@murphyatwork murphyatwork merged commit af20339 into StarRocks:main Nov 5, 2024
69 checks passed
github-actions bot commented Nov 5, 2024

@Mergifyio backport branch-3.3

@github-actions github-actions bot removed the 3.3 label Nov 5, 2024
github-actions bot commented Nov 5, 2024

@Mergifyio backport branch-3.2

@github-actions github-actions bot removed the 3.2 label Nov 5, 2024
Contributor

mergify bot commented Nov 5, 2024

backport branch-3.3

✅ Backports have been created

Contributor

mergify bot commented Nov 5, 2024

backport branch-3.2

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Nov 5, 2024
…n join relation if left table and right table has the same column name (#52584)

Signed-off-by: srlch <[email protected]>
(cherry picked from commit af20339)

# Conflicts:
#	fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
mergify bot pushed a commit that referenced this pull request Nov 5, 2024
…n join relation if left table and right table has the same column name (#52584)

Signed-off-by: srlch <[email protected]>
(cherry picked from commit af20339)

# Conflicts:
#	fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
srlch added a commit to srlch/starrocks that referenced this pull request Nov 5, 2024
…n join relation if left table and right table has the same column name (StarRocks#52584)

Signed-off-by: srlch <[email protected]>
srlch added a commit to srlch/starrocks that referenced this pull request Nov 5, 2024
…n join relation if left table and right table has the same column name (StarRocks#52584)

Signed-off-by: srlch <[email protected]>
Seaven pushed a commit that referenced this pull request Nov 6, 2024
…n join relation if left table and right table has the same column name (backport #52584) (#52643)

Signed-off-by: srlch <[email protected]>
Seaven pushed a commit that referenced this pull request Nov 6, 2024
…n join relation if left table and right table has the same column name (backport #52584) (#52644)

Signed-off-by: srlch <[email protected]>
renzhimin7 pushed a commit to renzhimin7/starrocks that referenced this pull request Nov 7, 2024
…n join relation if left table and right table has the same column name (StarRocks#52584)

Signed-off-by: srlch <[email protected]>
Signed-off-by: zhiminr.ren <[email protected]>
3 participants