Skip to content

Backport: Fix incorrect searched CASE optimization, Fix UNION field nullability tracking #78

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 30, 2025

Conversation

findepi
Copy link
Collaborator

@findepi findepi commented Jan 30, 2025

This is backport snapshot of PRs:

Let's cherry pick those solutions to the fork to unblock further work.

This cherry pick a snapshot of a PR offered upstream.

------------------------------------------------------

There is an optimization for searched CASE where values are of boolean
type. It was converting the expression like

    CASE
        WHEN X THEN A
        WHEN Y THEN B
        ..
        [ ELSE D ]
    END

into

    (X AND A)
        OR (Y AND NOT X AND B)
        [ OR (NOT (X OR Y) AND D) ]

This had the following problems

- does not work for nullable conditions. If X is nullable, we cannot use
  NOT (X) to compliment it. We need to use `X IS DISTINCT FROM true`
- it does not work correctly when some conditions are nullable and other
  values are false. E.g. X=NULL, A=true, Y=NULL, B=true, D=false, the
  CASE should return false, but the boolean expression will simplify to
  `(NULL AND ..) OR (NULL AND ..) OR (false)` which is NULL, not false
  - thus we use `X` for truthness check of `X`, we need to test `X IS
    NOT DISTINCT FROM true`
- it did not work correctly when default D is missing, but conditions
  do not evaluate to NULL. CASE's result should be NULL but was false.

This commit fixes that optimization.
This cherry pick a snapshot of a PR offered upstream.

------------------------------------------------------

This commit fixes two bugs related to UNION handling

- when constructing union plan nullability of the other union branch was
  ignored, thus resulting field could easily have incorrect nullability
- when pruning/simplifying projects, in `recompute_schema` function
  there was similar logic, thus loosing nullability information even for
  correctly constructed Union plan node

As a result, other optimizer logic (e.g. `expr_simplifier.rs`) could
draw incorrect conclusions and thus lead to incorrect query results, as
demonstrated with the attached SLT test.
@findepi findepi merged commit 958cb3c into sdf/43 Jan 30, 2025
46 checks passed
@findepi findepi deleted the findepi/sdf/43/case-null-union-null branch January 30, 2025 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants