You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Apply pre-selection and computation skipping to short-circuit optimization (#15694)
* Enhance short-circuit evaluation for binary expressions
- Delay evaluation of the right-hand side (RHS) unless necessary.
- Optimize short-circuiting for `Operator::And` and `Operator::Or` by checking LHS alone first.
- Introduce `get_short_circuit_result` function to determine short-circuit conditions based on LHS and RHS.
- Update tests to cover various short-circuit scenarios for both `AND` and `OR` operations.
* refactor: rename test_check_short_circuit to test_get_short_circuit_result and update assertions
- Renamed the test function for clarity.
- Updated assertions to use get_short_circuit_result instead of check_short_circuit.
- Added additional test cases for AND and OR operations with expected results.
* fix: enhance short-circuit evaluation logic in get_short_circuit_result function for null
- Updated AND and OR short-circuit conditions to only trigger when all values are either false or true, respectively, and there are no nulls in the array.
- Adjusted test case to reflect the change in expected output.
* feat: add debug logging for binary expression evaluation and short-circuit checks
* fix: improve short-circuit evaluation logic in BinaryExpr to ensure RHS is only evaluated when necessary
* fix: restrict short-circuit evaluation to logical operators in get_short_circuit_result function
* add more println!("==> ");
* fix: remove duplicate data type checks for left and right operands in BinaryExpr evaluation
* feat: add debug prints for dictionary values and keys in binary expression tests
* Tests pass
* fix: remove redundant short-circuit evaluation check in BinaryExpr and enhance documentation for get_short_circuit_result
* refactor: remove unnecessary debug prints and streamline short-circuit evaluation in BinaryExpr
* test: enhance short-circuit evaluation tests for nullable and scalar values in BinaryExpr
* add benchmark
* refactor: improve short-circuit logic in BinaryExpr for logical operators
- Renamed `arg` to `lhs` for clarity in the `get_short_circuit_result` function.
- Updated handling of Boolean data types to return `None` for null values.
- Simplified short-circuit checks for AND/OR operations by consolidating logic.
- Enhanced readability and maintainability of the code by restructuring match statements.
* refactor: enhance short-circuit evaluation strategy in BinaryExpr to optimize logical operations
* Revert "refactor: enhance short-circuit evaluation strategy in BinaryExpr to optimize logical operations"
This reverts commit a62df47.
* bench: add benchmark for OR operation with all false values in short-circuit evaluation
* refactor: add ShortCircuitStrategy enum to optimize short-circuit evaluation in BinaryExpr
- Replaced the lazy evaluation of the right-hand side (RHS) with immediate evaluation based on short-circuiting logic.
- Introduced a new function `check_short_circuit` to determine if short-circuiting can be applied for logical operators.
- Updated the logic to return early for `Operator::And` and `Operator::Or` based on the evaluation of the left-hand side (LHS) and the conditions of the RHS.
- Improved clarity and efficiency of the short-circuit evaluation process by eliminating unnecessary evaluations.
* refactor: simplify short-circuit evaluation logic in check_short_circuit function
* datafusion_expr::lit as expr_lit
* refactor: optimize short-circuit evaluation in check_short_circuit function
- Simplified logic for AND/OR operations by prioritizing false/true counts to enhance performance.
- Updated documentation to reflect changes in array handling techniques.
* refactor: add count_boolean_values helper function and optimize check_short_circuit logic
- Introduced a new helper function `count_boolean_values` to count true and false values in a BooleanArray, improving readability and performance.
- Updated `check_short_circuit` to utilize the new helper function for counting, reducing redundant operations and enhancing clarity in the evaluation logic for AND/OR operations.
- Adjusted comments for better understanding of the short-circuiting conditions based on the new counting mechanism.
* Revert "refactor: add count_boolean_values helper function and optimize check_short_circuit logic"
This reverts commit e2b9f77.
* optimise evaluate
* optimise evaluate 2
* refactor op:AND, lhs all false op:OR, lhs all true to be faster
* fix clippy warning
* refactor: optimize short-circuit evaluation logic in check_short_circuit function
* fix clippy warning
* add pre selection
* add some comments
* [WIP] fix pre-selection result
* fix: Error in calculating the ratio
* fix: Correct typo in pre-selection threshold constant and improve pre-selection scatter function documentation
* fix doctest error
* fix cargo doc
* fix cargo doc
* test: Add unit tests for pre_selection_scatter function
---------
Co-authored-by: Siew Kam Onn <[email protected]>
0 commit comments