feat: make `ExpressionHandler::get_evaluator` fallible #577

cg-cognition · 2024-12-07T22:25:22Z

Fixes #566. Makes get_evaluator() return DeltaResult<Arc<dyn ExpressionEvaluator>> so that errors raised while creating an evaluator can be propagated to the caller. This change is intended to let get_evaluator() perform eager validation, but it does not add the validation itself.

What changes are proposed in this pull request?

Here's a comprehensive summary of all changes made:

Core Interface Change (lib.rs):

Modified ExpressionHandler trait to make get_evaluator return DeltaResult<Arc<dyn ExpressionEvaluator>>
This change makes error handling explicit at the trait level

Implementation Changes (arrow_expression.rs):

Updated ArrowExpressionHandler to return Ok(Arc::new(...))
Simple change since this implementation can't fail
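To make the interface change concrete, here is a minimal, self-contained sketch of a fallible get_evaluator and an implementation that cannot fail. The Error type, the AddOffset evaluator, the InfallibleHandler, the run helper, and the single offset parameter are hypothetical stand-ins chosen for illustration; the kernel's real trait takes different arguments.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for the kernel's error and result types.
#[derive(Debug)]
pub struct Error(pub String);
pub type DeltaResult<T> = Result<T, Error>;

pub trait ExpressionEvaluator {
    fn evaluate(&self, input: i64) -> DeltaResult<i64>;
}

pub trait ExpressionHandler {
    // Creating the evaluator can now report an error, instead of
    // returning a bare Arc<dyn ExpressionEvaluator>.
    // The `offset` parameter is a simplification for this sketch.
    fn get_evaluator(&self, offset: i64) -> DeltaResult<Arc<dyn ExpressionEvaluator>>;
}

// Hypothetical evaluator that adds a fixed offset to its input.
struct AddOffset(i64);

impl ExpressionEvaluator for AddOffset {
    fn evaluate(&self, input: i64) -> DeltaResult<i64> {
        Ok(input + self.0)
    }
}

pub struct InfallibleHandler;

impl ExpressionHandler for InfallibleHandler {
    fn get_evaluator(&self, offset: i64) -> DeltaResult<Arc<dyn ExpressionEvaluator>> {
        // This implementation cannot fail, so it only wraps the value in Ok.
        Ok(Arc::new(AddOffset(offset)))
    }
}

// A call site propagates creation errors with `?`.
pub fn run(handler: &dyn ExpressionHandler, input: i64) -> DeltaResult<i64> {
    let evaluator = handler.get_evaluator(10)?;
    evaluator.evaluate(input)
}
```

Calling run(&InfallibleHandler, 5) creates the evaluator, applies it, and returns the result wrapped in Ok.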

Error Handling Improvements (data_skipping.rs):

Simplified error handling by chaining .ok() with the ? operator
Made the code more idiomatic while maintaining error propagation
Applied to select_stats_evaluator, skipping_evaluator, and filter_evaluator
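As a rough illustration of the .ok()? pattern in a constructor that returns Option, consider the sketch below. The make_evaluator function and the Filter struct are hypothetical and use strings in place of real evaluators; they are not the kernel's DataSkippingFilter.

```rust
// Hypothetical error and result types for this sketch.
#[derive(Debug)]
struct Error(String);
type DeltaResult<T> = Result<T, Error>;

// Hypothetical fallible constructor standing in for get_evaluator.
fn make_evaluator(expr: &str) -> DeltaResult<String> {
    if expr.is_empty() {
        Err(Error("empty expression".to_string()))
    } else {
        Ok(format!("evaluator({expr})"))
    }
}

struct Filter {
    select_stats_evaluator: String,
    skipping_evaluator: String,
}

impl Filter {
    // Returns None as soon as any evaluator cannot be created.
    fn new(stats_expr: &str, skipping_expr: &str) -> Option<Self> {
        // `.ok()` turns Result<T, E> into Option<T>, dropping the error;
        // `?` then returns None early from this Option-returning function.
        let select_stats_evaluator = make_evaluator(stats_expr).ok()?;
        let skipping_evaluator = make_evaluator(skipping_expr).ok()?;
        Some(Self {
            select_stats_evaluator,
            skipping_evaluator,
        })
    }
}
```

The trade-off is that .ok() discards the error value, so the caller only sees that no filter was built and not why.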

Iterator Changes (log_replay.rs):

Changed to boxed iterator to handle both success and error cases
Added explicit error handling path using iter::once(Err(e))
Maintains original behavior while properly propagating errors
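The boxed-iterator approach described above can be sketched as follows. The names make_filter and action_iter and the integer items are assumptions made for this example and do not mirror the kernel's actual signatures.

```rust
use std::iter;

// Hypothetical error and result types for this sketch.
#[derive(Debug, PartialEq)]
struct Error(String);
type DeltaResult<T> = Result<T, Error>;

// Hypothetical fallible setup step (stands in for creating an evaluator).
fn make_filter(threshold: i64) -> DeltaResult<i64> {
    if threshold < 0 {
        Err(Error("negative threshold".to_string()))
    } else {
        Ok(threshold)
    }
}

// The two match arms produce different concrete iterator types,
// so both are boxed behind a single trait object.
fn action_iter(items: Vec<i64>, threshold: i64) -> Box<dyn Iterator<Item = DeltaResult<i64>>> {
    match make_filter(threshold) {
        // Success: yield each item that passes the filter, wrapped in Ok.
        Ok(min) => Box::new(
            items
                .into_iter()
                .filter(move |v| *v >= min)
                .map(Ok::<i64, Error>),
        ),
        // Failure: yield exactly one Err so the consumer observes the error.
        Err(e) => Box::new(iter::once(Err(e))),
    }
}
```

An alternative, which a later commit on this branch adopts for scan_action_iter, is to return DeltaResult<impl Iterator> so that the setup error surfaces before iteration starts and no boxing is needed.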

Call Site Updates:

Updated all get_evaluator calls to handle Result type
Modified in transaction.rs, scan/mod.rs, and table_changes/log_replay.rs
Added proper error propagation using ? operator

How was this change tested?

Tested with cargo test --all-features --all-targets -- --skip read_table_version_hdfs. I had trouble getting Java set up, but that test does not seem relevant to this PR. No new tests are needed, since behavior is unchanged: every existing implementation still succeeds.

This PR affects the following public APIs

Allows get_evaluator to return a Result type, which opens up the possibility to do eager schema validation.
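This PR does not add any validation, but the sketch below shows the kind of eager check a fallible signature makes possible. Everything here is hypothetical: the free function get_evaluator, the ColumnRef evaluator, and the schema represented as a slice of column names are simplifications for illustration only.

```rust
use std::sync::Arc;

// Hypothetical error and result types for this sketch.
#[derive(Debug, PartialEq)]
pub struct Error(pub String);
pub type DeltaResult<T> = Result<T, Error>;

pub trait ExpressionEvaluator {
    fn evaluate(&self, row: &[i64]) -> DeltaResult<i64>;
}

// Hypothetical evaluator that reads one column by position.
struct ColumnRef {
    index: usize,
}

impl ExpressionEvaluator for ColumnRef {
    fn evaluate(&self, row: &[i64]) -> DeltaResult<i64> {
        row.get(self.index)
            .copied()
            .ok_or_else(|| Error("row too short".to_string()))
    }
}

// Validates the column name against the schema when the evaluator is
// created, rather than failing later on the first call to evaluate.
pub fn get_evaluator(schema: &[&str], column: &str) -> DeltaResult<Arc<dyn ExpressionEvaluator>> {
    let index = schema
        .iter()
        .position(|name| *name == column)
        .ok_or_else(|| Error(format!("unknown column: {column}")))?;
    Ok(Arc::new(ColumnRef { index }))
}
```

With this shape, a reference to a missing column is rejected when the evaluator is requested, before any data is read.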

Additional Context

This PR was entirely written by Devin with a little bit of review from me. Happy to address any feedback and get this over the finish line.

Original Run: https://preview.devin.ai/sessions/e839a4e9bcb444b69b99205babdcf1af

Make get_evaluator() return DeltaResult<Arc<dyn ExpressionEvaluator>> to properly handle potential errors. Update all call sites to handle the Result type and simplify error handling using the ok()? operator where appropriate.

kernel/src/scan/data_skipping.rs

kernel/src/scan/log_replay.rs

…tions
- Modified scan_action_iter to return DeltaResult<impl Iterator>
- Fixed code formatting in log_replay.rs files
- Reordered imports in mod.rs for better readability
- Improved error propagation in transaction.rs

Co-Authored-By: Calvin Giroud <[email protected]>

Added warning logs when evaluator creation fails in DataSkippingFilter:
- Log stats selector evaluator failures
- Log skipping evaluator failures
- Log filter evaluator failures

This improves debuggability when data skipping does not occur as expected.

Co-Authored-By: Calvin Giroud <[email protected]>


scovich

Please update the PR description to correctly follow the breaking change template -- our release management scrapes PR descriptions to build the changelog. It looks like the commented-out template was deleted from the PR description, so you'll need to dig it up from some other PR.

scovich · 2024-12-11T15:58:51Z

kernel/src/scan/data_skipping.rs

+            .map_err(|e| {
+                warn!("Failed to create stats selector evaluator: {}", e);
+                e
+            })


Suggested change

-            .map_err(|e| {
-                warn!("Failed to create stats selector evaluator: {}", e);
-                e
-            })
+            .inspect_err(|e| warn!("Failed to create stats selector evaluator: {e}"))

(more below)
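For comparison, here is a small sketch of the two forms side by side. It uses eprintln! in place of warn! to stay within the standard library, and the create function is a hypothetical stand-in for evaluator creation. Result::inspect_err is available in the standard library from Rust 1.76.

```rust
// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct Error(String);

// Hypothetical fallible operation standing in for evaluator creation.
fn create(fail: bool) -> Result<u32, Error> {
    if fail {
        Err(Error("boom".to_string()))
    } else {
        Ok(7)
    }
}

// Original form: map_err must hand the error back unchanged after logging.
fn with_map_err(fail: bool) -> Option<u32> {
    create(fail)
        .map_err(|e| {
            eprintln!("Failed to create evaluator: {e:?}");
            e
        })
        .ok()
}

// Suggested form: inspect_err borrows the error for the side effect
// and passes the Result through untouched.
fn with_inspect_err(fail: bool) -> Option<u32> {
    create(fail)
        .inspect_err(|e| eprintln!("Failed to create evaluator: {e:?}"))
        .ok()
}
```

Both functions log on failure and return the same value; inspect_err simply states the intent (observe, do not transform) more directly.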

scovich · 2024-12-11T16:07:24Z

kernel/src/scan/log_replay.rs

-    action_iter
+    )?;
+
+    Ok(action_iter


nit: it's very subjective, but in general I find this sort of multi-line, multi-operation Ok (or Some, etc) to be unnecessarily hard to read because of all the nesting. Factoring it out as a value is easier to read (and change later):

let actions = action_iter ... .filter(...);
Ok(actions)

Wrapping a single function call or struct creation isn't so bad, because conceptually only one thing is happening, e.g. this code above is not particularly difficult to read:

Ok(Arc::new(DefaultExpressionEvaluator { ... }))

Clippy seems to take a similar stance on monadic chains -- it will almost always force newlines between chained function calls, if more than one of the functions takes args or even if the line gets very long (long before the 100-char line limit).
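To illustrate the readability point with a toy example (the function names and the even-squares logic are made up for this sketch and are unrelated to log replay), compare a chain nested inside Ok with the same chain bound to a name first:

```rust
// Nested form: the whole multi-step chain sits inside Ok(...).
fn even_squares_nested(input: &[i64]) -> Result<Vec<i64>, String> {
    Ok(input
        .iter()
        .copied()
        .filter(|v| v % 2 == 0)
        .map(|v| v * v)
        .collect())
}

// Factored form: bind the chain to a value, then wrap it.
fn even_squares_factored(input: &[i64]) -> Result<Vec<i64>, String> {
    let squares = input
        .iter()
        .copied()
        .filter(|v| v % 2 == 0)
        .map(|v| v * v)
        .collect();
    Ok(squares)
}
```

The two functions behave identically; the factored form just keeps the wrapping step visually separate from the pipeline, which also makes later edits to either part smaller.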

cg-cognition · 2024-12-11T18:41:56Z

Updated the PR description, not quite sure if I did it right! Let me know after you review.

scovich

LGTM except one new type annotation that seems unnecessary?

kernel/src/scan/log_replay.rs


codecov · 2024-12-11T19:36:40Z

Codecov Report

Attention: Patch coverage is 66.03774% with 18 lines in your changes missing coverage. Please review.

Project coverage is 83.36%. Comparing base (be1453f) to head (fa7c9ff).

Files with missing lines	Patch %	Lines
kernel/src/scan/data_skipping.rs	70.83%	0 Missing and 7 partials ⚠️
kernel/src/scan/mod.rs	37.50%	1 Missing and 4 partials ⚠️
kernel/src/transaction.rs	0.00%	0 Missing and 2 partials ⚠️
kernel/src/engine/default/mod.rs	0.00%	0 Missing and 1 partial ⚠️
kernel/src/scan/log_replay.rs	90.90%	0 Missing and 1 partial ⚠️
kernel/src/table_changes/log_replay.rs	0.00%	0 Missing and 1 partial ⚠️
kernel/src/table_changes/scan.rs	0.00%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #577      +/-   ##
==========================================
- Coverage   83.46%   83.36%   -0.11%     
==========================================
  Files          74       74              
  Lines       16858    16854       -4     
  Branches    16858    16854       -4     
==========================================
- Hits        14071    14050      -21     
- Misses       2129     2130       +1     
- Partials      658      674      +16


feat: make ExpressionHandler::get_evaluator fallible

c6025aa


github-actions bot assigned cg-cognition Dec 7, 2024

cg-cognition mentioned this pull request Dec 8, 2024

Engine::get_evaluator should be fallible #566

Open

Merge branch 'main' into devin/1733609186-fallible-get-evaluator

468f988

github-actions bot added the breaking-change Change that will require a version bump label Dec 9, 2024

cg-cognition added 2 commits December 9, 2024 16:44

fix merge

f243d42

Merge branch 'main' into devin/1733609186-fallible-get-evaluator

8302483

scovich reviewed Dec 10, 2024


kernel/src/scan/data_skipping.rs (resolved)

kernel/src/scan/log_replay.rs (outdated, resolved)

scovich requested a review from nicklan December 10, 2024 20:38

devin-ai-integration bot and others added 8 commits December 11, 2024 01:14

Merge upstream/main into devin/1733609186-fallible-get-evaluator, res…

cb613dc

…olving conflicts Co-Authored-By: Calvin Giroud <[email protected]>

fix merge

9f6e617

remove comment

91bfcd8

fix all the warnings

7d03266

format

b4bad36

fix

e7b01b2

cg-cognition requested a review from scovich December 11, 2024 02:43

scovich reviewed Dec 11, 2024


address comments

6be446d

cg-cognition requested a review from scovich December 11, 2024 17:47

Merge branch 'main' into devin/1733609186-fallible-get-evaluator

0b838f0

scovich approved these changes Dec 11, 2024

kernel/src/scan/log_replay.rs (outdated, resolved)

cg-cognition added 3 commits December 11, 2024 14:17

remove types

22f9a3a

Merge branch 'devin/1733609186-fallible-get-evaluator' of github.com:…

51295ea

…devin-open-source/delta-kernel-rs into devin/1733609186-fallible-get-evaluator

format

fa7c9ff

cg-cognition added 2 commits December 13, 2024 07:50

Merge branch 'main' into devin/1733609186-fallible-get-evaluator

4178d01

fix merge

d8bc768
