Support RowDeltaAction #1104

Open
3 tasks
ZENOTME opened this issue Mar 18, 2025 · 6 comments
Labels
enhancement New feature or request

Comments

@ZENOTME
Contributor

ZENOTME commented Mar 18, 2025

Is your feature request related to a problem or challenge?

As discussed in #798 and #1081, there is a need to append delete files (position deletes and equality deletes). This action is meant to support appending files of this kind.
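
As a rough illustration of the shape such an action could take, here is a minimal sketch. Everything in it is an assumption made for illustration: `RowDeltaSketch`, `DeleteKind`, `DeleteFile`, and the method names are hypothetical and are not existing iceberg-rust APIs. It only shows one action collecting data files together with both kinds of delete files.

```rust
/// The two kinds of delete files named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeleteKind {
    Position,
    Equality,
}

/// Hypothetical, simplified delete file descriptor (path plus kind only).
#[derive(Debug, Clone)]
pub struct DeleteFile {
    pub path: String,
    pub kind: DeleteKind,
}

/// Hypothetical action that stages data files and delete files together
/// so they could later be committed in a single snapshot.
#[derive(Debug, Default)]
pub struct RowDeltaSketch {
    data_files: Vec<String>,
    delete_files: Vec<DeleteFile>,
}

impl RowDeltaSketch {
    pub fn new() -> Self {
        Self::default()
    }

    /// Stage a data file containing newly inserted rows.
    pub fn add_rows(mut self, path: &str) -> Self {
        self.data_files.push(path.to_string());
        self
    }

    /// Stage a position or equality delete file.
    pub fn add_deletes(mut self, path: &str, kind: DeleteKind) -> Self {
        self.delete_files.push(DeleteFile {
            path: path.to_string(),
            kind,
        });
        self
    }

    /// Number of staged data files.
    pub fn data_file_count(&self) -> usize {
        self.data_files.len()
    }

    /// Number of staged delete files of the given kind.
    pub fn count_deletes(&self, kind: DeleteKind) -> usize {
        self.delete_files.iter().filter(|f| f.kind == kind).count()
    }
}
```

With this sketch, a caller would chain `RowDeltaSketch::new().add_rows(...).add_deletes(..., DeleteKind::Position)` and then hand the staged files to whatever commit path the real action ends up using.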

Describe the solution you'd like

The path to support:

Willingness to contribute

None

@ZENOTME
Contributor Author

ZENOTME commented Mar 18, 2025

cc @Fokko @liurenjie1024

@jonathanc-n
Contributor

For metadata conflict detection, what is the exact design outline that you are looking to implement?

For row-level detection, I can start implementing the manifest filter manager and manifest merge manager, building towards the merging snapshot producer used in RowDelta. This can probably be done after delete files are fully implemented.

@ZENOTME
Contributor Author

ZENOTME commented Mar 21, 2025

> For metadata conflict detection, what is the exact design outline that you are looking to implement?

Conflict detection would be implemented on top of a validation phase. I would like to introduce a validation phase in SnapshotProduce apply(). Once that is in place, we can add a specific implementation for each kind of validation.
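
A minimal sketch of what such a validation phase might look like, under stated assumptions: `SnapshotSummary`, `SnapshotValidator`, `ReferencedDataFilesExist`, and `run_validation` are hypothetical names invented for this example, not existing iceberg-rust types. The idea is only that each kind of validation implements one trait, and the phase runs all of them against concurrently committed snapshots before anything is produced.

```rust
/// Hypothetical, simplified view of a concurrently committed snapshot.
#[derive(Debug, Clone)]
pub struct SnapshotSummary {
    pub snapshot_id: i64,
    /// Paths of data files removed by this snapshot.
    pub removed_data_files: Vec<String>,
}

/// Error returned when a validation detects a conflict.
#[derive(Debug, PartialEq)]
pub struct ValidationError(pub String);

/// One kind of validation; each conflict check implements this trait.
pub trait SnapshotValidator {
    fn validate(&self, concurrent: &[SnapshotSummary]) -> Result<(), ValidationError>;
}

/// Example check: fail if a concurrent commit removed a data file
/// that the delete files being added still reference.
pub struct ReferencedDataFilesExist {
    pub referenced: Vec<String>,
}

impl SnapshotValidator for ReferencedDataFilesExist {
    fn validate(&self, concurrent: &[SnapshotSummary]) -> Result<(), ValidationError> {
        for snap in concurrent {
            for path in &self.referenced {
                if snap.removed_data_files.contains(path) {
                    return Err(ValidationError(format!(
                        "data file {} was removed by snapshot {}",
                        path, snap.snapshot_id
                    )));
                }
            }
        }
        Ok(())
    }
}

/// The validation phase itself: run every registered validator and
/// stop at the first conflict. In this sketch it would be called at
/// the start of apply(), before any manifests are written.
pub fn run_validation(
    validators: &[Box<dyn SnapshotValidator>],
    concurrent: &[SnapshotSummary],
) -> Result<(), ValidationError> {
    for validator in validators {
        validator.validate(concurrent)?;
    }
    Ok(())
}
```

Keeping each check behind a trait means new kinds of validation can be added without touching the phase that runs them.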

> For row-level detection, I can start implementing the manifest filter manager and manifest merge manager, building towards the merging snapshot producer used in RowDelta. This can probably be done after delete files are fully implemented.

Thanks @jonathanc-n!

@jonathanc-n
Contributor

I'll take a deeper look into the implementation tomorrow

@jonathanc-n
Contributor

@ZENOTME Sorry for the delay, I was a little busy with family. In the example you showed from the Python implementation, the filtering for OverwriteFiles and DeleteFiles seems to be inherited from the snapshot producer. We could create a MergingSnapshotManager or something similar and have it be called whenever an action produces a new snapshot. I think we still need a few things before this can happen.

@jonathanc-n
Contributor

jonathanc-n commented Mar 24, 2025

One thing I would like to implement before the filter manager is residual predicates, mentioned by @Fokko here

2 participants