-
Notifications
You must be signed in to change notification settings - Fork 239
Support commit retrie #964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Thanks @ZENOTME for raising this. The core part of commit of conflict detection for different isolation levels, which is quite hard to implement correctly. Retry itself is not a big problem, it's just a reload of table metadata and do conflict detection. I'm not familiar with |
Thanks @liurenjie1024. I will take some effort to investigate more about conflict detection. |
Hi, I take some time to figure out the whole commit process. The whole commit phase can be described as follows:
For now, we only support FastAppend. So we can complete the whole process based on the FastAppend first and complete conflict detection when we introduce other update actions. How do you think? @liurenjie1024 |
This sounds reasonable to me. |
Let's move. I will work on this later |
@ZENOTME Are you currently working on this? I'm looing to make some progress on this |
Hi, I believe it should not be an extra kind. Instead, we can add a |
Hi @jonathanc-n, for the transaction part, we better waiting for #949 to merge first because it involves interface change. Maybe you can on the retryable error first. |
LGTM! |
I'm a bit confused about this part. First of all should we do a operation match and not have to append requirements if there are only append operations. What do you think about a metadata check (a check if new metadata was already committed, if table already exists, etc.) before we append updates or requirements. The process would be to:
Any alternate suggestions or fixes? cc @liurenjie1024 @Fokko @Xuanwo @sdd @ZENOTME @c-thiel |
Sorry for replying late.
I think we still need an append requirement for the append operation. It just doesn't need the conflict detection, but it still needs a requirement. According to my understanding, the requirement is to check whether there are concurrent snapshot updates. It uses If the catalog identifies that uuid didn't match, it means that there are concurrent updates. The catalog returns
If metadata check fail, I think we should abort because it means that not also there are concurrent update but also conflict. As do in java: https://github.com/apache/iceberg/blob/6d8653bb2bbcecdbe38e98da60ef5578f083e933/core/src/main/java/org/apache/iceberg/BaseTransaction.java#L512 |
I would like to separate this task into multiple steps:
We can introduce a new
ErrorKind::RetryableCommitError
to abstract kinds of catalog errors.About the retry library, personally, I think https://github.com/Xuanwo/backon can be a good candidate. Its maintainer is @Xuanwo. (Thanks for this great job!)
Welcome more suggestions and elaborations. cc @Fokko @Xuanwo @liurenjie1024 @sdd
The text was updated successfully, but these errors were encountered: