Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Narrow the set of files checked by compaction commit conditional mutation #5117

Closed
keith-turner opened this issue Nov 27, 2024 · 3 comments · Fixed by #5153
Closed

Narrow the set of files checked by compaction commit conditional mutation #5117

keith-turner opened this issue Nov 27, 2024 · 3 comments · Fixed by #5153
Assignees
Labels
bug This issue has been verified to be a bug.
Milestone

Comments

@keith-turner
Copy link
Contributor

keith-turner commented Nov 27, 2024

Describe the bug

For a popular tablet with lots of compactions and bulk imports its set of files is rapidly changing. The compaction commit code reads all the tablets files does some checks against those files and then submits a conditional mutation that has a condition that tablets set of files is exactly the same as what it read earlier. On a busy tablet this condition will often fail causing the entire process to retry and fail a bunch of times before finally succeeding.

Expected behavior

The code should be modified to make a more narrow check. The condition does not need to require all the tablets files to be same. The condition on the mutation could instead check that the tablet must have the files involved in the compaction w/o caring about what other files the tablet might have. This more narrow check would make the conditional mutation much less likely to conflict with other changes being made by concurrent bulk imports and compaction to a tablet.

@keith-turner keith-turner added the bug This issue has been verified to be a bug. label Nov 27, 2024
@keith-turner keith-turner added this to the 4.0.0 milestone Nov 27, 2024
@keith-turner
Copy link
Contributor Author

To fix this probably need to add a method like requireFiles(Set<StoredTabletFile> files) to ample. After adding that could survey the code to see if other places could benefit from a more narrow check.

@cshannon cshannon self-assigned this Dec 4, 2024
@cshannon
Copy link
Contributor

cshannon commented Dec 4, 2024

I can take a look at this issue later this week

@keith-turner
Copy link
Contributor Author

In addition to causing thread contention this issue has the potential to cause memory problems. Ran into the following problem.

Was using the changes in apache/accumulo-testing#287 to generate bulk imports. This bug in 287 apache/accumulo-testing#288 caused some bulk imports to a single tablet to have lots of files. Did not have the 288 bug fix in testing and ended up with a tablet w/ 25K files. There were lots of concurrent compactions running against this tablet. When these finished they would create a conditional mutation which included the 25k files in a condition. These conditional mutations would cause lots of memory pressure on the tablet server.

cshannon added a commit to cshannon/accumulo that referenced this issue Dec 8, 2024
Previously the conditional mutation to commit a compaction would require
all files in the tablet be the same as read earlier and on a busy tablet
this could fail and retry often. The check has now been narrowed to only
verify that the files involved with the compaction still exist. A new
method was added to Ample called requireFiles(Set<StoredTabletFile> files)
which creates a condition for each file column to verify each one
exists.

This closes apache#5117
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue has been verified to be a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants