Improve performance of seeding split fate operations #5160

Open
keith-turner opened this issue Dec 10, 2024 · 1 comment
Labels
enhancement This issue describes a new feature, improvement, or optimization.
Milestone
4.0.0

Comments

@keith-turner
Contributor

Is your feature request related to a problem? Please describe.

The changes in #5122 improved the performance of seeding fate split operations. However, there is still room for improvement. This code seeds a single split operation, which writes a single conditional mutation. Writing one conditional mutation at a time requires waiting for it to be committed in the write ahead log. We could get much better performance by writing many conditional mutations at once, as this would allow many to be written to the write ahead log before waiting for all of them to be committed.
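
To make the batching idea concrete, here is a minimal sketch that contrasts the two write patterns using Accumulo's public ConditionalWriter API (not the actual fate store internals); the class and method names are hypothetical:

  import java.util.Iterator;
  import java.util.List;

  import org.apache.accumulo.core.client.ConditionalWriter;
  import org.apache.accumulo.core.client.ConditionalWriter.Result;
  import org.apache.accumulo.core.data.ConditionalMutation;

  public class BatchWriteSketch {

    // Slow path: each call waits for that single mutation to be committed to
    // the write ahead log before the next one is written.
    static void writeOneAtATime(ConditionalWriter writer, List<ConditionalMutation> muts)
        throws Exception {
      for (ConditionalMutation cm : muts) {
        Result r = writer.write(cm); // one WAL commit wait per mutation
        r.getStatus();
      }
    }

    // Batched path: hand the writer the whole batch so many mutations reach
    // the write ahead log together, then wait once for all of the results.
    static void writeBatch(ConditionalWriter writer, List<ConditionalMutation> muts)
        throws Exception {
      Iterator<Result> results = writer.write(muts.iterator());
      while (results.hasNext()) {
        results.next().getStatus(); // real code would check status, retry UNKNOWN
      }
    }
  }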

Describe the solution you'd like

The code for seeding split fate operations should be restructured so that it takes a batch of tablets that need split operations seeded and writes the entire batch to a conditional writer. This would avoid the single writes. The current code avoids two problems, and we would want to continue avoiding them when restructuring the code.

The first problem the current code avoids is attempting to seed the same tablet for split while it is already in the process of being seeded. This can happen in a situation like 100 tablets needing to split, where the tablet group watcher keeps finding them and queueing them up for seeding over and over. This does not cause correctness problems; it just wastes resources.

The second problem the current code avoids is making the tablet group watcher wait on seeding split operations. If there is a problem writing to the fate table to seed splits, this will not block the tablet group watcher. Instead it eventually starts dropping/ignoring requests to seed split operations.

@keith-turner keith-turner added the enhancement This issue describes a new feature, improvement, or optimization. label Dec 10, 2024
@keith-turner keith-turner added this to the 4.0.0 milestone Dec 10, 2024
@keith-turner keith-turner changed the title Improve performance of seeing split fate operations Improve performance of seeding split fate operations Dec 10, 2024
@keith-turner
Copy link
Contributor Author

Looked into how this could be done and came up with one possible approach that could be broken into two PRs. The first PR could modify FateStore to remove the following method.

  Optional<FateId> seedTransaction(Fate.FateOperation txName, FateKey fateKey, Repo<T> repo,
      boolean autoCleanUp);

and replace it with the following

  interface Seeder<T> extends AutoCloseable {
    // For UserFateStore this would create a conditional mutation and add it to the conditional writer
    void attemptToSeedTransaction(Fate.FateOperation txName, FateKey fateKey, Repo<T> repo,
                    boolean autoCleanUp);
    // This would check the status of all added conditional mutations, retry unknown, and then close the conditional writer.
    void close();
  }

  // Creates a conditional writer for the user fate store. For ZooKeeper this code will probably
  // do the same thing it's currently doing, as ZooKeeper does not support multi-node operations.
  public Seeder<T> beginSeeding();

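One hypothetical way UserFateStore might implement this Seeder is to buffer mutations as they are added and flush them through a ConditionalWriter on close. The class below is a sketch, not existing code; to stay self-contained it takes prebuilt mutations instead of the fate arguments:

  import java.util.ArrayList;
  import java.util.Iterator;
  import java.util.List;

  import org.apache.accumulo.core.client.ConditionalWriter;
  import org.apache.accumulo.core.client.ConditionalWriter.Result;
  import org.apache.accumulo.core.client.ConditionalWriter.Status;
  import org.apache.accumulo.core.data.ConditionalMutation;

  // Hypothetical sketch: buffer seed mutations as they are added, then flush
  // and verify them all when the seeder is closed.
  class BufferingSeeder implements AutoCloseable {
    private final ConditionalWriter writer;
    private final List<ConditionalMutation> pending = new ArrayList<>();

    BufferingSeeder(ConditionalWriter writer) {
      this.writer = writer;
    }

    // The real API would take (txName, fateKey, repo, autoCleanUp) and build
    // the mutation itself; here the caller supplies a prebuilt mutation.
    void attemptToSeedTransaction(ConditionalMutation seedMutation) {
      pending.add(seedMutation);
    }

    @Override
    public void close() throws Exception {
      try {
        Iterator<Result> results = writer.write(pending.iterator());
        while (results.hasNext()) {
          Result r = results.next();
          if (r.getStatus() == Status.UNKNOWN) {
            // the real implementation would re-read the fate table to determine
            // whether this mutation actually committed, and retry if not
          }
        }
      } finally {
        writer.close();
      }
    }
  }
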
The current code in the manager could be left as is and changed in a second PR. The way that code is structured, it only knows about a single thing to seed at a time, so it would go through the three-step process (beginSeeding, attemptToSeedTransaction, close) with a single item and keep the same behavior until the second PR is done.

In a second PR the manager code could be modified so that instead of adding single tasks to a thread pool, it adds tasks to a queue and has a single background thread that takes batches of tasks off the queue and uses the new fate store API to write them all at once. There are a few different ways this could be done. It would be nice to preserve the current behavior of not concurrently seeding the same thing, in order to avoid unnecessary work. One possible shape is sketched below.
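
Below is one hypothetical shape for that second PR, preserving both behaviors the current code has (dropping requests rather than blocking the tablet group watcher, and not concurrently seeding the same thing); all names here are stand-ins, not existing manager code:

  import java.util.ArrayList;
  import java.util.List;
  import java.util.Set;
  import java.util.concurrent.ArrayBlockingQueue;
  import java.util.concurrent.BlockingQueue;
  import java.util.concurrent.ConcurrentHashMap;

  // Hypothetical sketch: callers offer requests to a bounded queue (dropping
  // when full, so the tablet group watcher never blocks) and a dedup set
  // avoids concurrently seeding the same thing. One background thread drains
  // the queue in batches and writes each batch through the new Seeder API.
  class SeedBatcher<K> {
    private final BlockingQueue<K> queue = new ArrayBlockingQueue<>(10_000);
    private final Set<K> inFlight = ConcurrentHashMap.newKeySet();

    // Called by the tablet group watcher; never blocks.
    boolean submit(K seedRequest) {
      if (!inFlight.add(seedRequest)) {
        return false; // already queued or being seeded; skip duplicate work
      }
      if (!queue.offer(seedRequest)) {
        inFlight.remove(seedRequest);
        return false; // queue full: drop the request, it will be found again
      }
      return true;
    }

    // Runs in a single background thread.
    void drainLoop() throws InterruptedException {
      List<K> batch = new ArrayList<>();
      while (true) {
        batch.clear();
        batch.add(queue.take());   // wait for the first request
        queue.drainTo(batch, 999); // then grab whatever else is already queued
        try {
          seedBatch(batch);
        } finally {
          batch.forEach(inFlight::remove);
        }
      }
    }

    void seedBatch(List<K> batch) {
      // would call beginSeeding(), then attemptToSeedTransaction(...) for each
      // request, then close() to flush and check all conditional mutations
    }
  }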
