Async operation for writing append record #379

Open
kishorekrd opened this issue Oct 1, 2022 · 1 comment

Comments

@kishorekrd

Currently NuRaft has a blocking call (end_of_append_batch) for flushing/storing appended log records. Is there any way to make this an asynchronous operation?
On a follower, append is executed by a NuRaft thread, but on the leader, where you submit the log record to Raft, the caller has to block on end_of_append_batch() after multiple append calls. Is there any way to hand this over to a NuRaft thread, so that the leader's server thread doesn't need to block on it?

@greensky00
Contributor

greensky00 commented Oct 4, 2022

First of all, please note that appended logs should be durable on disk by the time append_entries finishes its job; otherwise, data loss may happen.

There can be two options:

  1. You can use async handler mode,

    /**
    * `append_entries()` will return immediately,
    * and callback function (i.e., handler) will be
    * invoked after it is committed in leader node.
    */
    async_handler = 0x1,

    and let end_of_append_batch hand the disk flush off to another thread. You should regard append_entries as complete only when a) append_entries returns a successful result AND b) the disk flush is done.

  2. You can use this (parallel_log_appending) experimental feature:

    /**
    * (Experimental)
    * If `true`, users can let the leader append logs parallel with their
    * replication. To implement parallel log appending, users need to make
    * `log_store::append`, `log_store::write_at`, or
    * `log_store::end_of_append_batch` API triggers asynchronous disk writes
    * without blocking the thread. Even while the disk write is in progress,
    * the other read APIs of log store should be able to read the log.
    *
    * The replication and the disk write will be executed in parallel,
    * and users need to call `raft_server::notify_log_append_completion`
    * when the asynchronous disk write is done. Also, users need to properly
    * implement `log_store::last_durable_index` API to return the most recent
    * durable log index. The leader will commit the log based on the
    * result of this API.
    *
    * - If the disk write is done earlier than the replication,
    * the commit behavior is the same as the original protocol.
    *
    * - If the replication is done earlier than the disk write,
    * the leader will commit the log based on the quorum except
    * for the leader itself. The leader can apply the log to
    * the state machine even before completing the disk write
    * of the log.
    *
    * Note that parallel log appending is available for the leader only,
    * and followers will wait for `notify_log_append_completion` call
    * before returning the response.
    */
    bool parallel_log_appending_;

    To do this, your append, write_at, and end_of_append_batch should trigger asynchronous disk writes. You should also call notify_log_append_completion when the asynchronous disk write is done, and implement last_durable_index properly.
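Both options come down to the same mechanics: move the flush off the calling thread, remember the highest index that is actually on disk, and signal when a flush finishes. Below is a minimal sketch of that idea in plain C++, with no NuRaft dependency. `AsyncFlusher`, `request_flush`, `FlushFn`, and `DoneFn` are hypothetical names made up for illustration; they are not part of NuRaft. The sketch also ignores log truncation and retry after a failed flush, which a real log store would need to handle.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper (not a NuRaft class): runs disk flushes on a
// background thread and tracks the highest index known to be durable.
class AsyncFlusher {
public:
    // Assumed to sync all logs up to `upto` and return true on success.
    using FlushFn = std::function<bool(uint64_t upto)>;
    // Invoked on the worker thread after every flush attempt.
    using DoneFn = std::function<void(bool ok)>;

    AsyncFlusher(FlushFn flush, DoneFn done)
        : flush_(std::move(flush)), done_(std::move(done)),
          worker_([this] { run(); }) {}

    ~AsyncFlusher() {
        {
            std::lock_guard<std::mutex> l(mu_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join();  // a pending flush is drained before exit
    }

    // Non-blocking: records the request and wakes the worker.
    void request_flush(uint64_t start, uint64_t cnt) {
        assert(cnt > 0);
        {
            std::lock_guard<std::mutex> l(mu_);
            uint64_t last = start + cnt - 1;
            if (last > requested_) requested_ = last;
            pending_ = true;
        }
        cv_.notify_one();
    }

    // Highest index whose flush has completed successfully.
    uint64_t last_durable_index() const { return durable_.load(); }

private:
    void run() {
        std::unique_lock<std::mutex> l(mu_);
        for (;;) {
            cv_.wait(l, [this] { return stop_ || pending_; });
            if (!pending_) return;  // stop requested, nothing left to do
            uint64_t target = requested_;
            pending_ = false;
            l.unlock();
            bool ok = flush_(target);  // slow I/O runs without the lock
            if (ok) durable_.store(target);
            done_(ok);
            l.lock();
        }
    }

    FlushFn flush_;
    DoneFn done_;
    std::mutex mu_;
    std::condition_variable cv_;
    uint64_t requested_ = 0;
    bool pending_ = false;
    bool stop_ = false;
    std::atomic<uint64_t> durable_{0};
    std::thread worker_;  // declared last so it starts after the other members
};
```

In a custom log store, one possible wiring is for `end_of_append_batch(start, cnt)` to call `request_flush(start, cnt)` and return, and for `last_durable_index()` to forward to the helper. For option 2, the `DoneFn` callback would be the place to call `raft_server::notify_log_append_completion`; for option 1, it would be the place to record that condition b) above has been met.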
