Sentinel Failover Replica Choice #160

Open
Niennienzz opened this issue Feb 26, 2024 · 5 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed

Comments

@Niennienzz
Contributor

I am not sure whether the operator already works this way, but below is how Sentinel decides which replica to promote when performing a failover. If it does not, maybe the operator could use similar logic to pick the most suitable replica instance.

Step-1 Use the replica with the lowest replica-priority

  • As documented here, Sentinel looks at the replica-priority value first when choosing a replica for failover.
  • This value is returned by the Redis INFO command as slave_priority.
  • If all replicas have the same replica-priority value, go to Step-2.

Step-2 Use the replica with the highest slave_repl_offset.

  • The slave_repl_offset value, as returned by the INFO command, reports the replication offset of the replica instance.
  • A higher slave_repl_offset value means the replica is closer to the primary instance in terms of replication, and thus more suitable for promotion.
  • If more than one replica instance has the same slave_repl_offset value, go to Step-3.

Step-3 Use the lowest run_id value.

  • The run_id value is also returned by the INFO command.
  • As a last resort, Sentinel picks the replica instance with the lexicographically lowest run_id value.
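
The three steps above can be sketched as a single sort key. This is only an illustration, assuming the INFO fields for each replica have already been collected into a small record; `Replica` and `pick_replica` are hypothetical names, not part of the operator or of Sentinel, and the sketch does not cover any filtering of unreachable replicas.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Replica:
    # Hypothetical record; field names mirror the INFO fields discussed above.
    name: str
    slave_priority: int
    slave_repl_offset: int
    run_id: str


def pick_replica(replicas: Iterable[Replica]) -> Optional[Replica]:
    """Return the preferred failover candidate, or None if there are no replicas."""
    candidates = list(replicas)
    if not candidates:
        return None
    # The key tuple applies the steps in order:
    #   Step-1: lowest slave_priority
    #   Step-2: highest slave_repl_offset (negated so that min() prefers it)
    #   Step-3: lexicographically lowest run_id
    return min(
        candidates,
        key=lambda r: (r.slave_priority, -r.slave_repl_offset, r.run_id),
    )


# Example: equal priorities, "b" and "c" tie on offset, so run_id decides.
example = [
    Replica("a", 100, 500, "bbb"),
    Replica("b", 100, 900, "ccc"),
    Replica("c", 100, 900, "aaa"),
]
print(pick_replica(example).name)  # prints c
```

Using one tuple key keeps the tie-breaking order explicit, and a later step is only consulted when all earlier ones compare equal.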
@Niennienzz
Contributor Author

There's a network screening process that runs before this whole candidate selection process. I will update the description once I have a grasp on that.

@Pothulapati Pothulapati added good first issue Good for newcomers help wanted Extra attention is needed labels Mar 13, 2024
@Pothulapati
Collaborator

Now that Dragonfly has the offset, it totally makes sense to use it and be intelligent about the choice! This should also be an easy fix, considering all the parts are already there.

@nujragan93
Contributor

I would like to take a stab at this

@ashotland
Contributor

Thanks @nujragan93 🙇 assigned to you.

@nujragan93
Contributor

nujragan93 commented Apr 4, 2024

Which version of DragonflyDB returns slave_priority, slave_repl_offset, and run_id from the INFO command? I don't see them with v1.16.0:

# Replication
role:replica
master_host:10.1xx
master_port:9999
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
master_replid:1604ae320e0dadcbd2bcb200030f195058e608e0
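
One way to check a given version for these fields is to parse the INFO text and list whichever are absent. This is a minimal sketch, assuming the output uses the `key:value` line format shown above; `parse_info`, `missing_fields`, and `REQUIRED_FIELDS` are hypothetical helper names. In Redis, run_id is reported under the Server section rather than Replication, so the check is best run against the full INFO output.

```python
from typing import Dict, List

# Fields that the selection logic described in this issue relies on.
REQUIRED_FIELDS = ("slave_priority", "slave_repl_offset", "run_id")


def parse_info(text: str) -> Dict[str, str]:
    """Parse `key:value` lines, skipping blank lines and `# Section` headers."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def missing_fields(text: str) -> List[str]:
    """Return the required fields that do not appear in the INFO text."""
    parsed = parse_info(text)
    return [name for name in REQUIRED_FIELDS if name not in parsed]
```

Calling `missing_fields` on the Replication section pasted above would report all three fields as missing.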
