Switch to ReplicaOrdering.RANDOM for select LBPs #32
`RANDOM` ordering has the benefit of distributing the load evenly across replicas. When using tablets, round-robin policies with `NEUTRAL` ordering can easily cause load spikes on individual nodes while the cluster grows, and an uneven workload afterwards. The rack-aware LBP is not switched to `RANDOM` right now because it is slightly broken in that configuration. See java-driver/369.
Previous switch to |
if (settings.node.rack != null) {
    RackAwareRoundRobinPolicy.Builder policyBuilder = RackAwareRoundRobinPolicy.builder();
    if (settings.node.datacenter != null)
        policyBuilder.withLocalDc(settings.node.datacenter);
    policyBuilder = policyBuilder.withLocalRack(settings.node.rack);
    ret = policyBuilder.build();
    replicaOrdering = ReplicaOrdering.NEUTRAL;
So this means that a mix of rack-aware policies and tablets would be imbalanced?
And that it would need to be fixed on the driver end?
All round-robin policies (rack, DC) used with TokenAwarePolicy can be imbalanced with tablets when using neutral ordering. I think this combination should not be used if we want the load to be as balanced as possible. Long story short: say we have RF=3 and 6 nodes [A,B,C,D,E,F], and tablets are spread evenly but only on A, B, and C (this can happen when growing the cluster). If round robin happens to point to D, E, F, or A, the request hits replica A first, because A is the first replica reached from any of those starting points. This results in A getting 4/6 of the load, B 1/6, and C 1/6. However, if RF=3 and the cluster has only A, B, and C, the load will be nearly perfectly balanced.
I'll try to write a comment with a broader explanation of what happens in scylladb/scylladb#19107 to better illustrate this issue with neutral ordering.
The rack-aware policy just does not work correctly with random ordering (it ignores rack awareness in favor of the local DC), so it has to stay neutral for now. This is the part that needs fixing on the driver's end.
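To make the 4/6, 1/6, 1/6 split in the six-node example concrete, here is a minimal standalone sketch. It does not use the driver at all: the class `ReplicaOrderingSketch` and its methods are hypothetical names, and the model assumes that neutral ordering serves each request from the first replica reached when walking the round-robin plan, while random ordering picks uniformly among the replicas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical simulation, not driver code: 6 nodes, RF=3, tablets only on A, B, C.
public class ReplicaOrderingSketch {
    static final List<String> NODES = Arrays.asList("A", "B", "C", "D", "E", "F");
    static final List<String> REPLICAS = Arrays.asList("A", "B", "C");

    // Neutral ordering (as modeled here): walk the round-robin plan that
    // starts at index `start` and take the first node that is a replica.
    static String firstReplicaNeutral(int start) {
        for (int i = 0; i < NODES.size(); i++) {
            String node = NODES.get((start + i) % NODES.size());
            if (REPLICAS.contains(node)) {
                return node;
            }
        }
        throw new IllegalStateException("no replica in plan");
    }

    // Random ordering (as modeled here): shuffle the replicas, take the first.
    static String firstReplicaRandom(Random rng) {
        List<String> shuffled = new ArrayList<>(REPLICAS);
        Collections.shuffle(shuffled, rng);
        return shuffled.get(0);
    }

    // Count which replica serves each request when the round-robin
    // pointer advances by one node per request.
    static Map<String, Integer> simulateNeutral(int requests) {
        Map<String, Integer> hits = new TreeMap<>();
        for (int r = 0; r < requests; r++) {
            hits.merge(firstReplicaNeutral(r % NODES.size()), 1, Integer::sum);
        }
        return hits;
    }

    static Map<String, Integer> simulateRandom(int requests, long seed) {
        Random rng = new Random(seed);
        Map<String, Integer> hits = new TreeMap<>();
        for (int r = 0; r < requests; r++) {
            hits.merge(firstReplicaRandom(rng), 1, Integer::sum);
        }
        return hits;
    }

    public static void main(String[] args) {
        // prints "NEUTRAL: {A=4000, B=1000, C=1000}"
        System.out.println("NEUTRAL: " + simulateNeutral(6000));
        // Counts here vary with the seed but stay close to 2000 each.
        System.out.println("RANDOM:  " + simulateRandom(6000, 42L));
    }
}
```

Under these assumptions, starting points D, E, F, and A all resolve to A, which is where the 4/6 share comes from; with random ordering the starting point no longer decides which replica goes first.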
LGTM
Main reason for version change: Using cassandra-stress 3.17 to mitigate
- Switch to ReplicaOrdering.RANDOM for select LBPs [32](scylladb/cassandra-stress#32)
Other noticeable changes since the last version used in SCT:
- Add support for hostname verification [31](scylladb/cassandra-stress#31)
- Print thread dump on specific signals [27](scylladb/cassandra-stress#27)
- Replace uninterruptible wait [26](scylladb/cassandra-stress#26)
- Make it use DCAwareRoundRobinPolicy unless rack is provided [21](scylladb/cassandra-stress#21)
- feature(docker): adding support for dependabot [19](scylladb/cassandra-stress#19)
Signed-off-by: Dusan Malusev <[email protected]>
Main reason for version change: Using cassandra-stress 3.17 to mitigate
- Switch to ReplicaOrdering.RANDOM for select LBPs [32](scylladb/cassandra-stress#32)
Other noticeable changes since the last version used in SCT:
- Add support for hostname verification [31](scylladb/cassandra-stress#31)
- Print thread dump on specific signals [27](scylladb/cassandra-stress#27)
- Replace uninterruptible wait [26](scylladb/cassandra-stress#26)
- Make it use DCAwareRoundRobinPolicy unless rack is provided [21](scylladb/cassandra-stress#21)
- feature(docker): adding support for dependabot [19](scylladb/cassandra-stress#19)
Signed-off-by: Dusan Malusev <[email protected]>
(cherry picked from commit 02997a6)
# Conflicts:
# defaults/docker_images/cassandra-stress/values_cassandra-stress.yaml
Main reason for version change: Using cassandra-stress 3.17 to mitigate
- Switch to ReplicaOrdering.RANDOM for select LBPs [32](scylladb/cassandra-stress#32)
Other noticeable changes since the last version used in SCT:
- Add support for hostname verification [31](scylladb/cassandra-stress#31)
- Print thread dump on specific signals [27](scylladb/cassandra-stress#27)
- Replace uninterruptible wait [26](scylladb/cassandra-stress#26)
- Make it use DCAwareRoundRobinPolicy unless rack is provided [21](scylladb/cassandra-stress#21)
- feature(docker): adding support for dependabot [19](scylladb/cassandra-stress#19)
Signed-off-by: Dusan Malusev <[email protected]>
(cherry picked from commit 02997a6)