## Load balancing

A Cassandra cluster is typically composed of multiple hosts; the [LoadBalancingPolicy] \(sometimes abbreviated LBP) is a
central component that determines:

* which hosts the driver will communicate with;
* for each new query, which coordinator to pick, and which hosts to use as failover.

The policy is configured when initializing the cluster:

```java
Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withLoadBalancingPolicy(new RoundRobinPolicy())
    .build();
```

Once the cluster has been built, you can't change the policy, but you may inspect it at runtime:

```java
LoadBalancingPolicy lbp = cluster.getConfiguration().getPolicies().getLoadBalancingPolicy();
```

If you don't explicitly configure the policy, you get the default, which is a
[datacenter-aware](#dc-aware-round-robin-policy), [token-aware](#token-aware-policy) policy:

```java
new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build());
```

### Common concepts

Before we review the available implementations, we need to introduce two concepts:

#### Host distance

For each host, the policy computes a **[distance][HostDistance]** that determines how the driver will establish
connections to it:

* `LOCAL` and `REMOTE` are "active" distances, meaning that the driver will keep open connections to the host. They
  differ in the number of connections opened, depending on your [pooling options](../pooling/). Also, the
  [control connection](../control_connection/) will favor local nodes if possible.
* `IGNORED`, as the name suggests, means that the driver will not attempt to connect.

Typically, the distance will reflect network topology (e.g. local vs. remote datacenter), although that is entirely up
to your policy. The distance can be dynamic: the driver re-checks it whenever connection pools are created (e.g. at
startup, or when a node that was down comes back up); you can also trigger a re-check with [refreshConnectedHosts]:

```java
// Re-evaluate all host distances:
cluster.getConfiguration().getPoolingOptions().refreshConnectedHosts();

// Re-evaluate the distance for a given host:
cluster.getConfiguration().getPoolingOptions().refreshConnectedHost(host);
```

#### Query plan

Each time the driver executes a query, it asks the policy to compute a **query plan**, which is a list of hosts. The
driver will then try each host in sequence, according to the [retry policy](../retries/) and
[speculative execution policy](../speculative_execution/).

The contents and order of query plans are entirely up to your policy, but implementations typically return plans that:

* are different for each query, in order to balance the load across the cluster;
* only contain hosts that are known to be able to process queries, i.e. neither ignored nor down;
* favor local hosts over remote ones.

The next sections describe the implementations that are provided with the driver.


### [RoundRobinPolicy]

```java
Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withLoadBalancingPolicy(new RoundRobinPolicy())
    .build();
```

This is the most straightforward implementation. It returns query plans that include all hosts, shifted for each new
query in a round-robin fashion. For example:

* query 1: host1, host2, host3
* query 2: host2, host3, host1
* query 3: host3, host1, host2
* query 4: host1, host2, host3
* etc.

All hosts are at distance `LOCAL`.

This works well for simple deployments. If you have multiple datacenters, it will be inefficient and you probably want
to switch to a DC-aware policy.
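The rotation can be sketched in plain Java (an illustrative model with hypothetical host names, not the driver's actual implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinSketch {
    private final List<String> hosts = Arrays.asList("host1", "host2", "host3");
    private final AtomicInteger index = new AtomicInteger();

    // Build a query plan: all hosts, starting at an offset that shifts per query.
    public List<String> newQueryPlan() {
        int start = index.getAndIncrement();
        List<String> plan = new ArrayList<String>();
        for (int i = 0; i < hosts.size(); i++) {
            plan.add(hosts.get((start + i) % hosts.size()));
        }
        return plan;
    }

    public static void main(String[] args) {
        RoundRobinSketch policy = new RoundRobinSketch();
        System.out.println(policy.newQueryPlan()); // [host1, host2, host3]
        System.out.println(policy.newQueryPlan()); // [host2, host3, host1]
        System.out.println(policy.newQueryPlan()); // [host3, host1, host2]
    }
}
```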


### [DCAwareRoundRobinPolicy]

```java
Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withLoadBalancingPolicy(
        DCAwareRoundRobinPolicy.builder()
            .withLocalDc("myLocalDC")
            .withUsedHostsPerRemoteDc(2)
            .allowRemoteDCsForLocalConsistencyLevel()
            .build()
    ).build();
```

This policy queries nodes of the local datacenter in a round-robin fashion; optionally, it can also try a configurable
number of hosts in remote datacenters if all local hosts have failed.

Call `withLocalDc` to specify the name of your local datacenter. You can also leave it out, and the driver will use the
datacenter of the first contact point that was reached [at initialization](../#cluster-initialization). However,
remember that the driver shuffles the initial list of contact points, so this only works reliably if all contact points
are in the local datacenter. In general, providing the datacenter name explicitly is the safer option.

Hosts belonging to the local datacenter are at distance `LOCAL`, and appear first in query plans (in a round-robin
fashion).

If you call `withUsedHostsPerRemoteDc`, the policy will pick that number of hosts for each remote DC, and add them at
the end of query plans. To illustrate this, let's assume that the value is 2, and that there are 3 datacenters and
3 hosts in the local datacenter. Query plans would look like this:

* query 1: localHost1, localHost2, localHost3, host1InRemoteDc1, host2InRemoteDc1, host1InRemoteDc2, host2InRemoteDc2
* query 2: localHost2, localHost3, localHost1, host1InRemoteDc1, host2InRemoteDc1, host1InRemoteDc2, host2InRemoteDc2
* query 3: localHost3, localHost1, localHost2, host1InRemoteDc1, host2InRemoteDc1, host1InRemoteDc2, host2InRemoteDc2

Hosts selected by this option are at distance `REMOTE`. Note that they always appear in the same order.
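The plan-building logic above can be modeled in plain Java (an illustrative sketch with the hypothetical host names from the example, not the driver's actual code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class DcAwareSketch {
    private final List<String> localHosts =
        Arrays.asList("localHost1", "localHost2", "localHost3");
    // With withUsedHostsPerRemoteDc(2) and two remote DCs:
    private final List<String> remoteHosts =
        Arrays.asList("host1InRemoteDc1", "host2InRemoteDc1",
                      "host1InRemoteDc2", "host2InRemoteDc2");
    private final AtomicInteger index = new AtomicInteger();

    public List<String> newQueryPlan() {
        int start = index.getAndIncrement();
        List<String> plan = new ArrayList<String>();
        // Local hosts first, rotated for each query:
        for (int i = 0; i < localHosts.size(); i++) {
            plan.add(localHosts.get((start + i) % localHosts.size()));
        }
        // Remote hosts appended last, always in the same order:
        plan.addAll(remoteHosts);
        return plan;
    }
}
```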

Finally, `allowRemoteDCsForLocalConsistencyLevel` controls whether the remote hosts included by the previous option are
used when the consistency level of the query is `LOCAL_ONE` or `LOCAL_QUORUM`. By default, it is off (remote hosts are
not used for local CLs).


### [TokenAwarePolicy]

```java
Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withLoadBalancingPolicy(new TokenAwarePolicy(anotherPolicy))
    .build();
```

This policy adds **token awareness** on top of another policy: requests are routed preferentially to the local replicas
that own the data being queried.

#### Requirements

In order for token awareness to work, you should first ensure that metadata is enabled in the driver. That is the case
by default, unless it has been explicitly disabled with [QueryOptions#setMetadataEnabled][setMetadataEnabled].

Then you need to consider whether routing information (provided by [Statement#getKeyspace] and
[Statement#getRoutingKey]) can be computed automatically for your statements; if not, you may provide it yourself. If a
statement has no routing information, the query will still be executed, but token awareness will not work, and the
driver might not pick the best coordinator.

The examples assume the following CQL schema:

```
CREATE TABLE testKs.sensor_data(id int, year int, ts timestamp, data double,
    PRIMARY KEY ((id, year), ts));
```

For [simple statements](../statements/simple/), routing information can never be computed automatically:

```java
SimpleStatement statement = new SimpleStatement(
    "SELECT * FROM testKs.sensor_data WHERE id = 1 and year = 2016");

// No routing info available:
assert statement.getKeyspace() == null;
assert statement.getRoutingKey() == null;

// Set the keyspace manually:
statement.setKeyspace("testKs");

// Set the routing key manually: serialize each partition key component to its target CQL type
ProtocolVersion protocolVersion = cluster.getConfiguration().getProtocolOptions().getProtocolVersionEnum();
statement.setRoutingKey(
    DataType.cint().serialize(1, protocolVersion),
    DataType.cint().serialize(2016, protocolVersion));

session.execute(statement);
```

For [built statements](../statements/built/), the keyspace is available if it was provided while building the query; the
routing key is available only if the statement was built using the table metadata, and all components of the partition
key appear in the query:

```java
TableMetadata tableMetadata = cluster.getMetadata()
    .getKeyspace("testKs")
    .getTable("sensor_data");

// Built from metadata: all info available
BuiltStatement statement1 = select().from(tableMetadata)
    .where(eq("id", 1))
    .and(eq("year", 2016));

assert statement1.getKeyspace() != null;
assert statement1.getRoutingKey() != null;

// Built from keyspace and table name: only keyspace available
BuiltStatement statement2 = select().from("testKs", "sensor_data")
    .where(eq("id", 1))
    .and(eq("year", 2016));

assert statement2.getKeyspace() != null;
assert statement2.getRoutingKey() == null;
```

For [bound statements](../statements/prepared/), the keyspace is always available; the routing key is only available if
all components of the partition key are bound as variables:

```java
// All components bound: all info available
PreparedStatement pst1 = session.prepare("SELECT * FROM testKs.sensor_data WHERE id = :id and year = :year");
BoundStatement statement1 = pst1.bind(1, 2016);

assert statement1.getKeyspace() != null;
assert statement1.getRoutingKey() != null;

// 'id' hard-coded, only 'year' is bound: only keyspace available
PreparedStatement pst2 = session.prepare("SELECT * FROM testKs.sensor_data WHERE id = 1 and year = :year");
BoundStatement statement2 = pst2.bind(2016);

assert statement2.getKeyspace() != null;
assert statement2.getRoutingKey() == null;
```

For [batch statements](../statements/batch/), the routing information of each child statement is inspected; the first
non-null keyspace is used as the keyspace of the batch, and the first non-null routing key as its routing key (the idea
is that all children should have the same routing information, since batches are supposed to operate on a single
partition). All children might have null information, in which case you need to provide the information manually as
shown previously.
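The "first non-null wins" selection rule can be sketched as follows (an illustrative model, not the driver's actual code):

```java
import java.util.Arrays;
import java.util.List;

public class BatchRoutingSketch {
    // Mirrors how a batch picks its keyspace (and, analogously, its routing
    // key): the first child with a non-null value wins.
    static String firstNonNull(List<String> childValues) {
        for (String value : childValues) {
            if (value != null) {
                return value;
            }
        }
        return null; // no child had routing info: provide it manually
    }
}
```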

#### Behavior

For any host, the distance returned by `TokenAwarePolicy` is always the same as that of its child policy.

When the policy computes a query plan, it first inspects the statement's routing information. If there is none, the
policy simply acts as a pass-through, and returns the query plan computed by its child policy.

If the statement has routing information, the policy uses it to determine the replicas that hold the corresponding data.
Then it returns a query plan containing:

1. the replicas for which the child policy returns distance `LOCAL`, shuffled in a random order;
2. followed by the query plan of the child policy, skipping any hosts that were already returned by the previous step.

Finally, the `shuffleReplicas` constructor parameter allows you to control whether the policy shuffles the replicas in
step 1:

```java
new TokenAwarePolicy(anotherPolicy, false); // no shuffling
```

Shuffling distributes writes better, and can alleviate hotspots caused by "fat" partitions. On the other hand, setting
it to `false` might increase the effectiveness of caching, since data will always be retrieved from the "primary"
replica. Shuffling is enabled by default.
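The two-step plan construction can be modeled in plain Java (an illustrative sketch with hypothetical host names, not the driver's actual code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TokenAwareSketch {
    // Local replicas first (shuffled or not), then the child policy's plan,
    // skipping hosts already returned by the first step.
    static List<String> newQueryPlan(List<String> localReplicas,
                                     List<String> childPlan,
                                     boolean shuffleReplicas) {
        List<String> replicas = new ArrayList<String>(localReplicas);
        if (shuffleReplicas) {
            Collections.shuffle(replicas);
        }
        // LinkedHashSet preserves insertion order and drops duplicates:
        Set<String> plan = new LinkedHashSet<String>(replicas);
        plan.addAll(childPlan);
        return new ArrayList<String>(plan);
    }
}
```

For example, with replicas `host2` and `host3` and a child plan of `host1, host2, host3`, the resulting plan (without shuffling) is `host2, host3, host1`.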

### [LatencyAwarePolicy]

```java
Cluster cluster = Cluster.builder()
    .addContactPoint("127.0.0.1")
    .withLoadBalancingPolicy(
        LatencyAwarePolicy.builder(anotherPolicy)
            .withExclusionThreshold(2.0)
            .withScale(100, TimeUnit.MILLISECONDS)
            .withRetryPeriod(10, TimeUnit.SECONDS)
            .withUpdateRate(100, TimeUnit.MILLISECONDS)
            .withMininumMeasurements(50)
            .build()
    ).build();
```

This policy adds **latency awareness** on top of another policy: it collects the latencies of queries to each host, and
excludes the worst-performing hosts from query plans.

The builder allows you to customize various aspects of the policy:

* the [exclusion threshold][withExclusionThreshold] controls how much worse a host must perform (compared to the fastest
  host) in order to be excluded. For example, 2 means that hosts that are twice as slow will be excluded;
* since a host's performance can vary over time, its score is computed with a time-weighted average; the
  [scale][withScale] controls how fast the weight given to older latencies decreases over time;
* the [retry period][withRetryPeriod] is the duration for which a slow host will be penalized;
* the [update rate][withUpdateRate] defines how often the minimum average latency (i.e. the fastest host) is recomputed;
* the [minimum measurements][withMininumMeasurements] threshold guarantees that we have enough measurements before we
  start excluding a host. This prevents skewing the measurements during a node restart, where JVM warm-up will influence
  latencies.
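The exclusion threshold can be summed up in one comparison (an illustrative sketch of the rule, not the driver's actual code):

```java
public class LatencyExclusionSketch {
    // A host is penalized when its time-weighted average latency exceeds
    // exclusionThreshold times the best (minimum) average among all hosts.
    static boolean isExcluded(double hostAvgNanos, double minAvgNanos,
                              double exclusionThreshold) {
        return hostAvgNanos > exclusionThreshold * minAvgNanos;
    }
}
```

With a threshold of 2.0, a host averaging 2.5 ms is excluded when the fastest host averages 1 ms, while a host averaging 1.5 ms is not.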

For any host, the distance returned by the policy is always the same as that of its child policy.

Query plans are based on the child policy's, except that hosts that are currently excluded for being too slow are moved
to the end of the plan.

[withExclusionThreshold]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withExclusionThreshold-double-
[withMininumMeasurements]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withMininumMeasurements-int-
[withRetryPeriod]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withRetryPeriod-long-java.util.concurrent.TimeUnit-
[withScale]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withScale-long-java.util.concurrent.TimeUnit-
[withUpdateRate]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withUpdateRate-long-java.util.concurrent.TimeUnit-

### Filtering policies

[WhiteListPolicy] wraps another policy with a white list, to ensure that the driver will only ever connect to a
pre-defined subset of the cluster. The distance will be that of the child policy for hosts that are in the white list,
and `IGNORED` otherwise. Query plans are guaranteed to only contain white-listed hosts.
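For example, the white list is a collection of socket addresses (the addresses below are placeholders):

```java
List<InetSocketAddress> whiteList = Arrays.asList(
    new InetSocketAddress("192.168.1.1", 9042),
    new InetSocketAddress("192.168.1.2", 9042));

Cluster cluster = Cluster.builder()
    .addContactPoint("192.168.1.1")
    .withLoadBalancingPolicy(new WhiteListPolicy(new RoundRobinPolicy(), whiteList))
    .build();
```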

[HostFilterPolicy] is a generalization of that concept: you provide the predicate that determines whether a host is
included or not.
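As a sketch (assuming the policy is constructed from a child policy and a Guava `Predicate<Host>`), a policy that only keeps hosts from a given datacenter could look like:

```java
LoadBalancingPolicy policy = new HostFilterPolicy(
    new RoundRobinPolicy(),
    new Predicate<Host>() {
        @Override
        public boolean apply(Host host) {
            // Keep only hosts from this datacenter; everything else is IGNORED:
            return "myLocalDC".equals(host.getDatacenter());
        }
    });
```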

### Implementing your own

If none of the provided policies fit your use case, you can write your own. This is an advanced topic, so we recommend
studying the existing implementations first: `RoundRobinPolicy` is a good place to start, then you can look at more
complex ones like `DCAwareRoundRobinPolicy`.


[LoadBalancingPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LoadBalancingPolicy.html
[RoundRobinPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/RoundRobinPolicy.html
[DCAwareRoundRobinPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.html
[TokenAwarePolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/TokenAwarePolicy.html
[LatencyAwarePolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.html
[HostFilterPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/HostFilterPolicy.html
[WhiteListPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/WhiteListPolicy.html
[HostDistance]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/HostDistance.html
[refreshConnectedHosts]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/PoolingOptions.html#refreshConnectedHosts--
[setMetadataEnabled]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/QueryOptions.html#setMetadataEnabled-boolean-
[Statement#getKeyspace]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html#getKeyspace--
[Statement#getRoutingKey]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html#getRoutingKey--