Skip to content

Commit ffb5a85

Browse files
committed
Document load balancing.
1 parent fefa665 commit ffb5a85

File tree

5 files changed

+348
-11
lines changed

5 files changed

+348
-11
lines changed

manual/load_balancing/README.md

+328-2
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,331 @@
11
## Load balancing
22

3-
*Coming soon... In the meantime, see the javadoc for [LoadBalancingPolicy].*
3+
A Cassandra cluster is typically composed of multiple hosts; the [LoadBalancingPolicy] \(sometimes abbreviated LBP) is a
4+
central component that determines:
45

5-
[LoadBalancingPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LoadBalancingPolicy.html
6+
* which hosts the driver will communicate with;
7+
* for each new query, which coordinator to pick, and which hosts to use as failover.
8+
9+
The policy is configured when initializing the cluster:
10+
11+
```java
12+
Cluster cluster = Cluster.builder()
13+
.addContactPoint("127.0.0.1")
14+
.withLoadBalancingPolicy(new RoundRobinPolicy())
15+
.build();
16+
```
17+
18+
Once the cluster has been built, you can't change the policy, but you may inspect it at runtime:
19+
20+
```java
21+
LoadBalancingPolicy lbp = cluster.getConfiguration().getPolicies().getLoadBalancingPolicy();
22+
```
23+
24+
If you don't explicitly configure the policy, you get the default, which is a
25+
[datacenter-aware](#dc-aware-round-robin-policy), [token-aware](#token-aware-policy) policy:
26+
27+
```java
28+
new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build());
29+
```
30+
31+
### Common concepts
32+
33+
Before we review the available implementations, we need to introduce two concepts:
34+
35+
#### Host distance
36+
37+
For each host, the policy computes a **[distance][HostDistance]** that determines how the driver will establish connections
38+
to it:
39+
40+
* `LOCAL` and `REMOTE` are "active" distances, meaning that the driver will keep open connections to the host. They
41+
differ in the number of connections opened, depending on your [pooling options](../pooling/). Also, the
42+
[control connection](../control_connection/) will favor local nodes if possible.
43+
* `IGNORED`, as the name suggests, means that the driver will not attempt to connect.
44+
45+
Typically, the distance will reflect network topology (e.g. local vs. remote datacenter), although that is entirely up
46+
to your policy. The distance can be dynamic: the driver re-checks it whenever connection pools are created (e.g. at
47+
startup or if a node was down and just came back up); you can also trigger it with [refreshConnectedHosts]:
48+
49+
```java
50+
// Re-evaluate all host distances:
51+
cluster.getConfiguration().getPoolingOptions().refreshConnectedHosts();
52+
53+
// Re-evaluate the distance for a given host:
54+
cluster.getConfiguration().getPoolingOptions().refreshConnectedHost(host);
55+
```
56+
57+
#### Query plan
58+
59+
Each time the driver executes a query, it asks the policy to compute a **query plan**, which is a list of hosts. The
60+
driver will then try each host in sequence, according to the [retry policy](../retries/) and
61+
[speculative execution policy](../speculative_execution).
62+
63+
The contents and order of query plans are entirely up to your policy, but implementations typically return plans that:
64+
65+
* are different for each query, in order to balance the load across the cluster;
66+
* only contain hosts that are known to be able to process queries, i.e. neither ignored nor down;
67+
* favor local hosts over remote ones.
68+
69+
The next sections describe the implementations that are provided with the driver.
70+
71+
72+
### [RoundRobinPolicy]
73+
74+
```java
75+
Cluster cluster = Cluster.builder()
76+
.addContactPoint("127.0.0.1")
77+
.withLoadBalancingPolicy(new RoundRobinPolicy())
78+
.build();
79+
```
80+
81+
This is the most straightforward implementation. It returns query plans that include all hosts, and shift for each
82+
query in a round-robin fashion. For example:
83+
84+
* query 1: host1, host2, host3
85+
* query 2: host2, host3, host1
86+
* query 3: host3, host1, host2
87+
* query 4: host1, host2, host3
88+
* etc.
89+
90+
All hosts are at distance `LOCAL`.
91+
92+
This works well for simple deployments. If you have multiple datacenters, it will be inefficient and you probably want
93+
to switch to a DC-aware policy.
94+
95+
96+
### [DCAwareRoundRobinPolicy]
97+
98+
```java
99+
Cluster cluster = Cluster.builder()
100+
.addContactPoint("127.0.0.1")
101+
.withLoadBalancingPolicy(
102+
DCAwareRoundRobinPolicy.builder()
103+
.withLocalDc("myLocalDC")
104+
.withUsedHostsPerRemoteDc(2)
105+
.allowRemoteDCsForLocalConsistencyLevel()
106+
.build()
107+
).build();
108+
```
109+
110+
This policy queries nodes of the local data-center in a round-robin fashion; optionally, it can also try a configurable
111+
number of hosts in remote data centers if all local hosts failed.
112+
113+
Call `withLocalDc` to specify the name of your local datacenter. You can also leave it out, and the driver will use the
114+
datacenter of the first contact point that was reached [at initialization](../#cluster-initialization). However,
115+
remember that the driver shuffles the initial list of contact points, so this assumes that all contact points are in the
116+
local datacenter. In general, providing the datacenter name explicitly is a safer option.
117+
118+
Hosts belonging to the local datacenter are at distance `LOCAL`, and appear first in query plans (in a round-robin
119+
fashion).
120+
121+
If you call `withUsedHostsPerRemoteDc`, the policy will pick that number of hosts for each remote DC, and add them at
122+
the end of query plans. To illustrate this, let's assume that the value is 2, there are 3 datacenters and 3 hosts in the
123+
local datacenter. Query plans would look like this:
124+
125+
* query 1: localHost1, localHost2, localHost3, host1InRemoteDc1, host2InRemoteDc1, host1InRemoteDc2, host2InRemoteDc2
126+
* query 2: localHost2, localHost3, localHost1, host1InRemoteDc1, host2InRemoteDc1, host1InRemoteDc2, host2InRemoteDc2
127+
* query 3: localHost3, localHost1, localHost2, host1InRemoteDc1, host2InRemoteDc1, host1InRemoteDc2, host2InRemoteDc2
128+
129+
Hosts selected by this option are at distance `REMOTE`. Note that they always appear in the same order.
130+
131+
Finally, `allowRemoteDCsForLocalConsistencyLevel` controls whether remote hosts included by the previous option are
132+
included when the consistency level of the query is `LOCAL_ONE` or `LOCAL_QUORUM`. By default, it is off (remote hosts
133+
are not included for local CLs).
134+
135+
136+
### [TokenAwarePolicy]
137+
138+
```java
139+
Cluster cluster = Cluster.builder()
140+
.addContactPoint("127.0.0.1")
141+
.withLoadBalancingPolicy(new TokenAwarePolicy(anotherPolicy))
142+
.build();
143+
```
144+
145+
This policy adds **token awareness** on top of another policy: requests will be routed in priority to the local replicas
146+
that own the data that is being queried.
147+
148+
#### Requirements
149+
150+
In order for token awareness to work, you should first ensure that metadata is enabled in the driver. That is the case
151+
by default, unless it's been explicitly disabled by [QueryOptions#setMetadataEnabled][setMetadataEnabled].
152+
153+
Then you need to consider whether routing information (provided by [Statement#getKeyspace] and
154+
[Statement#getRoutingKey]) can be computed automatically for your statements; if not, you may provide it yourself (if a
155+
statement has no routing information, the query will still be executed, but token awareness will not work, so the driver
156+
might not pick the best coordinator).
157+
158+
The examples assume the following CQL schema:
159+
160+
```
161+
CREATE TABLE testKs.sensor_data(id int, year int, ts timestamp, data double,
162+
PRIMARY KEY ((id, year), ts));
163+
```
164+
165+
For [simple statements](../statements/simple/), routing information can never be computed automatically:
166+
167+
```java
168+
SimpleStatement statement = new SimpleStatement(
169+
"SELECT * FROM testKs.sensor_data WHERE id = 1 and year = 2016");
170+
171+
// No routing info available:
172+
assert statement.getKeyspace() == null;
173+
assert statement.getRoutingKey() == null;
174+
175+
// Set the keyspace manually:
176+
statement.setKeyspace("testKs");
177+
178+
// Set the routing key manually: serialize each partition key component to its target CQL type
179+
ProtocolVersion protocolVersion = cluster.getConfiguration().getProtocolOptions().getProtocolVersionEnum();
180+
statement.setRoutingKey(
181+
DataType.cint().serialize(1, protocolVersion),
182+
DataType.cint().serialize(2016, protocolVersion));
183+
184+
session.execute(statement);
185+
```
186+
187+
For [built statements](../statements/built/), the keyspace is available if it was provided while building the query; the
188+
routing key is available only if the statement was built using the table metadata, and all components of the partition
189+
key appear in the query:
190+
191+
```java
192+
TableMetadata tableMetadata = cluster.getMetadata()
193+
.getKeyspace("testKs")
194+
.getTable("sensor_data");
195+
196+
// Built from metadata: all info available
197+
BuiltStatement statement1 = select().from(tableMetadata)
198+
.where(eq("id", 1))
199+
.and(eq("year", 2016));
200+
201+
assert statement1.getKeyspace() != null;
202+
assert statement1.getRoutingKey() != null;
203+
204+
// Built from keyspace and table name: only keyspace available
205+
BuiltStatement statement2 = select().from("testKs", "sensor")
206+
.where(eq("id", 1))
207+
.and(eq("year", 2016));
208+
209+
assert statement2.getKeyspace() != null;
210+
assert statement2.getRoutingKey() == null;
211+
```
212+
213+
For [bound statements](../statements/prepared/), the keyspace is always available; the routing key is only available if
214+
all components of the partition key are bound as variables:
215+
216+
```java
217+
// All components bound: all info available
218+
PreparedStatement pst1 = session.prepare("SELECT * FROM testKs.sensor_data WHERE id = :id and year = :year");
219+
BoundStatement statement1 = pst1.bind(1, 2016);
220+
221+
assert statement1.getKeyspace() != null;
222+
assert statement1.getRoutingKey() != null;
223+
224+
// 'id' hard-coded, only 'year' is bound: only keyspace available
225+
PreparedStatement pst2 = session.prepare("SELECT * FROM testKs.sensor_data WHERE id = 1 and year = :year");
226+
BoundStatement statement2 = pst2.bind(2016);
227+
228+
assert statement2.getKeyspace() != null;
229+
assert statement2.getRoutingKey() == null;
230+
```
231+
232+
For [batch statements](../statements/batch/), the routing information of each child statement is inspected; the first
233+
non-null keyspace is used as the keyspace of the batch, and the first non-null routing key as its routing key (the idea
234+
is that all childs should have the same routing information, since batches are supposed to operate on a single
235+
partition). All children might have null information, in which case you need to provide the information manually as
236+
shown previously.
237+
238+
#### Behavior
239+
240+
For any host, the distance returned by `TokenAwarePolicy` is always the same as its child policy.
241+
242+
When the policy computes a query plan, it will first inspect the statement's routing information. If there is none, the
243+
policy simply acts as a pass-through, and returns the query plan computed by its child policy.
244+
245+
If the statement has routing information, the policy uses it to determine the replicas that hold the corresponding data.
246+
Then it returns a query plan containing:
247+
248+
1. the replicas for which the child policy returns distance `LOCAL`, shuffled in a random order;
249+
2. followed by the query plan of the child policy, skipping any host that were already returned by the previous step.
250+
251+
Finally, the `shuffleReplicas` constructor parameter allows you to control whether the policy shuffles the replicas in
252+
step 1:
253+
254+
```java
255+
new TokenAwarePolicy(anotherPolicy, false); // no shuffling
256+
```
257+
258+
Shuffling will distribute writes better, and can alleviate hotspots caused by "fat" partitions. On the other hand,
259+
setting it to `false` might increase the effectiveness of caching, since data will always be retrieved from the
260+
"primary" replica. Shuffling is enabled by default.
261+
262+
### [LatencyAwarePolicy]
263+
264+
```java
265+
Cluster cluster = Cluster.builder()
266+
.addContactPoint("127.0.0.1")
267+
.withLoadBalancingPolicy(
268+
LatencyAwarePolicy.builder(anotherPolicy)
269+
.withExclusionThreshold(2.0)
270+
.withScale(100, TimeUnit.MILLISECONDS)
271+
.withRetryPeriod(10, TimeUnit.SECONDS)
272+
.withUpdateRate(100, TimeUnit.MILLISECONDS)
273+
.withMininumMeasurements(50)
274+
.build()
275+
).build();
276+
```
277+
278+
This policy adds **latency awareness** on top of another policy: it collects the latencies of queries to each host, and
279+
will exclude the worst-performing hosts from query plans.
280+
281+
The builder allow you to customize various aspects of the policy:
282+
283+
* the [exclusion threshold][withExclusionThreshold] controls how much worse a host must perform (compared to the fastest
284+
host) in order to be excluded. For example, 2 means that hosts that are twice slower will be excluded;
285+
* since a host's performance can vary over time, its score is computed with a time-weighted average; the
286+
[scale][withScale] controls how fast the weight given to older latencies decreases over time;
287+
* the [retry period][withRetryPeriod] is the duration for which a slow host will be penalized;
288+
* the [update rate][withUpdateRate] defines how often the minimum average latency (i.e. the fastest host) is recomputed;
289+
* the [minimum measurements][withMininumMeasurements] threshold guarantees that we have enough measurements before we
290+
start excluding a host. This prevents skewing the measurements during a node restart, where JVM warm-up will influence
291+
latencies.
292+
293+
For any host, the distance returned by the policy is always the same as its child policy.
294+
295+
Query plans are based on the child policy's, except that hosts that are currently excluded for being too slow are moved
296+
to the end of the plan.
297+
298+
[withExclusionThreshold]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withExclusionThreshold-double-
299+
[withMininumMeasurements]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withMininumMeasurements-int-
300+
[withRetryPeriod]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withRetryPeriod-long-java.util.concurrent.TimeUnit-
301+
[withScale]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withScale-long-java.util.concurrent.TimeUnit-
302+
[withUpdateRate]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.Builder.html#withUpdateRate-long-java.util.concurrent.TimeUnit-
303+
304+
### Filtering policies
305+
306+
[WhiteListPolicy] wraps another policy with a white list, to ensure that the driver will only ever connect to a
307+
pre-defined subset of the cluster. The distance will be that of the child policy for hosts that are in the white list,
308+
and `IGNORED` otherwise. Query plans are guaranteed to only contain white-listed hosts.
309+
310+
[HostFilterPolicy] is a generalization of that concept, where you provide the predicate that will determine if a host is
311+
included or not.
312+
313+
### Implementing your own
314+
315+
If none of the provided policies fit your use case, you can write your own. This is an advanced topic, so we recommend
316+
studying the existing implementations first: `RoundRobinPolicy` is a good place to start, then you can look at more
317+
complex ones like `DCAwareRoundRobinPolicy`.
318+
319+
320+
[LoadBalancingPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LoadBalancingPolicy.html
321+
[RoundRobinPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/RoundRobinPolicy.html
322+
[DCAwareRoundRobinPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.html
323+
[TokenAwarePolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/TokenAwarePolicy.html
324+
[LatencyAwarePolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LatencyAwarePolicy.html
325+
[HostFilterPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/HostFilterPolicy.html
326+
[WhiteListPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/WhiteListPolicy.html
327+
[HostDistance]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/HostDistance.html
328+
[refreshConnectedHosts]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/PoolingOptions.html#refreshConnectedHosts--
329+
[setMetadataEnabled]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/QueryOptions.html#setMetadataEnabled-boolean-
330+
[Statement#getKeyspace]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html#getKeyspace--
331+
[Statement#getRoutingKey]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html#getRoutingKey--

manual/statements/README.md

+5-9
Original file line numberDiff line numberDiff line change
@@ -3,15 +3,15 @@
33
To execute a query, you create a [Statement] instance and pass it to
44
`Session#execute()`. The driver provides various implementations:
55

6-
* [SimpleStatement]: a simple implementation built directly from a
6+
* [SimpleStatement](simple/): a simple implementation built directly from a
77
character string. Typically used for queries that are executed only
88
once or a few times.
9-
* [BoundStatement]: obtained by binding values to a [prepared
10-
statement](prepared/). Typically used for queries that are executed
9+
* [BoundStatement](prepared/): obtained by binding values to a prepared
10+
statement. Typically used for queries that are executed
1111
often, with different values.
12-
* [BuiltStatement]: a statement built with the [QueryBuilder] DSL. It
12+
* [BuiltStatement](built/): a statement built with the [QueryBuilder] DSL. It
1313
can be executed directly like a simple statement, or prepared.
14-
* [BatchStatement]: a statement that groups multiple statements to be
14+
* [BatchStatement](batch/): a statement that groups multiple statements to be
1515
executed as a batch.
1616

1717
`Session` also has a shortcut to build and execute a simple statement in
@@ -43,11 +43,7 @@ properties that influence statement execution. To achieve this, you can
4343
wrap your statements in a custom [StatementWrapper] implementation.
4444

4545
[Statement]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/Statement.html
46-
[SimpleStatement]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/SimpleStatement.html
47-
[BoundStatement]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/BoundStatement.html
48-
[BatchStatement]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/BatchStatement.html
4946
[QueryBuilder]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/querybuilder/QueryBuilder.html
50-
[BuiltStatement]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/querybuilder/BuiltStatement.html
5147
[StatementWrapper]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/StatementWrapper.html
5248
[RetryPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/RetryPolicy.html
5349
[LoadBalancingPolicy]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/LoadBalancingPolicy.html

manual/statements/batch/README.md

+5
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
## Batch statements
2+
3+
*Coming soon... In the meantime, see the javadoc for [BatchStatement].*
4+
5+
[BatchStatement]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/BatchStatement.html

manual/statements/built/README.md

+5
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
## Built statements
2+
3+
*Coming soon... In the meantime, see the javadoc for [QueryBuilder].*
4+
5+
[QueryBuilder]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/querybuilder/QueryBuilder.html

manual/statements/simple/README.md

+5
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
## Simple statements
2+
3+
*Coming soon... In the meantime, see the javadoc for [SimpleStatement].*
4+
5+
[SimpleStatement]: http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/SimpleStatement.html

0 commit comments

Comments
 (0)