title | summary |
---|---|
Best Practices for Read-Only Storage Nodes | This document introduces how to configure read-only storage nodes to isolate delay-tolerant workloads from online services. The steps include marking TiKV nodes as read-only, using Placement Rules to store data on read-only nodes as learners, and using Follower Read to read data from read-only nodes. |
This document introduces how to configure read-only storage nodes and how to direct backup, analysis, testing, and other traffic to these nodes. In this way, loads with high tolerance for delay can be physically isolated from important online services.
To specify some TiKV nodes as read-only, you can mark these nodes with a special label (use $ as the prefix of the label key). Unless you explicitly specify these nodes to store some data using Placement Rules, PD does not schedule any data to these nodes.
You can configure a read-only node by running the tiup cluster edit-config command:

    tikv_servers:
      - host: ...
        ...
        labels:
          $mode: readonly
- Run the pd-ctl config placement-rules command to export the default Placement Rules:

      pd-ctl config placement-rules rule-bundle load --out="rules.json"

  If you have not configured Placement Rules before, the output is as follows:

      [
        {
          "group_id": "pd",
          "group_index": 0,
          "group_override": false,
          "rules": [
            {
              "group_id": "pd",
              "id": "default",
              "start_key": "",
              "end_key": "",
              "role": "voter",
              "count": 3
            }
          ]
        }
      ]
- Store all data on the read-only nodes as a learner. The following example is based on the default configuration:

      [
        {
          "group_id": "pd",
          "group_index": 0,
          "group_override": false,
          "rules": [
            {
              "group_id": "pd",
              "id": "default",
              "start_key": "",
              "end_key": "",
              "role": "voter",
              "count": 3
            },
            {
              "group_id": "pd",
              "id": "readonly",
              "start_key": "",
              "end_key": "",
              "role": "learner",
              "count": 1,
              "label_constraints": [
                { "key": "$mode", "op": "in", "values": [ "readonly" ] }
              ],
              "version": 1
            }
          ]
        }
      ]
- Use the pd-ctl config placement-rules command to write the preceding configurations to PD:

      pd-ctl config placement-rules rule-bundle save --in="rules.json"
Note:
- If you perform the preceding operations on a cluster with a large dataset, the entire cluster might need some time to completely replicate data to read-only nodes. During this period, the read-only nodes might not be able to provide services.
- Because of the special implementation of backup, the number of learners for each label cannot exceed 1. Otherwise, duplicate data is generated during backup.
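
The edit to the exported rule bundle can also be scripted. The following is a minimal Python sketch, not an official tool: it appends the readonly learner rule from the preceding example to the pd rule group and then counts learners per label. The function names add_readonly_learner and learners_per_label are hypothetical, and the per-label count is only one reading of the note above (the summed count of learner rules that select a given label value).

```python
import copy
import json

# Default rule bundle, as shown in the exported rules.json example.
DEFAULT_BUNDLES = [
    {
        "group_id": "pd",
        "group_index": 0,
        "group_override": False,
        "rules": [
            {"group_id": "pd", "id": "default", "start_key": "",
             "end_key": "", "role": "voter", "count": 3},
        ],
    }
]

# Learner rule copied from the example: one learner replica on stores
# labeled $mode=readonly.
READONLY_RULE = {
    "group_id": "pd", "id": "readonly", "start_key": "", "end_key": "",
    "role": "learner", "count": 1,
    "label_constraints": [{"key": "$mode", "op": "in", "values": ["readonly"]}],
    "version": 1,
}


def add_readonly_learner(bundles, group_id="pd"):
    """Append READONLY_RULE to the given rule group unless its id already exists."""
    for bundle in bundles:
        if bundle["group_id"] != group_id:
            continue
        if not any(r["id"] == READONLY_RULE["id"] for r in bundle["rules"]):
            bundle["rules"].append(copy.deepcopy(READONLY_RULE))
        return bundles
    raise ValueError(f"rule group {group_id!r} not found")


def learners_per_label(bundles):
    """Sum the learner counts per (label key, label value) over all rules."""
    totals = {}
    for bundle in bundles:
        for rule in bundle["rules"]:
            if rule["role"] != "learner":
                continue
            for constraint in rule.get("label_constraints", []):
                if constraint["op"] != "in":
                    continue
                for value in constraint["values"]:
                    key = (constraint["key"], value)
                    totals[key] = totals.get(key, 0) + rule["count"]
    return totals


if __name__ == "__main__":
    bundles = add_readonly_learner(copy.deepcopy(DEFAULT_BUNDLES))
    too_many = {k: n for k, n in learners_per_label(bundles).items() if n > 1}
    if too_many:
        raise SystemExit(f"more than one learner for labels: {too_many}")
    # Save this output as rules.json before running rule-bundle save.
    print(json.dumps(bundles, indent=2))
```

In practice, you would load the bundle from the exported rules.json with json.load instead of using the inline default, and write the result back to the same file.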
To read data from read-only nodes when using TiDB, you can set the system variable tidb_replica_read to learner:

    set tidb_replica_read=learner;
To read data from read-only nodes when using TiSpark, you can set the configuration item spark.tispark.replica_read to learner in the Spark configuration file:

    spark.tispark.replica_read learner
To read data from read-only nodes when backing up cluster data, you can specify the --replica-read-label option in the br command line. Note that when running the following command in shell, you need to use single quotes to wrap the label to prevent $ from being parsed.

    tiup br backup full ... --replica-read-label '$mode:readonly'
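
If the backup command is assembled by a script, the quoting can be handled programmatically. The following sketch uses shlex.quote from the Python standard library to produce the single-quoted label shown above. The helper name replica_read_label_option is hypothetical, and the remaining br arguments are deliberately left out.

```python
import shlex


def replica_read_label_option(key, value):
    """Return the --replica-read-label option with the label quoted for a POSIX shell."""
    # shlex.quote wraps the label in single quotes because it contains "$",
    # so the shell passes it through without variable expansion.
    return "--replica-read-label " + shlex.quote(f"{key}:{value}")


if __name__ == "__main__":
    option = replica_read_label_option("$mode", "readonly")
    print(option)
    # Splitting the option the way a shell would recovers the literal label.
    print(shlex.split(option))
```

When the command is started with an argument list (for example, through subprocess.run without shell=True), no shell parsing takes place and the label can be passed unquoted as a single list element.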