title | summary |
---|---|
Placement Rules in SQL | Learn how to schedule placement of tables and partitions using SQL statements. |
Warning:
Placement Rules in SQL is an experimental feature introduced in v5.3.0. The syntax might change before GA, and there might be bugs. If you understand the risks, you can enable this experimental feature by executing the following statement:
SET GLOBAL tidb_enable_alter_placement = 1;
Placement Rules in SQL is a feature that enables you to specify where data is stored in a TiKV cluster using SQL interfaces. Using this feature, tables and partitions are scheduled to specific regions, data centers, racks, or hosts. This is useful for scenarios including optimizing a high availability strategy with lower cost, ensuring that local replicas of data are available for local stale reads, and adhering to data locality requirements.
The detailed user scenarios are as follows:
- Merge multiple databases of different applications to reduce the cost of database maintenance
- Increase replica count for important data to improve the application availability and data reliability
- Store new data on SSDs and old data on HDDs to lower the cost of data archiving and storage
- Schedule the leaders of hotspot data to high-performance TiKV instances
- Separate cold data to lower-cost storage mediums to improve cost efficiency
To use Placement Rules in SQL, you need to specify one or more placement options in a SQL statement. To specify placement options, you can use either direct placement or a placement policy.
In the following example, both tables t1 and t2 have the same rules. The rules for t1 are specified using direct placement, while the rules for t2 are specified using a placement policy.
CREATE TABLE t1 (a INT) PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1";
CREATE PLACEMENT POLICY eastandwest PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1";
CREATE TABLE t2 (a INT) PLACEMENT POLICY=eastandwest;
It is recommended to use placement policies for simpler rule management. When you change a placement policy (via ALTER PLACEMENT POLICY), the change automatically propagates to all database objects that use the policy.
If you use direct placement options, you have to alter rules for each object (for example, tables and partitions).
PLACEMENT POLICY is not associated with any database schema and has global scope. Therefore, assigning a placement policy does not require any additional privileges over the CREATE TABLE privilege.
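As a minimal sketch of the difference, assuming the eastandwest policy and the tables t1 and t2 from the example above, and assuming that ALTER TABLE accepts the same direct placement options as CREATE TABLE, adding followers might look like this. The statements restate every option rather than relying on omitted options being preserved.

```sql
-- One statement updates the policy; t2 (and any other object using the policy) follows it.
ALTER PLACEMENT POLICY eastandwest PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1" FOLLOWERS=4;

-- With direct placement, each object has to be altered separately (assumed ALTER TABLE form).
ALTER TABLE t1 PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1" FOLLOWERS=4;
```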
Note:
Placement options depend on labels correctly specified in the configuration of each TiKV node. For example, the PRIMARY_REGION option depends on the region label in TiKV. To see a summary of all labels available in your TiKV cluster, use the statement SHOW PLACEMENT LABELS:
mysql> show placement labels;
+--------+----------------+
| Key    | Values         |
+--------+----------------+
| disk   | ["ssd"]        |
| region | ["us-east-1"]  |
| zone   | ["us-east-1a"] |
+--------+----------------+
3 rows in set (0.00 sec)
Option Name | Description |
---|---|
PRIMARY_REGION | Raft leaders are placed in stores that have the region label that matches the value of this option. |
REGIONS | Raft followers are placed in stores that have the region label that matches the value of this option. |
SCHEDULE | The strategy used to schedule the placement of followers. The value options are EVEN (default) or MAJORITY_IN_PRIMARY. |
FOLLOWERS | The number of followers. For example, FOLLOWERS=2 means that there will be 3 replicas of the data (2 followers and 1 leader). |
In addition to the placement options above, you can also use the advanced configuration options. For details, see Advanced placement.
Option Name | Description |
---|---|
CONSTRAINTS | A list of constraints that apply to all roles. For example, CONSTRAINTS="[+disk=ssd]". |
FOLLOWER_CONSTRAINTS | A list of constraints that only apply to followers. |
The default configuration of max-replicas is 3. To increase this for a specific set of tables, you can use a placement policy as follows:
CREATE PLACEMENT POLICY fivereplicas FOLLOWERS=4;
CREATE TABLE t1 (a INT) PLACEMENT POLICY=fivereplicas;
Note that the PD max-replicas configuration counts both the leader and the followers, so 4 followers plus 1 leader equals 5 replicas in total.
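To check what was applied, the policy definition and the table's placement can be inspected. This is a sketch that assumes the SHOW CREATE PLACEMENT POLICY and SHOW PLACEMENT FOR statements are available in your TiDB version; the output is omitted because it depends on the cluster.

```sql
-- Show the definition of the policy created above.
SHOW CREATE PLACEMENT POLICY fivereplicas;

-- Show the placement that applies to table t1 (assumed statement form).
SHOW PLACEMENT FOR TABLE t1;

-- The table definition should also reference the attached policy.
SHOW CREATE TABLE t1;
```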
To expand on this example, you can also use PRIMARY_REGION and REGIONS placement options to describe the placement for the followers:
CREATE PLACEMENT POLICY eastandwest PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2,us-west-1" SCHEDULE="MAJORITY_IN_PRIMARY" FOLLOWERS=4;
CREATE TABLE t1 (a INT) PLACEMENT POLICY=eastandwest;
The SCHEDULE option instructs TiDB on how to balance the followers. The default schedule of EVEN ensures a balance of followers in all regions.
To ensure that enough followers are placed in the primary region (us-east-1) so that quorum can be achieved, you can use the MAJORITY_IN_PRIMARY schedule. This schedule helps provide lower-latency transactions at the expense of some availability. If the primary region fails, MAJORITY_IN_PRIMARY cannot provide automatic failover.
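The trade-off can be made concrete with the replica arithmetic. The sketch below uses two hypothetical policy names (spreadevenly and quorumineast) that differ only in SCHEDULE. How many replicas each region actually receives is decided by TiDB and PD, so the comments describe only the majority calculation and not an exact layout.

```sql
-- FOLLOWERS=4 gives 5 replicas (4 followers + 1 leader); a Raft majority of 5 is 3.

-- EVEN (the default): replicas are balanced across the three regions, so
-- reaching a majority of 3 normally involves more than one region.
CREATE PLACEMENT POLICY spreadevenly PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2,us-west-1" SCHEDULE="EVEN" FOLLOWERS=4;

-- MAJORITY_IN_PRIMARY: enough replicas are kept in us-east-1 to reach the
-- majority locally, at the cost of automatic failover if us-east-1 fails.
CREATE PLACEMENT POLICY quorumineast PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2,us-west-1" SCHEDULE="MAJORITY_IN_PRIMARY" FOLLOWERS=4;
```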
Note:
The following example uses list partitioning, which is currently an experimental feature of TiDB. Partitioned tables also require that the PRIMARY KEY include all columns used in the table's partitioning function.
In addition to assigning placement options to tables, you can also assign the options to table partitions. For example:
CREATE PLACEMENT POLICY europe PRIMARY_REGION="eu-central-1" REGIONS="eu-central-1,eu-west-1";
CREATE PLACEMENT POLICY northamerica PRIMARY_REGION="us-east-1" REGIONS="us-east-1";
SET tidb_enable_list_partition = 1;
CREATE TABLE t1 (
    country VARCHAR(10) NOT NULL,
    userdata VARCHAR(100) NOT NULL
) PARTITION BY LIST COLUMNS (country) (
    PARTITION pEurope VALUES IN ('DE', 'FR', 'GB') PLACEMENT POLICY=europe,
    PARTITION pNorthAmerica VALUES IN ('US', 'CA', 'MX') PLACEMENT POLICY=northamerica
);
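A short usage sketch, assuming the t1 table above and made-up sample rows: each row is routed to a partition by its country value, and so is stored according to that partition's policy.

```sql
INSERT INTO t1 (country, userdata) VALUES
    ('DE', 'row stored under the europe policy'),
    ('US', 'row stored under the northamerica policy');

-- Partition selection shows which rows landed in each partition.
SELECT country, userdata FROM t1 PARTITION (pEurope);
SELECT country, userdata FROM t1 PARTITION (pNorthAmerica);
```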
You can directly attach the default placement options to a database schema. This works similarly to setting the default character set or collation for a schema. Your specified placement options apply when no other options are specified. For example:
CREATE TABLE t1 (a INT); -- Creates a table t1 with no placement options.
ALTER DATABASE test FOLLOWERS=4; -- Changes the default placement option, and does not apply to the existing table t1.
CREATE TABLE t2 (a INT); -- Creates a table t2 with the default placement of FOLLOWERS=4.
CREATE TABLE t3 (a INT) PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-east-2"; -- Creates a table t3 without the default FOLLOWERS=4 placement, because this statement has specified another placement.
ALTER DATABASE test FOLLOWERS=2; -- Changes the default placement, and does not apply to existing tables.
CREATE TABLE t4 (a INT); -- Creates a table t4 with the default FOLLOWERS=2 option.
Because placement options are only inherited from the database schema when a table is created, it is recommended to set the default placement option using a PLACEMENT POLICY. This ensures that future changes to the policy propagate to existing tables.
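A minimal sketch of that recommendation, assuming that ALTER DATABASE accepts PLACEMENT POLICY in the same way as the direct options above. The policy name dbdefault and the table t5 are hypothetical.

```sql
CREATE PLACEMENT POLICY dbdefault FOLLOWERS=4;

-- Sets the default for tables created from now on (assumed ALTER DATABASE form).
ALTER DATABASE test PLACEMENT POLICY=dbdefault;

-- Inherits PLACEMENT POLICY=dbdefault from the schema at creation time.
CREATE TABLE t5 (a INT);

-- Later changes go through the policy, so t5 picks them up as well.
ALTER PLACEMENT POLICY dbdefault FOLLOWERS=2;
```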
The placement options PRIMARY_REGION, REGIONS, and SCHEDULE meet the basic needs of data placement at the cost of some flexibility. For more complex scenarios that need higher flexibility, you can also use the advanced placement options CONSTRAINTS and FOLLOWER_CONSTRAINTS. You cannot specify the PRIMARY_REGION, REGIONS, or SCHEDULE option together with the CONSTRAINTS option. If you specify both at the same time, an error is returned.
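For instance, a statement like the following mixes both kinds of options and is expected to be rejected. The policy name is hypothetical, and the exact error text is not shown because it may differ between versions.

```sql
-- Expected to fail: PRIMARY_REGION and REGIONS cannot be combined with CONSTRAINTS.
CREATE PLACEMENT POLICY mixedoptions PRIMARY_REGION="us-east-1" REGIONS="us-east-1,us-west-1" CONSTRAINTS="[+disk=ssd]";
```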
For example, to require that data reside on a TiKV store whose disk label matches a given value:
CREATE PLACEMENT POLICY storeonfastssd CONSTRAINTS="[+disk=ssd]";
CREATE PLACEMENT POLICY storeonhdd CONSTRAINTS="[+disk=hdd]";
CREATE PLACEMENT POLICY companystandardpolicy CONSTRAINTS="";
CREATE TABLE t1 (id INT, name VARCHAR(50), purchased DATE)
PLACEMENT POLICY=companystandardpolicy
PARTITION BY RANGE( YEAR(purchased) ) (
    PARTITION p0 VALUES LESS THAN (2000) PLACEMENT POLICY=storeonhdd,
    PARTITION p1 VALUES LESS THAN (2005),
    PARTITION p2 VALUES LESS THAN (2010),
    PARTITION p3 VALUES LESS THAN (2015),
    PARTITION p4 VALUES LESS THAN MAXVALUE PLACEMENT POLICY=storeonfastssd
);
You can specify constraints either in list format ([+disk=ssd]) or in dictionary format ({+disk=ssd: 1,+disk=hdd: 2}).
In list format, constraints are specified as a list of key-value pairs. Each key starts with either + or -: +disk=ssd indicates that the label disk must be set to ssd, and -disk=hdd indicates that the label disk must not be hdd.
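As a sketch of the list format, the following hypothetical policies use one negative constraint and a combination of two constraints. The label values are taken from the earlier examples and must exist in your cluster.

```sql
-- Keep data off stores labeled disk=hdd.
CREATE PLACEMENT POLICY avoidhdd CONSTRAINTS="[-disk=hdd]";

-- Require both labels: region=us-east-1 and disk=ssd.
CREATE PLACEMENT POLICY eastonssd CONSTRAINTS="[+region=us-east-1,+disk=ssd]";
```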
In dictionary format, constraints also indicate the number of instances that each rule applies to. For example, FOLLOWER_CONSTRAINTS="{+region=us-east-1: 1,+region=us-east-2: 1,+region=us-west-1: 1,+any: 1}" indicates that 1 follower is in us-east-1, 1 follower is in us-east-2, 1 follower is in us-west-1, and 1 follower can be in any region. For another example, FOLLOWER_CONSTRAINTS='{"+region=us-east-1,+disk=hdd":1,"+region=us-west-1":1}' indicates that 1 follower is in us-east-1 with an hdd disk, and 1 follower is in us-west-1.
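Put into a complete statement, the second example might look like the sketch below. The policy name and the table t6 are hypothetical, and the statement assumes that the follower count is taken from the dictionary, so FOLLOWERS is not specified separately.

```sql
-- One follower in us-east-1 on an hdd store, one follower in us-west-1.
CREATE PLACEMENT POLICY eastwestfollowers
    FOLLOWER_CONSTRAINTS='{"+region=us-east-1,+disk=hdd": 1, "+region=us-west-1": 1}';

CREATE TABLE t6 (a INT) PLACEMENT POLICY=eastwestfollowers;
```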
Note:
Dictionary and list formats are based on the YAML parser, but the YAML syntax might be incorrectly parsed. For example, "{+disk=ssd:1,+disk=hdd:2}" is incorrectly parsed as '{"+disk=ssd:1": null, "+disk=hdd:2": null}' because there is no space after the colons. But "{+disk=ssd: 1,+disk=hdd: 1}" is correctly parsed as '{"+disk=ssd": 1, "+disk=hdd": 1}'.
The following known limitations exist in the experimental release of Placement Rules in SQL:
- Dumpling does not support dumping placement policies. See issue #29371.
- TiDB tools, including Backup & Restore (BR), TiCDC, TiDB Lightning, and TiDB Data Migration (DM), do not yet support placement rules.
- Temporary tables do not support placement options (either via direct placement or placement policies).
- Syntactic sugar rules are currently available only for setting PRIMARY_REGION and REGIONS. In the future, we plan to add variants for PRIMARY_RACK, PRIMARY_ZONE, and PRIMARY_HOST. See issue #18030.
- TiFlash learners are not configurable through Placement Rules syntax.
- Placement rules only ensure that data at rest resides on the correct TiKV store. The rules do not guarantee that data in transit (via either user queries or internal operations) stays within a specific region.