diff --git a/TOC.md b/TOC.md index 19adfdad2bb76..c844f823b3b9f 100644 --- a/TOC.md +++ b/TOC.md @@ -119,7 +119,6 @@ - [PD Microservices Topology](/pd-microservices-deployment-topology.md) - [TiProxy Topology](/tiproxy/tiproxy-deployment-topology.md) - [TiCDC Topology](/ticdc-deployment-topology.md) - - [TiDB Binlog Topology](/tidb-binlog-deployment-topology.md) - [TiSpark Topology](/tispark-deployment-topology.md) - [Cross-DC Topology](/geo-distributed-deployment-topology.md) - [Hybrid Topology](/hybrid-deployment-topology.md) @@ -604,26 +603,6 @@ - [Troubleshoot](/ticdc/troubleshoot-ticdc.md) - [FAQs](/ticdc/ticdc-faq.md) - [Glossary](/ticdc/ticdc-glossary.md) - - TiDB Binlog (Deprecated) - - [Overview](/tidb-binlog/tidb-binlog-overview.md) - - [Quick Start](/tidb-binlog/get-started-with-tidb-binlog.md) - - [Deploy](/tidb-binlog/deploy-tidb-binlog.md) - - [Maintain](/tidb-binlog/maintain-tidb-binlog-cluster.md) - - [Configure](/tidb-binlog/tidb-binlog-configuration-file.md) - - [Pump](/tidb-binlog/tidb-binlog-configuration-file.md#pump) - - [Drainer](/tidb-binlog/tidb-binlog-configuration-file.md#drainer) - - [Upgrade](/tidb-binlog/upgrade-tidb-binlog.md) - - [Monitor](/tidb-binlog/monitor-tidb-binlog-cluster.md) - - [Reparo](/tidb-binlog/tidb-binlog-reparo.md) - - [binlogctl](/tidb-binlog/binlog-control.md) - - [Binlog Consumer Client](/tidb-binlog/binlog-consumer-client.md) - - [TiDB Binlog Relay Log](/tidb-binlog/tidb-binlog-relay-log.md) - - [Bidirectional Replication Between TiDB Clusters](/tidb-binlog/bidirectional-replication-between-tidb-clusters.md) - - [Glossary](/tidb-binlog/tidb-binlog-glossary.md) - - Troubleshoot - - [Troubleshoot](/tidb-binlog/troubleshoot-tidb-binlog.md) - - [Handle Errors](/tidb-binlog/handle-tidb-binlog-errors.md) - - [FAQ](/tidb-binlog/tidb-binlog-faq.md) - PingCAP Clinic Diagnostic Service - [Overview](/clinic/clinic-introduction.md) - [Quick Start](/clinic/quick-start-with-clinic.md) @@ -766,8 +745,6 @@ - [`CALIBRATE RESOURCE`](/sql-statements/sql-statement-calibrate-resource.md) - [`CANCEL IMPORT JOB`](/sql-statements/sql-statement-cancel-import-job.md) - [`COMMIT`](/sql-statements/sql-statement-commit.md) - - [`CHANGE DRAINER`](/sql-statements/sql-statement-change-drainer.md) - - [`CHANGE PUMP`](/sql-statements/sql-statement-change-pump.md) - [`CREATE BINDING`](/sql-statements/sql-statement-create-binding.md) - [`CREATE DATABASE`](/sql-statements/sql-statement-create-database.md) - [`CREATE INDEX`](/sql-statements/sql-statement-create-index.md) @@ -847,7 +824,6 @@ - [`SHOW CREATE TABLE`](/sql-statements/sql-statement-show-create-table.md) - [`SHOW CREATE USER`](/sql-statements/sql-statement-show-create-user.md) - [`SHOW DATABASES`](/sql-statements/sql-statement-show-databases.md) - - [`SHOW DRAINER STATUS`](/sql-statements/sql-statement-show-drainer-status.md) - [`SHOW ENGINES`](/sql-statements/sql-statement-show-engines.md) - [`SHOW ERRORS`](/sql-statements/sql-statement-show-errors.md) - [`SHOW FIELDS FROM`](/sql-statements/sql-statement-show-fields-from.md) @@ -862,7 +838,6 @@ - [`SHOW PRIVILEGES`](/sql-statements/sql-statement-show-privileges.md) - [`SHOW PROCESSLIST`](/sql-statements/sql-statement-show-processlist.md) - [`SHOW PROFILES`](/sql-statements/sql-statement-show-profiles.md) - - [`SHOW PUMP STATUS`](/sql-statements/sql-statement-show-pump-status.md) - [`SHOW SCHEMAS`](/sql-statements/sql-statement-show-schemas.md) - [`SHOW STATS_BUCKETS`](/sql-statements/sql-statement-show-stats-buckets.md) - [`SHOW 
STATS_HEALTHY`](/sql-statements/sql-statement-show-stats-healthy.md) diff --git a/alert-rules.md b/alert-rules.md index 19ef484f183e6..c4ef8c5a2e3e7 100644 --- a/alert-rules.md +++ b/alert-rules.md @@ -8,7 +8,7 @@ aliases: ['/docs/dev/alert-rules/','/docs/dev/reference/alert-rules/'] # TiDB Cluster Alert Rules -This document describes the alert rules for different components in a TiDB cluster, including the rule descriptions and solutions of the alert items in TiDB, TiKV, PD, TiFlash, TiDB Binlog, TiCDC, Node_exporter and Blackbox_exporter. +This document describes the alert rules for different components in a TiDB cluster, including the rule descriptions and solutions of the alert items in TiDB, TiKV, PD, TiFlash, TiCDC, Node_exporter and Blackbox_exporter. According to the severity level, alert rules are divided into three categories (from high to low): emergency-level, critical-level, and warning-level. This division of severity levels applies to all alert items of each component below. @@ -123,12 +123,11 @@ This section gives the alert rules for the TiDB component. The number of events that happen in the TiDB service. An alert is triggered when the following events happen: 1. start: The TiDB service starts. - 2. hang: When a critical-level event (currently there is only one scenario: TiDB cannot write binlog) happens, TiDB enters the `hang` mode and waits to be killed manually. + 2. hang: When a critical-level event happens, TiDB enters the `hang` mode and waits to be killed manually. * Solution: - * Restart TiDB to recover the service. - * Check whether the TiDB Binlog service is normal. + Restart TiDB to recover the service. #### `TiDB_tikvclient_backoff_seconds_count` @@ -774,10 +773,6 @@ This section gives the alert rules for the TiKV component. For the detailed descriptions of TiFlash alert rules, see [TiFlash Alert Rules](/tiflash/tiflash-alert-rules.md). -## TiDB Binlog alert rules - -For the detailed descriptions of TiDB Binlog alert rules, see [TiDB Binlog monitoring document](/tidb-binlog/monitor-tidb-binlog-cluster.md#alert-rules). - ## TiCDC Alert rules For the detailed descriptions of TiCDC alert rules, see [TiCDC Alert Rules](/ticdc/ticdc-alert-rules.md). @@ -967,38 +962,6 @@ This section gives the alert rules for the Blackbox_exporter TCP, ICMP, and HTTP * Check whether the TiFlash process exists. * Check whether the network between the monitoring machine and the TiFlash machine is normal. -#### `Pump_server_is_down` - -* Alert rule: - - `probe_success{group="pump"} == 0` - -* Description: - - Failure to probe the pump service port. - -* Solution: - - * Check whether the machine that provides the pump service is down. - * Check whether the pump process exists. - * Check whether the network between the monitoring machine and the pump machine is normal. - -#### `Drainer_server_is_down` - -* Alert rule: - - `probe_success{group="drainer"} == 0` - -* Description: - - Failure to probe the Drainer service port. - -* Solution: - - * Check whether the machine that provides the Drainer service is down. - * Check whether the Drainer process exists. - * Check whether the network between the monitoring machine and the Drainer machine is normal. 
- #### `TiKV_server_is_down` * Alert rule: diff --git a/basic-features.md b/basic-features.md index f64d06fbd0da0..714fff63d6197 100644 --- a/basic-features.md +++ b/basic-features.md @@ -215,7 +215,7 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u | [Dumpling logical dumper](/dumpling-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | [Transactional `LOAD DATA`](/sql-statements/sql-statement-load-data.md) [^5] | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | [Database migration toolkit (DM)](/migration-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | -| [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) [^6] | Deprecated | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | +| [TiDB Binlog](https://docs.pingcap.com/tidb/v8.3/tidb-binlog-overview) [^6] | Deprecated | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | [Change data capture (CDC)](/ticdc/ticdc-overview.md) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | | [Stream data to Amazon S3, GCS, Azure Blob Storage, and NFS through TiCDC](/ticdc/ticdc-sink-to-cloud-storage.md) | Y | Y | Y | Y | Y | E | N | N | N | N | N | | [TiCDC supports bidirectional replication between two TiDB clusters](/ticdc/ticdc-bidirectional-replication.md) | Y | Y | Y | Y | Y | Y | N | N | N | N | N | @@ -274,4 +274,4 @@ You can try out TiDB features on [TiDB Playground](https://play.tidbcloud.com/?u [^5]: Starting from [TiDB v7.0.0](/releases/release-7.0.0.md), the new parameter `FIELDS DEFINED NULL BY` and support for importing data from S3 and GCS are experimental features. Starting from [v7.6.0](/releases/release-7.6.0.md), TiDB processes `LOAD DATA` in transactions in the same way as MySQL. The `LOAD DATA` statement in a transaction no longer automatically commits the current transaction or starts a new transaction. Moreover, you can explicitly commit or roll back the `LOAD DATA` statement in a transaction. Additionally, the `LOAD DATA` statement is affected by the TiDB transaction mode setting (optimistic or pessimistic transaction). -[^6]: Starting from v7.5.0, [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). +[^6]: Starting from v7.5.0, [TiDB Binlog](https://docs.pingcap.com/tidb/v8.3/tidb-binlog-overview) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated. Starting from v8.4.0, TiDB Binlog is removed. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). diff --git a/binary-package.md b/binary-package.md index bd4f3f135c3ad..25b55cc13f676 100644 --- a/binary-package.md +++ b/binary-package.md @@ -59,12 +59,8 @@ The `TiDB-community-toolkit` package contains the following contents. 
| errdoc-{version}-linux-{arch}.tar.gz | | | dba-{version}-linux-{arch}.tar.gz | | | PCC-{version}-linux-{arch}.tar.gz | | -| pump-{version}-linux-{arch}.tar.gz | | -| drainer-{version}-linux-{arch}.tar.gz | | -| binlogctl | New in v6.0.0 | | sync_diff_inspector | | | reparo | | -| arbiter | | | server-{version}-linux-{arch}.tar.gz | New in v6.2.0 | | grafana-{version}-linux-{arch}.tar.gz | New in v6.2.0 | | alertmanager-{version}-linux-{arch}.tar.gz | New in v6.2.0 | diff --git a/clustered-indexes.md b/clustered-indexes.md index 31a74827c58fa..b2378948ce401 100644 --- a/clustered-indexes.md +++ b/clustered-indexes.md @@ -147,15 +147,6 @@ Currently, there are several different types of limitations for the clustered in - Downgrading tables with clustered indexes is not supported. If you need to downgrade such tables, use logical backup tools to migrate data instead. - Situations that are not supported yet but in the support plan: - Adding, dropping, and altering clustered indexes using `ALTER TABLE` statements are not supported. -- Limitations for specific versions: - - In v5.0, using the clustered index feature together with TiDB Binlog is not supported. After TiDB Binlog is enabled, TiDB only allows creating a single integer column as the clustered index of a primary key. TiDB Binlog does not replicate data changes (such as insertion, deletion, and update) on existing tables with clustered indexes to the downstream. If you need to replicate tables with clustered indexes to the downstream, upgrade your cluster to v5.1 or use [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) for replication instead. - -After TiDB Binlog is enabled, if the clustered index you create is not a single integer primary key, TiDB returns the following error: - -```sql -mysql> CREATE TABLE t (a VARCHAR(255) PRIMARY KEY CLUSTERED); -ERROR 8200 (HY000): Cannot create clustered index table when the binlog is ON -``` If you use clustered indexes together with the attribute `SHARD_ROW_ID_BITS`, TiDB reports the following error: diff --git a/command-line-flags-for-tidb-configuration.md b/command-line-flags-for-tidb-configuration.md index c56c6b41c7575..0f3b6bbd37f3f 100644 --- a/command-line-flags-for-tidb-configuration.md +++ b/command-line-flags-for-tidb-configuration.md @@ -35,11 +35,6 @@ When you start the TiDB cluster, you can use command-line options or environment - Specifies the `Access-Control-Allow-Origin` value for Cross-Origin Request Sharing (CORS) request of the TiDB HTTP status service - Default: `""` -## `--enable-binlog` - -+ Enables or disables TiDB binlog generation -+ Default: `false` - ## `--host` - The host address that the TiDB server monitors diff --git a/credits.md b/credits.md index 55cdcf23b9e48..4948c9239decd 100644 --- a/credits.md +++ b/credits.md @@ -17,7 +17,6 @@ TiDB developers contribute to new feature development, performance improvement, - [pingcap/tiflash](https://github.com/pingcap/tiflash/graphs/contributors) - [pingcap/tidb-operator](https://github.com/pingcap/tidb-operator/graphs/contributors) - [pingcap/tiup](https://github.com/pingcap/tiup/graphs/contributors) -- [pingcap/tidb-binlog](https://github.com/pingcap/tidb-binlog/graphs/contributors) - [pingcap/tidb-dashboard](https://github.com/pingcap/tidb-dashboard/graphs/contributors) - [pingcap/tiflow](https://github.com/pingcap/tiflow/graphs/contributors) - [pingcap/tidb-tools](https://github.com/pingcap/tidb-tools/graphs/contributors) diff --git a/download-ecosystem-tools.md b/download-ecosystem-tools.md index 
f9ba8e4869119..a8d576e274c61 100644 --- a/download-ecosystem-tools.md +++ b/download-ecosystem-tools.md @@ -45,7 +45,6 @@ Depending on which tools you want to use, you can install the corresponding offl | [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) | `tidb-lightning-ctl`
`tidb-lightning-{version}-linux-{arch}.tar.gz` | | [TiDB Data Migration (DM)](/dm/dm-overview.md) | `dm-worker-{version}-linux-{arch}.tar.gz`
`dm-master-{version}-linux-{arch}.tar.gz`
`dmctl-{version}-linux-{arch}.tar.gz` | | [TiCDC](/ticdc/ticdc-overview.md) | `cdc-{version}-linux-{arch}.tar.gz` | -| [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) | `pump-{version}-linux-{arch}.tar.gz`
`drainer-{version}-linux-{arch}.tar.gz`
`binlogctl`
`reparo` | | [Backup & Restore (BR)](/br/backup-and-restore-overview.md) | `br-{version}-linux-{arch}.tar.gz` | | [sync-diff-inspector](/sync-diff-inspector/sync-diff-inspector-overview.md) | `sync_diff_inspector` | | [PD Recover](/pd-recover.md) | `pd-recover-{version}-linux-{arch}.tar` | diff --git a/ecosystem-tool-user-guide.md b/ecosystem-tool-user-guide.md index 49e2d3312fdcd..4020707d49401 100644 --- a/ecosystem-tool-user-guide.md +++ b/ecosystem-tool-user-guide.md @@ -123,21 +123,6 @@ The following are the basics of TiCDC: - Target: TiDB clusters, MySQL, Kafka, and Confluent - Supported TiDB versions: v4.0.6 and later versions -### Incremental log replication - TiDB Binlog - -[TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) is a tool that collects binlog for TiDB clusters and provides nearly real-time data replication and backup. You can use it for incremental data replication between TiDB clusters, such as making a TiDB cluster the secondary cluster of the primary TiDB cluster. - -> **Warning:** -> -> Starting from v7.5.0, [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). - -The following are the basics of TiDB Binlog: - -- Source: TiDB clusters -- Target: TiDB clusters, MySQL, Kafka, or incremental backup files -- Supported TiDB versions: v2.1 and later versions -- Kubernetes support: Yes. See [TiDB Binlog Cluster Operations](https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-tidb-binlog) and [TiDB Binlog Drainer Configurations on Kubernetes](https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-tidb-binlog-drainer) for details. - ### sync-diff-inspector [sync-diff-inspector](/sync-diff-inspector/sync-diff-inspector-overview.md) is a tool that compares data stored in the MySQL or TiDB databases. In addition, you can also use sync-diff-inspector to repair data in the scenario where a small amount of data is inconsistent. diff --git a/faq/backup-and-restore-faq.md b/faq/backup-and-restore-faq.md index 936c5f4dda95c..3055c24ac4006 100644 --- a/faq/backup-and-restore-faq.md +++ b/faq/backup-and-restore-faq.md @@ -110,13 +110,13 @@ To address this problem, delete the current task using `br log stop`, and then c ## Feature compatibility issues -### Why does data restored using br command-line tool cannot be replicated to the upstream cluster of TiCDC or Drainer? +### Why does data restored using br command-line tool cannot be replicated to the upstream cluster of TiCDC? + **The data restored using BR cannot be replicated to the downstream**. This is because BR directly imports SST files but the downstream cluster currently cannot obtain these files from the upstream. -+ Before v4.0.3, DDL jobs generated during the restore might cause unexpected DDL executions in TiCDC/Drainer. Therefore, if you need to perform restore on the upstream cluster of TiCDC/Drainer, add all tables restored using br command-line tool to the TiCDC/Drainer block list. ++ Before v4.0.3, DDL jobs generated during the restore might cause unexpected DDL executions in TiCDC. Therefore, if you need to perform restore on the upstream cluster of TiCDC, add all tables restored using br command-line tool to the TiCDC block list. 
-You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f98153326912f3cf6ea2fbb7bcc4a20c/cmd/changefeed.toml#L16) to configure the block list for TiCDC and use [`syncer.ignore-table`](/tidb-binlog/tidb-binlog-configuration-file.md#ignore-table) to configure the block list for Drainer. +You can use [`filter.rules`](https://github.com/pingcap/tiflow/blob/7c3c2336f98153326912f3cf6ea2fbb7bcc4a20c/cmd/changefeed.toml#L16) to configure the block list for TiCDC. ### Why is `new_collation_enabled` mismatch reported during restore? diff --git a/faq/deploy-and-maintain-faq.md b/faq/deploy-and-maintain-faq.md index 87ece55c0dfaf..c15264d8e8ddf 100644 --- a/faq/deploy-and-maintain-faq.md +++ b/faq/deploy-and-maintain-faq.md @@ -27,7 +27,7 @@ If the resources are adequate, it is recommended to use RAID 10 for SSD. If the ### What's the recommended configuration of TiDB components? -- TiDB has a high requirement on CPU and memory. If you need to enable TiDB Binlog (deprecated), the local disk space should be increased based on the service volume estimation and the time requirement for the GC operation. But the SSD disk is not a must. +- TiDB has a high requirement on CPU and memory. - PD stores the cluster metadata and has frequent Read and Write requests. It demands a high I/O disk. A disk of low performance will affect the performance of the whole cluster. It is recommended to use SSD disks. In addition, a larger number of Regions has a higher requirement on CPU and memory. - TiKV has a high requirement on CPU, memory and disk. It is required to use SSD. @@ -70,8 +70,6 @@ Check the time difference between the machine time of the monitor and the time w | `enable_ntpd` | to monitor the NTP service of the managed node, True by default; do not close it | | `machine_benchmark` | to monitor the disk IOPS of the managed node, True by default; do not close it | | `set_hostname` | to edit the hostname of the managed node based on the IP, False by default | -| `enable_binlog` | whether to deploy Pump and enable the binlog, False by default, dependent on the Kafka cluster; see the `zookeeper_addrs` variable | -| `zookeeper_addrs` | the ZooKeeper address of the binlog Kafka cluster | | `enable_slow_query_log` | to record the slow query log of TiDB into a single file: ({{ deploy_dir }}/log/tidb_slow_query.log). False by default, to record it into the TiDB log | | `deploy_without_tidb` | the Key-Value mode, deploy only PD, TiKV and the monitoring service, not TiDB; set the IP of the tidb_servers host group to null in the `inventory.ini` file | diff --git a/faq/faq-overview.md b/faq/faq-overview.md index 242079f70da94..53dbc97d0ff71 100644 --- a/faq/faq-overview.md +++ b/faq/faq-overview.md @@ -37,7 +37,6 @@ This document summarizes frequently asked questions (FAQs) about TiDB.
  • Incremental data replication
  • diff --git a/faq/manage-cluster-faq.md index 30e4120e9b4de..7d18201c0ea34 100644 --- a/faq/manage-cluster-faq.md +++ b/faq/manage-cluster-faq.md @@ -128,10 +128,7 @@ Two reasons: ### Why does the transaction not use the Async Commit or the one-phase commit feature? -In the following situations, even you have enabled the [Async Commit](/system-variables.md#tidb_enable_async_commit-new-in-v50) feature and the [one-phase commit](/system-variables.md#tidb_enable_1pc-new-in-v50) feature using the system variables, TiDB will not use these features: - -- If you have enabled TiDB Binlog, restricted by the implementation of TiDB Binlog, TiDB does not use the Async Commit or one-phase commit feature. -- TiDB uses the Async Commit or one-phase commit features only when no more than 256 key-value pairs are written in the transaction and the total size of keys is no more than 4 KB. This is because, for transactions with a large amount of data to write, using Async Commit cannot greatly improve the performance. +TiDB uses the Async Commit or one-phase commit features only when no more than 256 key-value pairs are written in the transaction and the total size of keys is no more than 4 KB. Otherwise, even if you have enabled the [Async Commit](/system-variables.md#tidb_enable_async_commit-new-in-v50) feature and the [one-phase commit](/system-variables.md#tidb_enable_1pc-new-in-v50) feature using the system variables, TiDB will not use these features. This is because, for transactions with a large amount of data to write, using Async Commit cannot greatly improve the performance. ## PD management diff --git a/faq/migration-tidb-faq.md index 3d291fbe6569a..df3bc71a5c9dd 100644 --- a/faq/migration-tidb-faq.md +++ b/faq/migration-tidb-faq.md @@ -10,7 +10,6 @@ This document summarizes the frequently asked questions (FAQs) related to TiDB d For the frequently asked questions about migration-related tools, click the corresponding links in the list below: - [Backup & Restore FAQs](/faq/backup-and-restore-faq.md) -- [TiDB Binlog FAQ](/tidb-binlog/tidb-binlog-faq.md) - [TiDB Lightning FAQs](/tidb-lightning/tidb-lightning-faq.md) - [TiDB Data Migration (DM) FAQs](/dm/dm-faq.md) - [TiCDC FAQs](/ticdc/ticdc-faq.md) diff --git a/faq/upgrade-faq.md index ce6be8527ed2a..d60feef4df6c0 100644 --- a/faq/upgrade-faq.md +++ b/faq/upgrade-faq.md @@ -14,7 +14,7 @@ This section lists some FAQs and their solutions when you upgrade TiDB. ### What are the effects of rolling updates? -When you apply rolling updates to the TiDB services, the running application is affected to varying degrees. Therefore, it is not recommended that you perform a rolling update during business peak hours. You need to configure the minimum cluster topology (TiDB \* 2, PD \* 3, TiKV \* 3). If the Pump or Drainer service is involved in the cluster, it is recommended to stop Drainer before rolling updates. When you upgrade TiDB, Pump is also upgraded. +When you apply rolling updates to the TiDB services, the running application is affected to varying degrees. Therefore, it is not recommended that you perform a rolling update during business peak hours. You need to configure the minimum cluster topology (TiDB \* 2, PD \* 3, TiKV \* 3). ### Can I upgrade the TiDB cluster during the DDL execution? 
diff --git a/foreign-key.md b/foreign-key.md index 377a39ce065c5..724848ffd64d3 100644 --- a/foreign-key.md +++ b/foreign-key.md @@ -304,7 +304,6 @@ Create Table | CREATE TABLE `child` ( -- [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) does not support foreign keys. - [DM](/dm/dm-overview.md) does not support foreign keys. DM disables the [`foreign_key_checks`](/system-variables.md#foreign_key_checks) of the downstream TiDB when replicating data to TiDB. Therefore, the cascading operations caused by foreign keys are not replicated from the upstream to the downstream, which might cause data inconsistency. - [TiCDC](/ticdc/ticdc-overview.md) v6.6.0 is compatible with foreign keys. The previous versions of TiCDC might report an error when replicating tables with foreign keys. It is recommended to disable the `foreign_key_checks` of the downstream TiDB cluster when using a TiCDC version earlier than v6.6.0. - [BR](/br/backup-and-restore-overview.md) v6.6.0 is compatible with foreign keys. The previous versions of BR might report an error when restoring tables with foreign keys to a v6.6.0 or later cluster. It is recommended to disable the `foreign_key_checks` of the downstream TiDB cluster before restoring the cluster when using a BR earlier than v6.6.0. diff --git a/grafana-tidb-dashboard.md b/grafana-tidb-dashboard.md index 12c67538cdb1b..c470e08a2c2aa 100644 --- a/grafana-tidb-dashboard.md +++ b/grafana-tidb-dashboard.md @@ -53,7 +53,7 @@ To understand the key metrics displayed on the TiDB dashboard, check the followi - Panic And Critical Error: the number of panics and critical errors occurred in TiDB - Time Jump Back OPS: the number of times that the operating system rewinds every second on each TiDB instance - Get Token Duration: the time cost of getting Token on each connection -- Skip Binlog Count: the number of binlog write failures in TiDB +- Skip Binlog Count: the number of binlog write failures in TiDB; starting from v8.4.0, TiDB Binlog is removed, and this metric has no value - Client Data Traffic: data traffic statistics of TiDB and the client ### Transaction diff --git a/hardware-and-software-requirements.md b/hardware-and-software-requirements.md index 1eeba55a6cc70..75e9479d328bf 100644 --- a/hardware-and-software-requirements.md +++ b/hardware-and-software-requirements.md @@ -213,8 +213,6 @@ As an open-source distributed SQL database, TiDB requires the following network | TiFlash | 20170 |the TiFlash Proxy service port | | TiFlash | 20292 | the port for Prometheus to pull TiFlash Proxy metrics | | TiFlash | 8234 | the port for Prometheus to pull TiFlash metrics | -| Pump | 8250 | the Pump communication port | -| Drainer | 8249 | the Drainer communication port | | TiCDC | 8300 | the TiCDC communication port | | Monitoring | 9090 | the communication port for the Prometheus service| | Monitoring | 12020 | the communication port for the NgMonitoring service| diff --git a/identify-slow-queries.md b/identify-slow-queries.md index 9b7c7504eb3c9..0962505d3ea8d 100644 --- a/identify-slow-queries.md +++ b/identify-slow-queries.md @@ -101,7 +101,7 @@ The following fields are related to transaction execution: * `Write_keys`: The count of keys that the transaction writes to the Write CF in TiKV. * `Write_size`: The total size of the keys or values to be written when the transaction commits. * `Prewrite_region`: The number of TiKV Regions involved in the first phase (prewrite) of the two-phase transaction commit. Each Region triggers a remote procedure call. 
-* `Wait_prewrite_binlog_time`: The time used to write binlogs when a transaction is committed. +* `Wait_prewrite_binlog_time`: The time used to write binlogs when a transaction is committed. Starting from v8.4.0, TiDB Binlog is removed, and this field has no value. * `Resolve_lock_time`: The time to resolve or wait for the lock to be expired after a lock is encountered during a transaction commit. Memory usage fields: diff --git a/information-schema/information-schema-inspection-result.md b/information-schema/information-schema-inspection-result.md index 59a02bcf8acc6..8eee7f6436190 100644 --- a/information-schema/information-schema-inspection-result.md +++ b/information-schema/information-schema-inspection-result.md @@ -263,7 +263,6 @@ In `critical-error` diagnostic rule, the following two diagnostic rules are exec | Component | Error name | Monitoring table | Error description | | ---- | ---- | ---- | ---- | | TiDB | panic-count | tidb_panic_count_total_count | Panic occurs in TiDB. | - | TiDB | binlog-error | tidb_binlog_error_total_count | An error occurs when TiDB writes binlog. | | TiKV | critical-error | tikv_critical_error_total_count | The critical error of TiKV. | | TiKV | scheduler-is-busy | tikv_scheduler_is_busy_total_count | The TiKV scheduler is too busy, which makes TiKV temporarily unavailable. | | TiKV | coprocessor-is-busy | tikv_coprocessor_is_busy_total_count | The TiKV Coprocessor is too busy. | diff --git a/maintain-tidb-using-tiup.md b/maintain-tidb-using-tiup.md index 18e30b89617fe..fad48716104f4 100644 --- a/maintain-tidb-using-tiup.md +++ b/maintain-tidb-using-tiup.md @@ -31,7 +31,7 @@ tiup cluster list The components in the TiDB cluster are started in the following order: -**PD > TiKV > Pump > TiDB > TiFlash > Drainer > TiCDC > Prometheus > Grafana > Alertmanager** +**PD > TiKV > TiDB > TiFlash > TiCDC > Prometheus > Grafana > Alertmanager** To start the cluster, run the following command: @@ -201,7 +201,7 @@ tiup cluster rename ${cluster-name} ${new-name} The components in the TiDB cluster are stopped in the following order (The monitoring component is also stopped): -**Alertmanager > Grafana > Prometheus > TiCDC > Drainer > TiFlash > TiDB > Pump > TiKV > PD** +**Alertmanager > Grafana > Prometheus > TiCDC > TiFlash > TiDB > TiKV > PD** To stop the cluster, run the following command: diff --git a/migration-tools.md b/migration-tools.md index 6101f9b8cdc6c..c81dacf37c267 100644 --- a/migration-tools.md +++ b/migration-tools.md @@ -72,7 +72,7 @@ This document introduces the user scenarios, supported upstreams and downstreams - Suitable for migrating data to another TiDB cluster - Support backing up data to an external storage for disaster recovery - **Limitation**: - - When BR restores data to the upstream cluster of TiCDC or Drainer, the restored data cannot be replicated to the downstream by TiCDC or Drainer. + - When BR restores data to the upstream cluster of TiCDC, the restored data cannot be replicated to the downstream by TiCDC. - BR supports operations only between clusters that have the same `new_collation_enabled` value in the `mysql.tidb` table. 
## [sync-diff-inspector](/sync-diff-inspector/sync-diff-inspector-overview.md) diff --git a/placement-rules-in-sql.md b/placement-rules-in-sql.md index cc487746bb00a..468cde46b3796 100644 --- a/placement-rules-in-sql.md +++ b/placement-rules-in-sql.md @@ -471,7 +471,6 @@ After executing the statements in the example, TiDB will place the `app_order` d | Backup & Restore (BR) | 6.0 | Before v6.0, BR does not support backing up and restoring placement policies. For more information, see [Why does an error occur when I restore placement rules to a cluster](/faq/backup-and-restore-faq.md#why-does-an-error-occur-when-i-restore-placement-rules-to-a-cluster). | | TiDB Lightning | Not compatible yet | An error is reported when TiDB Lightning imports backup data that contains placement policies | | TiCDC | 6.0 | Ignores placement policies, and does not replicate the policies to the downstream | -| TiDB Binlog | 6.0 | Ignores placement policies, and does not replicate the policies to the downstream | diff --git a/production-deployment-using-tiup.md b/production-deployment-using-tiup.md index 77d5284431b5c..20176ea7371ca 100644 --- a/production-deployment-using-tiup.md +++ b/production-deployment-using-tiup.md @@ -8,7 +8,7 @@ aliases: ['/docs/dev/production-deployment-using-tiup/','/docs/dev/how-to/deploy [TiUP](https://github.com/pingcap/tiup) is a cluster operation and maintenance tool introduced in TiDB 4.0. TiUP provides [TiUP cluster](https://github.com/pingcap/tiup/tree/master/components/cluster), a cluster management component written in Golang. By using TiUP cluster, you can easily perform daily database operations, including deploying, starting, stopping, destroying, scaling, and upgrading a TiDB cluster, and manage TiDB cluster parameters. -TiUP supports deploying TiDB, TiFlash, [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) (deprecated), TiCDC, and the monitoring system. This document introduces how to deploy TiDB clusters of different topologies. +TiUP supports deploying TiDB, TiFlash, TiCDC, and the monitoring system. This document introduces how to deploy TiDB clusters of different topologies. ## Step 1. Prerequisites and precheck @@ -289,7 +289,6 @@ The following examples cover seven common scenarios. You need to modify the conf | OLTP | [Deploy minimal topology](/minimal-deployment-topology.md) | [Simple minimal configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-mini.yaml)
    [Full minimal configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-mini.yaml) | This is the basic cluster topology, including tidb-server, tikv-server, and pd-server. | | HTAP | [Deploy the TiFlash topology](/tiflash-deployment-topology.md) | [Simple TiFlash configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-tiflash.yaml)
    [Full TiFlash configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-tiflash.yaml) | This is to deploy TiFlash along with the minimal cluster topology. TiFlash is a columnar storage engine, and gradually becomes a standard cluster topology. | | Replicate incremental data using [TiCDC](/ticdc/ticdc-overview.md) | [Deploy the TiCDC topology](/ticdc-deployment-topology.md) | [Simple TiCDC configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-cdc.yaml)
    [Full TiCDC configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-cdc.yaml) | This is to deploy TiCDC along with the minimal cluster topology. TiCDC supports multiple downstream platforms, such as TiDB, MySQL, Kafka, MQ, and storage services. | -| Replicate incremental data using [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) | [Deploy the TiDB Binlog topology](/tidb-binlog-deployment-topology.md) | [Simple TiDB Binlog configuration template (MySQL as downstream)](https://github.com/pingcap/docs/blob/master/config-templates/simple-tidb-binlog.yaml)
    [Simple TiDB Binlog configuration template (Files as downstream)](https://github.com/pingcap/docs/blob/master/config-templates/simple-file-binlog.yaml)
    [Full TiDB Binlog configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-tidb-binlog.yaml) | This is to deploy TiDB Binlog along with the minimal cluster topology. | | Use OLAP on Spark | [Deploy the TiSpark topology](/tispark-deployment-topology.md) | [Simple TiSpark configuration template](https://github.com/pingcap/docs/blob/master/config-templates/simple-tispark.yaml)
    [Full TiSpark configuration template](https://github.com/pingcap/docs/blob/master/config-templates/complex-tispark.yaml) | This is to deploy TiSpark along with the minimal cluster topology. TiSpark is a component built for running Apache Spark on top of TiDB/TiKV to answer the OLAP queries. Currently, TiUP cluster's support for TiSpark is still **experimental**. | | Deploy multiple instances on a single machine | [Deploy a hybrid topology](/hybrid-deployment-topology.md) | [Simple configuration template for hybrid deployment](https://github.com/pingcap/docs/blob/master/config-templates/simple-multi-instance.yaml)
    [Full configuration template for hybrid deployment](https://github.com/pingcap/docs/blob/master/config-templates/complex-multi-instance.yaml) | The deployment topologies also apply when you need to add extra configurations for the directory, port, resource ratio, and label. | | Deploy TiDB clusters across data centers | [Deploy a geo-distributed deployment topology](/geo-distributed-deployment-topology.md) | [Configuration template for geo-distributed deployment](https://github.com/pingcap/docs/blob/master/config-templates/geo-redundancy-deployment.yaml) | This topology takes the typical architecture of three data centers in two cities as an example. It introduces the geo-distributed deployment architecture and the key configuration that requires attention. | diff --git a/releases/release-2.1-ga.md b/releases/release-2.1-ga.md index db3b5629a9bf1..bea8f9bbb67da 100644 --- a/releases/release-2.1-ga.md +++ b/releases/release-2.1-ga.md @@ -1,7 +1,7 @@ --- title: TiDB 2.1 GA Release Notes aliases: ['/docs/dev/releases/release-2.1-ga/','/docs/dev/releases/2.1ga/'] -summary: TiDB 2.1 GA was released on November 30, 2018, with significant improvements in stability, performance, compatibility, and usability. The release includes optimizations in SQL optimizer, SQL executor, statistics, expressions, server, DDL, compatibility, Placement Driver (PD), TiKV, and tools. It also introduces TiDB Lightning for fast full data import and supports new TiDB Binlog. However, TiDB 2.1 does not support downgrading to v2.0.x or earlier due to the adoption of the new storage engine. Additionally, parallel DDL is enabled in TiDB 2.1, so clusters with TiDB version earlier than 2.0.1 cannot upgrade to 2.1 using rolling update. If upgrading from TiDB 2.0.6 or earlier to TiDB 2.1, ongoing DDL operations may slow down the upgrading process. +summary: TiDB 2.1 GA was released on November 30, 2018, with significant improvements in stability, performance, compatibility, and usability. The release includes optimizations in SQL optimizer, SQL executor, statistics, expressions, server, DDL, compatibility, Placement Driver (PD), TiKV, and tools. It also introduces TiDB Lightning for fast full data import. However, TiDB 2.1 does not support downgrading to v2.0.x or earlier due to the adoption of the new storage engine. Additionally, parallel DDL is enabled in TiDB 2.1, so clusters with TiDB version earlier than 2.0.1 cannot upgrade to 2.1 using rolling update. If upgrading from TiDB 2.0.6 or earlier to TiDB 2.1, ongoing DDL operations may slow down the upgrading process. --- # TiDB 2.1 GA Release Notes @@ -261,7 +261,7 @@ On November 30, 2018, TiDB 2.1 GA is released. See the following updates in this - Fast full import of large amounts of data: [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md) -- Support new [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) +- Support new [TiDB Binlog](https://docs-archive.pingcap.com/tidb/v2.1/tidb-binlog-overview) ## Upgrade caveat diff --git a/releases/release-2.1-rc.5.md b/releases/release-2.1-rc.5.md index 46d91d91c955d..3f3376dda3443 100644 --- a/releases/release-2.1-rc.5.md +++ b/releases/release-2.1-rc.5.md @@ -1,7 +1,7 @@ --- title: TiDB 2.1 RC5 Release Notes aliases: ['/docs/dev/releases/release-2.1-rc.5/','/docs/dev/releases/21rc5/'] -summary: TiDB 2.1 RC5 was released on November 12, 2018, with improvements in stability, SQL optimizer, statistics, and execution engine. 
Fixes include issues with IndexReader, IndexScan Prepared statement, Union statement, and JSON data conversion. Server improvements include log readability, table data retrieval, and environment variable additions. PD fixes issues related to Region key reading, `regions/check` API, PD restart join, and event loss. TiKV improves error messages, adds panic mark file, downgrades grpcio, and adds an upper limit to the `kv_scan` interface. Tools now support the TiDB-Binlog cluster. +summary: TiDB 2.1 RC5 was released on November 12, 2018, with improvements in stability, SQL optimizer, statistics, and execution engine. Fixes include issues with IndexReader, IndexScan Prepared statement, Union statement, and JSON data conversion. Server improvements include log readability, table data retrieval, and environment variable additions. PD fixes issues related to Region key reading, `regions/check` API, PD restart join, and event loss. TiKV improves error messages, adds panic mark file, downgrades grpcio, and adds an upper limit to the `kv_scan` interface. --- @@ -61,4 +61,4 @@ On November 12, 2018, TiDB 2.1 RC5 is released. Compared with TiDB 2.1 RC4, this ## Tools -- Support the TiDB-Binlog cluster, which is not compatible with the older version of binlog [#8093](https://github.com/pingcap/tidb/pull/8093), [documentation](/tidb-binlog/tidb-binlog-overview.md) +- Support the TiDB-Binlog cluster, which is not compatible with the older version of binlog [#8093](https://github.com/pingcap/tidb/pull/8093), [documentation](https://docs-archive.pingcap.com/tidb/v2.1/tidb-binlog-overview) diff --git a/releases/release-7.5.0.md b/releases/release-7.5.0.md index 2c467f5c12564..9f221e6abdd8f 100644 --- a/releases/release-7.5.0.md +++ b/releases/release-7.5.0.md @@ -203,7 +203,7 @@ Starting from v7.5.0, the following contents are removed from the `TiDB-communit * TiKV-importer is deprecated in v7.5.0. It is strongly recommended that you use the [Physical Import Mode of TiDB Lightning](/tidb-lightning/tidb-lightning-physical-import-mode.md) as an alternative. -* Starting from TiDB v7.5.0, technical support for the data replication feature of [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) is no longer provided. It is strongly recommended to use [TiCDC](/ticdc/ticdc-overview.md) as an alternative solution for data replication. Although TiDB Binlog v7.5.0 still supports the Point-in-Time Recovery (PITR) scenario, this component will be completely deprecated in future versions. It is recommended to use [PITR](/br/br-pitr-guide.md) as an alternative solution for data recovery. +* Starting from TiDB v7.5.0, technical support for the data replication feature of [TiDB Binlog](https://docs.pingcap.com/tidb/v7.5/tidb-binlog-overview) is no longer provided. It is strongly recommended to use [TiCDC](/ticdc/ticdc-overview.md) as an alternative solution for data replication. Although TiDB Binlog v7.5.0 still supports the Point-in-Time Recovery (PITR) scenario, this component will be completely deprecated in future versions. It is recommended to use [PITR](/br/br-pitr-guide.md) as an alternative solution for data recovery. * The [`Fast Analyze`](/system-variables.md#tidb_enable_fast_analyze) feature (experimental) for statistics is deprecated in v7.5.0. 
diff --git a/releases/release-8.3.0.md b/releases/release-8.3.0.md index 686282479cdfd..32719c30a6202 100644 --- a/releases/release-8.3.0.md +++ b/releases/release-8.3.0.md @@ -239,7 +239,7 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v8.3/quick-start-with- * The following features are deprecated starting from v8.3.0: - * Starting from v7.5.0, [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). + * Starting from v7.5.0, [TiDB Binlog](https://docs.pingcap.com/tidb/v8.3/tidb-binlog-overview) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). * Starting from v8.3.0, the [`tidb_enable_column_tracking`](/system-variables.md#tidb_enable_column_tracking-new-in-v540) system variable is deprecated. TiDB tracks predicate columns by default. For more information, see [`tidb_analyze_column_options`](/system-variables.md#tidb_analyze_column_options-new-in-v830). * The following features are planned for deprecation in future versions: diff --git a/resources/doc-templates/patch_release_note_template_zh.md b/resources/doc-templates/patch_release_note_template_zh.md index 400d5628aea1f..c8cb99998c422 100644 --- a/resources/doc-templates/patch_release_note_template_zh.md +++ b/resources/doc-templates/patch_release_note_template_zh.md @@ -80,12 +80,6 @@ TiDB 版本:x.y.z - note [#issue](https://github.com/pingcap/tiup/issues/${issue-id}) @[贡献者 GitHub ID](https://github.com/${github-id}) - placeholder - + TiDB Binlog - - - note [#issue](https://github.com/pingcap/tidb-binlog/issues/${issue-id}) @[贡献者 GitHub ID](https://github.com/${github-id}) - - note [#issue](https://github.com/pingcap/tidb-binlog/issues/${issue-id}) @[贡献者 GitHub ID](https://github.com/${github-id}) - - placeholder - ## 错误修复 + TiDB @@ -150,12 +144,6 @@ TiDB 版本:x.y.z - note [#issue](https://github.com/pingcap/tiup/issues/${issue-id}) @[贡献者 GitHub ID](https://github.com/${github-id}) - placeholder - + TiDB Binlog - - - note [#issue](https://github.com/pingcap/tidb-binlog/issues/${issue-id}) @[贡献者 GitHub ID](https://github.com/${github-id}) - - note [#issue](https://github.com/pingcap/tidb-binlog/issues/${issue-id}) @[贡献者 GitHub ID](https://github.com/${github-id}) - - placeholder - ## Other dup notes - placeholder \ No newline at end of file diff --git a/scale-tidb-using-tiup.md b/scale-tidb-using-tiup.md index 52cf7052c09cd..a3f4d185b623b 100644 --- a/scale-tidb-using-tiup.md +++ b/scale-tidb-using-tiup.md @@ -284,7 +284,7 @@ This section exemplifies how to remove a TiKV node from the `10.0.1.5` host. > **Note:** > > - You can take similar steps to remove a TiDB or PD node. -> - Because the TiKV, TiFlash, and TiDB Binlog components are taken offline asynchronously and the stopping process takes a long time, TiUP takes them offline in different methods. For details, see [Particular handling of components' offline process](/tiup/tiup-component-cluster-scale-in.md#particular-handling-of-components-offline-process). 
+> - Because the TiKV and TiFlash components are taken offline asynchronously and the stopping process takes a long time, TiUP takes them offline in different methods. For details, see [Particular handling of components' offline process](/tiup/tiup-component-cluster-scale-in.md#particular-handling-of-components-offline-process). > - The PD Client in TiKV caches the list of PD nodes. The current version of TiKV has a mechanism to automatically and regularly update PD nodes, which can help mitigate the issue of an expired list of PD nodes cached by TiKV. However, after scaling out PD, you should try to avoid directly removing all PD nodes at once that exist before the scaling. If necessary, before making all the previously existing PD nodes offline, make sure to switch the PD leader to a newly added PD node. 1. View the node ID information: diff --git a/sql-statements/sql-statement-alter-table-compact.md index 22b4864d7fc9b..73c79c3e7b695 100644 --- a/sql-statements/sql-statement-alter-table-compact.md +++ b/sql-statements/sql-statement-alter-table-compact.md @@ -203,9 +203,9 @@ SELECT PARTITION_NAME, TOTAL_DELTA_ROWS, TOTAL_STABLE_ROWS The `ALTER TABLE ... COMPACT` syntax is TiDB specific, which is an extension to the standard SQL syntax. Although there is no equivalent MySQL syntax, you can still execute this statement by using MySQL clients or various database drivers that comply with the MySQL protocol. -### TiDB Binlog and TiCDC compatibility +### TiCDC compatibility -The `ALTER TABLE ... COMPACT` statement does not result in logical data changes and are therefore not replicated to downstream by TiDB Binlog or TiCDC. +The `ALTER TABLE ... COMPACT` statement does not result in logical data changes and is therefore not replicated to the downstream by TiCDC. ## See also diff --git a/sql-statements/sql-statement-change-drainer.md deleted file mode 100644 index 75039ac38957c..0000000000000 --- a/sql-statements/sql-statement-change-drainer.md +++ /dev/null @@ -1,75 +0,0 @@ ---- -title: CHANGE DRAINER -summary: An overview of the usage of CHANGE DRAINER for the TiDB database. -aliases: ['/docs/dev/sql-statements/sql-statement-change-drainer/'] ---- - -# CHANGE DRAINER - -The `CHANGE DRAINER` statement modifies the status information for Drainer in the cluster. - -> **Note:** -> -> This feature is only applicable to TiDB Self-Managed and not available on [TiDB Cloud](https://docs.pingcap.com/tidbcloud/). - -> **Tip:** -> -> Drainer's state is automatically reported to PD while running. Only when Drainer is under abnormal circumstances and its state is inconsistent with the state information stored in PD, you can use the `CHANGE DRAINER` statement to modify the state information stored in PD. 
- -## Examples - -{{< copyable "sql" >}} - -```sql -SHOW DRAINER STATUS; -``` - -```sql -+----------|----------------|--------|--------------------|---------------------| -| NodeID | Address | State | Max_Commit_Ts | Update_Time | -+----------|----------------|--------|--------------------|---------------------| -| drainer1 | 127.0.0.3:8249 | Online | 408553768673342532 | 2019-04-30 00:00:03 | -+----------|----------------|--------|--------------------|---------------------| -| drainer2 | 127.0.0.4:8249 | Online | 408553768673345531 | 2019-05-01 00:00:04 | -+----------|----------------|--------|--------------------|---------------------| -2 rows in set (0.00 sec) -``` - -It can be seen that drainer1's state has not been updated for more than a day, the Drainer is in an abnormal state, but the `State` remains `Online`. After using `CHANGE DRAINER`, the Drainer's `State` is changed to 'paused': - -{{< copyable "sql" >}} - -```sql -CHANGE DRAINER TO NODE_STATE ='paused' FOR NODE_ID 'drainer1'; -``` - -```sql -Query OK, 0 rows affected (0.01 sec) -``` - -{{< copyable "sql" >}} - -```sql -SHOW DRAINER STATUS; -``` - -```sql -+----------|----------------|--------|--------------------|---------------------| -| NodeID | Address | State | Max_Commit_Ts | Update_Time | -+----------|----------------|--------|--------------------|---------------------| -| drainer1 | 127.0.0.3:8249 | Paused | 408553768673342532 | 2019-04-30 00:00:03 | -+----------|----------------|--------|--------------------|---------------------| -| drainer2 | 127.0.0.4:8249 | Online | 408553768673345531 | 2019-05-01 00:00:04 | -+----------|----------------|--------|--------------------|---------------------| -2 rows in set (0.00 sec) -``` - -## MySQL compatibility - -This statement is a TiDB extension to MySQL syntax. - -## See also - -* [SHOW PUMP STATUS](/sql-statements/sql-statement-show-pump-status.md) -* [SHOW DRAINER STATUS](/sql-statements/sql-statement-show-drainer-status.md) -* [CHANGE PUMP STATUS](/sql-statements/sql-statement-change-pump.md) diff --git a/sql-statements/sql-statement-change-pump.md b/sql-statements/sql-statement-change-pump.md deleted file mode 100644 index 26e1f92ef9c8f..0000000000000 --- a/sql-statements/sql-statement-change-pump.md +++ /dev/null @@ -1,75 +0,0 @@ ---- -title: CHANGE PUMP -summary: An overview of the usage of CHANGE PUMP for the TiDB database. -aliases: ['/docs/dev/sql-statements/sql-statement-change-pump/'] ---- - -# CHANGE PUMP - -The `CHANGE PUMP` statement modifies the status information for Pump in the cluster. - -> **Note:** -> -> This feature is only applicable to TiDB Self-Managed and not available on [TiDB Cloud](https://docs.pingcap.com/tidbcloud/). - -> **Tip:** -> -> Pump's state is automatically reported to PD while running. Only when Pump is under abnormal circumstances and its state is inconsistent with the state information stored in PD, you can use the `CHANGE PUMP` statement to modify the state information stored in PD. 
- -## Examples - -{{< copyable "sql" >}} - -```sql -SHOW PUMP STATUS; -``` - -```sql -+--------|----------------|--------|--------------------|---------------------| -| NodeID | Address | State | Max_Commit_Ts | Update_Time | -+--------|----------------|--------|--------------------|---------------------| -| pump1 | 127.0.0.1:8250 | Online | 408553768673342237 | 2019-04-30 00:00:01 | -+--------|----------------|--------|--------------------|---------------------| -| pump2 | 127.0.0.2:8250 | Online | 408553768673342335 | 2019-05-01 00:00:02 | -+--------|----------------|--------|--------------------|---------------------| -2 rows in set (0.00 sec) -``` - -It can be seen that pump1's state has not been updated for more than a day, the Pump is in an abnormal state, but the `State` remains `Online`. After using `CHANGE PUMP`, the Pump's `State` is changed to 'paused' : - -{{< copyable "sql" >}} - -```sql -CHANGE PUMP TO NODE_STATE ='paused' FOR NODE_ID 'pump1'; -``` - -```sql -Query OK, 0 rows affected (0.01 sec) -``` - -{{< copyable "sql" >}} - -```sql -SHOW PUMP STATUS; -``` - -```sql -+--------|----------------|--------|--------------------|---------------------| -| NodeID | Address | State | Max_Commit_Ts | Update_Time | -+--------|----------------|--------|--------------------|---------------------| -| pump1 | 127.0.0.1:8250 | Paused | 408553768673342237 | 2019-04-30 00:00:01 | -+--------|----------------|--------|--------------------|---------------------| -| pump2 | 127.0.0.2:8250 | Online | 408553768673342335 | 2019-05-01 00:00:02 | -+--------|----------------|--------|--------------------|---------------------| -2 rows in set (0.00 sec) -``` - -## MySQL compatibility - -This statement is a TiDB extension to MySQL syntax. - -## See also - -* [SHOW PUMP STATUS](/sql-statements/sql-statement-show-pump-status.md) -* [SHOW DRAINER STATUS](/sql-statements/sql-statement-show-drainer-status.md) -* [CHANGE DRAINER STATUS](/sql-statements/sql-statement-change-drainer.md) diff --git a/sql-statements/sql-statement-flashback-database.md b/sql-statements/sql-statement-flashback-database.md index 5534d31294a8f..e9571a1c44e74 100644 --- a/sql-statements/sql-statement-flashback-database.md +++ b/sql-statements/sql-statement-flashback-database.md @@ -36,12 +36,6 @@ FlashbackToNewName ::= * You cannot restore the same database multiple times using the `FLASHBACK DATABASE` statement. Because the database restored by `FLASHBACK DATABASE` has the same schema ID as the original database, restoring the same database multiple times leads to duplicate schema IDs. In TiDB, the database schema ID must be globally unique. -* When TiDB Binlog (deprecated) is enabled, note the following when you use `FLASHBACK DATABASE`: - - * The downstream secondary database must support `FLASHBACK DATABASE`. - * The GC life time of the secondary database must be longer than that of the primary database. Otherwise, the latency between the upstream and the downstream might lead to data restoration failure in the downstream. - * If TiDB Binlog replication encounters an error, you need to filter out the database in TiDB Binlog and then manually import full data for this database. 
- ## Example - Restore the `test` database that is deleted by `DROP`: diff --git a/sql-statements/sql-statement-flashback-table.md b/sql-statements/sql-statement-flashback-table.md index dfe9f0a82d1e3..b2ceffbb15023 100644 --- a/sql-statements/sql-statement-flashback-table.md +++ b/sql-statements/sql-statement-flashback-table.md @@ -43,13 +43,6 @@ FlashbackToNewName ::= If a table is dropped and the GC lifetime has passed, you can no longer use the `FLASHBACK TABLE` statement to recover the dropped data. Otherwise, an error like `Can't find dropped / truncated table 't' in GC safe point 2020-03-16 16:34:52 +0800 CST` will be returned. -Pay attention to the following conditions and requirements when you enable TiDB Binlog (deprecated) and use the `FLASHBACK TABLE` statement: - -* The downstream secondary cluster must also support `FLASHBACK TABLE`. -* The GC lifetime of the secondary cluster must be longer than that of the primary cluster. -* The delay of replication between the upstream and downstream might also cause the failure to recover data to the downstream. -* If an error occurs when TiDB Binlog is replicating a table, you need to filter that table in TiDB Binlog and manually import all data of that table. - ## Example - Recover the table data dropped by the `DROP` operation: diff --git a/sql-statements/sql-statement-overview.md b/sql-statements/sql-statement-overview.md index bd34543f9460a..bce1656d9d276 100644 --- a/sql-statements/sql-statement-overview.md +++ b/sql-statements/sql-statement-overview.md @@ -290,18 +290,14 @@ TiDB uses SQL statements that aim to follow ISO/IEC SQL standards, with extensio | [`SHOW GRANTS`](/sql-statements/sql-statement-show-grants.md) | Shows privileges associated with a user. | | [`SHOW PRIVILEGES`](/sql-statements/sql-statement-show-privileges.md) | Shows available privileges. | -## TiCDC & TiDB Binlog +## TiCDC | SQL Statement | Description | |---------------|-------------| | [`ADMIN [SET\|SHOW\|UNSET] BDR ROLE`](/sql-statements/sql-statement-admin-bdr-role.md) | Manages BDR roles. | -| [`CHANGE DRAINER`](/sql-statements/sql-statement-change-drainer.md) | Modifies the status information for Drainer in the cluster. | -| [`CHANGE PUMP`](/sql-statements/sql-statement-change-pump.md) | Modifies the status information for Pump in the cluster. | -| [`SHOW DRAINER STATUS`](/sql-statements/sql-statement-show-drainer-status.md) | Shows the status for all Drainer nodes in the cluster. | | [`SHOW MASTER STATUS`](/sql-statements/sql-statement-show-master-status.md) | Shows the latest TSO in the cluster. | -| [`SHOW PUMP STATUS`](/sql-statements/sql-statement-show-pump-status.md) | Shows the status information for all Pump nodes in the cluster. | @@ -309,7 +305,7 @@ TiDB uses SQL statements that aim to follow ISO/IEC SQL standards, with extensio > **Note:** > -> [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) & [TiDB Binlog](https://docs.pingcap.com/tidb/stable/tidb-binlog-overview) are tools for replicating TiDB data to the upstream for TiDB Self-Managed. Most SQL statements for TiCDC and TiDB Binlog are not applicable to TiDB Cloud. For TiDB Cloud, you can use the [Changefeed](/tidb-cloud/changefeed-overview.md) feature in the [TiDB Cloud console](https://tidbcloud.com) instead to stream data. +> [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) & [TiDB Binlog](https://docs.pingcap.com/tidb/v8.3/tidb-binlog-overview) are tools for replicating TiDB data to the upstream for TiDB Self-Managed. 
Most SQL statements for TiCDC and TiDB Binlog are not applicable to TiDB Cloud. For TiDB Cloud, you can use the [Changefeed](/tidb-cloud/changefeed-overview.md) feature in the [TiDB Cloud console](https://tidbcloud.com) instead to stream data. | SQL Statement | Description | |---------------|-------------| diff --git a/sql-statements/sql-statement-recover-table.md b/sql-statements/sql-statement-recover-table.md index 5bfb62e9259f2..6b879fad6cb88 100644 --- a/sql-statements/sql-statement-recover-table.md +++ b/sql-statements/sql-statement-recover-table.md @@ -38,31 +38,7 @@ NUM ::= intLit > **Note:** > -> + If a table is deleted and the GC lifetime is out, the table cannot be recovered with `RECOVER TABLE`. Execution of `RECOVER TABLE` in this scenario returns an error like: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. -> -> + If the TiDB version is 3.0.0 or later, it is not recommended for you to use `RECOVER TABLE` when TiDB Binlog (deprecated) is used. -> -> + `RECOVER TABLE` is supported in the Binlog version 3.0.1, so you can use `RECOVER TABLE` in the following three situations: -> -> - Binlog version is 3.0.1 or later. -> - TiDB 3.0 is used both in the upstream cluster and the downstream cluster. -> - The GC life time of the secondary cluster must be longer than that of the primary cluster. However, as latency occurs during data replication between upstream and downstream databases, data recovery might fail in the downstream. - - - -**Troubleshoot errors during TiDB Binlog replication** - -When you use `RECOVER TABLE` in the upstream TiDB during TiDB Binlog replication, TiDB Binlog might be interrupted in the following three situations: - -+ The downstream database does not support the `RECOVER TABLE` statement. An error instance: `check the manual that corresponds to your MySQL server version for the right syntax to use near 'RECOVER TABLE table_name'`. - -+ The GC life time is not consistent between the upstream database and the downstream database. An error instance: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. - -+ Latency occurs during replication between upstream and downstream databases. An error instance: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. - -For the above three situations, you can resume data replication from TiDB Binlog with a [full import of the deleted table](/ecosystem-tool-user-guide.md#backup-and-restore---backup--restore-br). - - +> If a table is deleted and the GC lifetime is out, the table cannot be recovered with `RECOVER TABLE`. Execution of `RECOVER TABLE` in this scenario returns an error like: `snapshot is older than GC safe point 2019-07-10 13:45:57 +0800 CST`. ## Examples diff --git a/sql-statements/sql-statement-savepoint.md b/sql-statements/sql-statement-savepoint.md index 735ef8ccf352b..02829ad93d8c5 100644 --- a/sql-statements/sql-statement-savepoint.md +++ b/sql-statements/sql-statement-savepoint.md @@ -15,8 +15,7 @@ RELEASE SAVEPOINT identifier > **Warning:** > -> - You cannot use `SAVEPOINT` with TiDB Binlog enabled. -> - You cannot use `SAVEPOINT` in pessimistic transactions when [`tidb_constraint_check_in_place_pessimistic`](/system-variables.md#tidb_constraint_check_in_place_pessimistic-new-in-v630) is disabled. +> You cannot use `SAVEPOINT` in pessimistic transactions when [`tidb_constraint_check_in_place_pessimistic`](/system-variables.md#tidb_constraint_check_in_place_pessimistic-new-in-v630) is disabled. 
- `SAVEPOINT` is used to set a savepoint of a specified name in the current transaction. If a savepoint with the same name already exists, it will be deleted and a new savepoint with the same name will be set. diff --git a/sql-statements/sql-statement-show-drainer-status.md b/sql-statements/sql-statement-show-drainer-status.md deleted file mode 100644 index ecaf88606e2dc..0000000000000 --- a/sql-statements/sql-statement-show-drainer-status.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: SHOW DRAINER STATUS -summary: An overview of the usage of SHOW DRAINER STATUS for the TiDB database. -aliases: ['/docs/dev/sql-statements/sql-statement-show-drainer-status/'] ---- - -# SHOW DRAINER STATUS - -The `SHOW DRAINER STATUS` statement displays the status information for all Drainer nodes in the cluster. - -> **Note:** -> -> This feature is only applicable to TiDB Self-Managed and not available on [TiDB Cloud](https://docs.pingcap.com/tidbcloud/). - -## Examples - -{{< copyable "sql" >}} - -```sql -SHOW DRAINER STATUS; -``` - -```sql -+----------|----------------|--------|--------------------|---------------------| -| NodeID | Address | State | Max_Commit_Ts | Update_Time | -+----------|----------------|--------|--------------------|---------------------| -| drainer1 | 127.0.0.3:8249 | Online | 408553768673342532 | 2019-05-01 00:00:03 | -+----------|----------------|--------|--------------------|---------------------| -| drainer2 | 127.0.0.4:8249 | Online | 408553768673345531 | 2019-05-01 00:00:04 | -+----------|----------------|--------|--------------------|---------------------| -2 rows in set (0.00 sec) -``` - -## MySQL compatibility - -This statement is a TiDB extension to MySQL syntax. - -## See also - -* [SHOW PUMP STATUS](/sql-statements/sql-statement-show-pump-status.md) -* [CHANGE PUMP STATUS](/sql-statements/sql-statement-change-pump.md) -* [CHANGE DRAINER STATUS](/sql-statements/sql-statement-change-drainer.md) diff --git a/sql-statements/sql-statement-show-master-status.md b/sql-statements/sql-statement-show-master-status.md index 0ba461b6a5669..4c654f8b19620 100644 --- a/sql-statements/sql-statement-show-master-status.md +++ b/sql-statements/sql-statement-show-master-status.md @@ -29,21 +29,4 @@ SHOW MASTER STATUS; The output of `SHOW MASTER STATUS` is designed to match MySQL. However, the execution results are different in that the MySQL result is the binlog location information and the TiDB result is the latest TSO information. -The `SHOW BINARY LOG STATUS` statement was added in TiDB as an alias for `SHOW MASTER STATUS`, which has been deprecated in MySQL 8.2.0 and newer versions. - -## See also - - - -* [SHOW PUMP STATUS](/sql-statements/sql-statement-show-pump-status.md) -* [SHOW DRAINER STATUS](/sql-statements/sql-statement-show-drainer-status.md) -* [CHANGE PUMP STATUS](/sql-statements/sql-statement-change-pump.md) -* [CHANGE DRAINER STATUS](/sql-statements/sql-statement-change-drainer.md) - - - - - -* [`SHOW TABLE STATUS`](/sql-statements/sql-statement-show-table-status.md) - - +The `SHOW BINARY LOG STATUS` statement was added in TiDB as an alias for `SHOW MASTER STATUS`, which has been deprecated in MySQL 8.2.0 and newer versions. 
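A minimal sketch of exercising both statements from the shell is shown below; the host, port, and user are assumptions for a default local TiDB deployment listening on port 4000.

```bash
# Both statements return a single row in TiDB; the Position column carries the
# latest TSO rather than a binlog file offset as in MySQL (assumed local TiDB on port 4000).
mysql -h 127.0.0.1 -P 4000 -u root -e "SHOW MASTER STATUS"
mysql -h 127.0.0.1 -P 4000 -u root -e "SHOW BINARY LOG STATUS"
```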
diff --git a/sql-statements/sql-statement-show-pump-status.md b/sql-statements/sql-statement-show-pump-status.md deleted file mode 100644 index 43b0c4c970a7e..0000000000000 --- a/sql-statements/sql-statement-show-pump-status.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: SHOW PUMP STATUS -summary: An overview of the usage of SHOW PUMP STATUS for the TiDB database. -aliases: ['/docs/dev/sql-statements/sql-statement-show-pump-status/'] ---- - -# SHOW PUMP STATUS - -The `SHOW PUMP STATUS` statement displays the status information for all Pump nodes in the cluster. - -> **Note:** -> -> This feature is only applicable to TiDB Self-Managed and not available on [TiDB Cloud](https://docs.pingcap.com/tidbcloud/). - -## Examples - -{{< copyable "sql" >}} - -```sql -SHOW PUMP STATUS; -``` - -```sql -+--------|----------------|--------|--------------------|---------------------| -| NodeID | Address | State | Max_Commit_Ts | Update_Time | -+--------|----------------|--------|--------------------|---------------------| -| pump1 | 127.0.0.1:8250 | Online | 408553768673342237 | 2019-05-01 00:00:01 | -+--------|----------------|--------|--------------------|---------------------| -| pump2 | 127.0.0.2:8250 | Online | 408553768673342335 | 2019-05-01 00:00:02 | -+--------|----------------|--------|--------------------|---------------------| -2 rows in set (0.00 sec) -``` - -## MySQL compatibility - -This statement is a TiDB extension to MySQL syntax. - -## See also - -* [SHOW DRAINER STATUS](/sql-statements/sql-statement-show-drainer-status.md) -* [CHANGE PUMP STATUS](/sql-statements/sql-statement-change-pump.md) -* [CHANGE DRAINER STATUS](/sql-statements/sql-statement-change-drainer.md) diff --git a/system-variables.md b/system-variables.md index d711456de8978..56c5834609099 100644 --- a/system-variables.md +++ b/system-variables.md @@ -644,14 +644,6 @@ This variable is an alias for [`last_insert_id`](#last_insert_id). - Default value: `Apache License 2.0` - This variable indicates the license of your TiDB server installation. -### log_bin - -- Scope: NONE -- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No -- Type: Boolean -- Default value: `OFF` -- This variable indicates whether [TiDB Binlog](https://docs.pingcap.com/tidb/stable/tidb-binlog-overview) (deprecated) is used. - ### max_allowed_packet New in v6.1.0 > **Note:** @@ -912,23 +904,6 @@ mysql> SHOW GLOBAL VARIABLES LIKE 'max_prepared_stmt_count'; - Default value: "" - The local unix socket file that the `tidb-server` is listening on when speaking the MySQL protocol. -### sql_log_bin - -> **Note:** -> -> This variable is read-only for [TiDB Cloud Serverless](https://docs.pingcap.com/tidbcloud/select-cluster-tier#tidb-cloud-serverless). - -- Scope: SESSION | GLOBAL -- Persists to cluster: Yes -- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No -- Type: Boolean -- Default value: `ON` -- Indicates whether to write changes to [TiDB Binlog](https://docs.pingcap.com/tidb/stable/tidb-binlog-overview) or not. - -> **Note:** -> -> It is not recommended to set `sql_log_bin` as a global variable because the future versions of TiDB might only allow setting this as a session variable. - ### sql_mode - Scope: SESSION | GLOBAL @@ -1865,7 +1840,6 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; > **Note:** > > - The default value of `ON` only applies to new clusters. if your cluster was upgraded from an earlier version of TiDB, the value `OFF` will be used instead. 
-> - If you have enabled TiDB Binlog (deprecated), enabling this variable cannot improve the performance. To improve the performance, it is recommended to use [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) instead. > - Enabling this parameter only means that one-phase commit becomes an optional mode of transaction commit. In fact, the most suitable mode of transaction commit is determined by TiDB. ### tidb_enable_analyze_snapshot New in v6.2.0 @@ -1898,7 +1872,6 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1; > **Note:** > > - The default value of `ON` only applies to new clusters. if your cluster was upgraded from an earlier version of TiDB, the value `OFF` will be used instead. -> - If you have enabled TiDB Binlog (deprecated), enabling this variable cannot improve the performance. To improve the performance, it is recommended to use [TiCDC](https://docs.pingcap.com/tidb/stable/ticdc-overview) instead. > - Enabling this parameter only means that Async Commit becomes an optional mode of transaction commit. In fact, the most suitable mode of transaction commit is determined by TiDB. ### tidb_enable_auto_analyze New in v6.1.0 diff --git a/table-attributes.md b/table-attributes.md index 371a9fce4e9b7..850e761e01446 100644 --- a/table-attributes.md +++ b/table-attributes.md @@ -21,7 +21,7 @@ Currently, TiDB only supports adding the `merge_option` attribute to a table or > **Note:** > -> - When you use TiDB Binlog or TiCDC to perform replication or use BR to perform incremental backup, the replication or backup operations skip the DDL statement that sets table attributes. To use table attributes in the downstream or in the backup cluster, you need to manually execute the DDL statement in the downstream or in the backup cluster. +> When you use TiCDC to perform replication or use BR to perform incremental backup, the replication or backup operations skip the DDL statement that sets table attributes. To use table attributes in the downstream or in the backup cluster, you need to manually execute the DDL statement in the downstream or in the backup cluster. ## Usage diff --git a/three-data-centers-in-two-cities-deployment.md b/three-data-centers-in-two-cities-deployment.md index d8b5e661fb2ad..040f929c9bcf8 100644 --- a/three-data-centers-in-two-cities-deployment.md +++ b/three-data-centers-in-two-cities-deployment.md @@ -56,8 +56,6 @@ In the rac1 of AZ1, one server is deployed with TiDB and PD services, and the ot The TiDB server, the control machine, and the monitoring server are on rac3. The TiDB server is deployed for regular maintenance and backup. Prometheus, Grafana, and the restore tools are deployed on the control machine and monitoring machine. -Another backup server can be added to deploy Drainer. Drainer saves binlog data to a specified location by outputting files, to achieve incremental backup. - ## Configuration ### Example diff --git a/ticdc-deployment-topology.md b/ticdc-deployment-topology.md index a2ef9c8acfe8f..69ec504b9bd07 100644 --- a/ticdc-deployment-topology.md +++ b/ticdc-deployment-topology.md @@ -12,7 +12,7 @@ aliases: ['/docs/dev/ticdc-deployment-topology/'] This document describes the deployment topology of [TiCDC](/ticdc/ticdc-overview.md) based on the minimal cluster topology. -TiCDC is a tool for replicating the incremental data of TiDB, introduced in TiDB 4.0. It supports multiple downstream platforms, such as TiDB, MySQL, Kafka, MQ, and storage services. 
Compared with TiDB Binlog (deprecated), TiCDC has lower latency and native high availability. +TiCDC is a tool for replicating the incremental data of TiDB, introduced in TiDB 4.0. It supports multiple downstream platforms, such as TiDB, MySQL, Kafka, MQ, and storage services. TiCDC has low latency and native high availability. ## Topology information diff --git a/ticdc/deploy-ticdc.md b/ticdc/deploy-ticdc.md index dcef00f6019d2..ed46e9bbfc38f 100644 --- a/ticdc/deploy-ticdc.md +++ b/ticdc/deploy-ticdc.md @@ -128,8 +128,6 @@ This section describes how to use the [`tiup cluster edit-config`](/tiup/tiup-co pd: {} tiflash: {} tiflash-learner: {} - pump: {} - drainer: {} cdc: gc-ttl: 172800 ``` diff --git a/tidb-binlog-deployment-topology.md b/tidb-binlog-deployment-topology.md deleted file mode 100644 index 492d8ce62b430..0000000000000 --- a/tidb-binlog-deployment-topology.md +++ /dev/null @@ -1,60 +0,0 @@ ---- -title: TiDB Binlog Deployment Topology -summary: Learn the deployment topology of TiDB Binlog based on the minimal TiDB topology. -aliases: ['/docs/dev/tidb-binlog-deployment-topology/'] ---- - -# TiDB Binlog Deployment Topology - -This document describes the deployment topology of [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) based on the minimal TiDB topology. TiDB Binlog provides near real-time backup and replication. - -> **Warning:** -> -> Starting from v7.5.0, [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). - -## Topology information - -| Instance | Count | Physical machine configuration | IP | Configuration | -| :-- | :-- | :-- | :-- | :-- | -| TiDB | 3 | 16 VCore 32 GB | 10.0.1.1
    10.0.1.2
    10.0.1.3 | Default port configuration;
    Enable `enable_binlog`;
    Enable `ignore-error` | -| PD | 3 | 4 VCore 8 GB | 10.0.1.4
    10.0.1.5
    10.0.1.6 | Default port configuration | -| TiKV | 3 | 16 VCore 32 GB | 10.0.1.7
    10.0.1.8
    10.0.1.9 | Default port configuration | -| Pump | 3 | 8 VCore 16 GB | 10.0.1.1<br/>
    10.0.1.7
    10.0.1.8 | Default port configuration;
    Set GC time to 7 days | -| Drainer | 1 | 8 VCore 16 GB | 10.0.1.12 | Default port configuration;<br/>
    Set the default initialization commitTS to -1 to use the latest timestamp;<br/>
    Configure the downstream target TiDB as `10.0.1.12:4000` | - -### Topology templates - -- [The simple template for the TiDB Binlog topology (with `mysql` as the downstream type)](https://github.com/pingcap/docs/blob/master/config-templates/simple-tidb-binlog.yaml) -- [The simple template for the TiDB Binlog topology (with `file` as the downstream type)](https://github.com/pingcap/docs/blob/master/config-templates/simple-file-binlog.yaml) -- [The complex template for the TiDB Binlog topology](https://github.com/pingcap/docs/blob/master/config-templates/complex-tidb-binlog.yaml) - -For detailed descriptions of the configuration items in the above TiDB cluster topology file, see [Topology Configuration File for Deploying TiDB Using TiUP](/tiup/tiup-cluster-topology-reference.md). - -### Key parameters - -The key parameters in the topology configuration templates are as follows: - -- `server_configs.tidb.binlog.enable: true` - - - Enables the binlog service. - - Default value: `false`. - -- `server_configs.tidb.binlog.ignore-error: true` - - - It is recommended to enable this configuration in high availability scenarios. - - If set to `true`, when an error occurs, TiDB stops writing data into binlog, and adds `1` to the value of the `tidb_server_critical_error_total` monitoring metric. - - If set to `false`, when TiDB fails to write data into binlog, the whole TiDB service is stopped. - -- `drainer_servers.config.syncer.db-type` - - The downstream type of TiDB Binlog. Currently, `mysql`, `tidb`, `kafka`, and `file` are supported. - -- `drainer_servers.config.syncer.to` - - The downstream configuration of TiDB Binlog. Depending on different `db-type`s, you can use this configuration item to configure the connection parameters of the downstream database, the connection parameters of Kafka, and the file save path. For details, refer to [TiDB Binlog Configuration File](/tidb-binlog/tidb-binlog-configuration-file.md#syncerto). - -> **Note:** -> -> - When editing the configuration file template, if you do not need custom ports or directories, modify the IP only. -> - You do not need to manually create the `tidb` user in the configuration file. The TiUP cluster component automatically creates the `tidb` user on the target machines. You can customize the user, or keep the user consistent with the control machine. -> - If you configure the deployment directory as a relative path, the cluster will be deployed in the home directory of the user. diff --git a/tidb-binlog/bidirectional-replication-between-tidb-clusters.md b/tidb-binlog/bidirectional-replication-between-tidb-clusters.md deleted file mode 100644 index 64e5675b0efb3..0000000000000 --- a/tidb-binlog/bidirectional-replication-between-tidb-clusters.md +++ /dev/null @@ -1,131 +0,0 @@ ---- -title: Bidirectional Replication Between TiDB Clusters -summary: Learn how to perform the bidirectional replication between TiDB clusters. -aliases: ['/docs/dev/tidb-binlog/bidirectional-replication-between-tidb-clusters/','/docs/dev/reference/tidb-binlog/bidirectional-replication/'] ---- - -# Bidirectional Replication between TiDB Clusters - -> **Warning:** -> -> - Starting from v7.5.0, [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). 
-> - TiDB Binlog is not compatible with some features introduced in TiDB v5.0 and they cannot be used together. For details, see [Notes](/tidb-binlog/tidb-binlog-overview.md#notes). - -This document describes the bidirectional replication between two TiDB clusters, how the replication works, how to enable it, and how to replicate DDL operations. - -## User scenario - -If you want two TiDB clusters to exchange data changes with each other, TiDB Binlog allows you to do that. For example, you want cluster A and cluster B to replicate data with each other. - -> **Note:** -> -> The data written to these two clusters must be conflict-free, that is, in the two clusters, the same primary key or the rows with the unique index of the tables must not be modified. - -The user scenario is shown as below: - -![Architect](/media/binlog/bi-repl1.jpg) - -## Implementation details - -![Mark Table](/media/binlog/bi-repl2.png) - -If the bidirectional replication is enabled between cluster A and cluster B, the data written to cluster A will be replicated to cluster B, and then these data changes will be replicated back to cluster A, which causes an infinite loop of replication. From the figure above, you can see that during the data replication, Drainer marks the binlog events, and filters out the marked events to avoid such a replication loop. - -The detailed implementation is described as follows: - -1. Start the TiDB Binlog replication program for each of the two clusters. -2. When the transaction to be replicated passes through the Drainer of cluster A, this Drainer adds the [`_drainer_repl_mark` table](#mark-table) to the transaction, writes this DML event update to the mark table, and replicate this transaction to cluster B. -3. Cluster B returns binlog events with the `_drainer_repl_mark` mark table to cluster A. The Drainer of cluster B identifies the mark table with the DML event when parsing the binlog event, and gives up replicating this binlog event to cluster A. - -The replication process from cluster B to cluster A is the same as above. The two clusters can be upstream and downstream of each other. - -> **Note:** -> -> * When updating the `_drainer_repl_mark` mark table, data changes are required to generate binlogs. -> * DDL operations are not transactional, so you need to use the one-way replication method to replicate DDL operations. See [Replicate DDL operations](#replicate-ddl-operations) for details. - -Drainer can use a unique ID for each connection to downstream to avoid conflicts. `channel_id` is used to indicate a channel for bidirectional replication. The two clusters should have the same `channel_id` configuration (with the same value). - -If you add or delete columns in the upstream, there might be extra or missing columns of the data to be replicated to the downstream. Drainer allows this situation by ignoring the extra columns or by inserting default values to the missing columns. - -## Mark table - -The `_drainer_repl_mark` mark table has the following structure: - -{{< copyable "sql" >}} - -```sql -CREATE TABLE `_drainer_repl_mark` ( - `id` bigint(20) NOT NULL, - `channel_id` bigint(20) NOT NULL DEFAULT '0', - `val` bigint(20) DEFAULT '0', - `channel_info` varchar(64) DEFAULT NULL, - PRIMARY KEY (`id`,`channel_id`) -); -``` - -Drainer uses the following SQL statement to update `_drainer_repl_mark`, which ensures data change and the generation of binlog: - -{{< copyable "sql" >}} - -```sql -update drainer_repl_mark set val = val + 1 where id = ? 
&& channel_id = ?; -``` - -## Replicate DDL operations - -Because Drainer cannot add the mark table to DDL operations, you can only use the one-way replication method to replicate DDL operations. - -For example, if DDL replication is enabled from cluster A to cluster B, then the replication is disabled from cluster B to cluster A. This means that all DDL operations are performed on cluster A. - -> **Note:** -> -> DDL operations cannot be executed on two clusters at the same time. When a DDL operation is executed, if any DML operation is being executed at the same time or any DML binlog is being replicated, the upstream and downstream table structures of the DML replication might be inconsistent. - -## Configure and enable bidirectional replication - -For bidirectional replication between cluster A and cluster B, assume that all DDL operations are executed on cluster A. On the replication path from cluster A to cluster B, add the following configuration to Drainer: - -{{< copyable "" >}} - -```toml -[syncer] -loopback-control = true -channel-id = 1 # Configures the same ID for both clusters to be replicated. -sync-ddl = true # Enables it if you need to perform DDL replication. - -[syncer.to] -# 1 means SyncFullColumn and 2 means SyncPartialColumn. -# If set to SyncPartialColumn, Drainer allows the downstream table -# structure to have more or fewer columns than the data to be replicated -# And remove the STRICT_TRANS_TABLES of the SQL mode to allow fewer columns, and insert zero values to the downstream. -sync-mode = 2 - -# Ignores the checkpoint table. -[[syncer.ignore-table]] -db-name = "tidb_binlog" -tbl-name = "checkpoint" -``` - -On the replication path from cluster B to cluster A, add the following configuration to Drainer: - -{{< copyable "" >}} - -```toml -[syncer] -loopback-control = true -channel-id = 1 # Configures the same ID for both clusters to be replicated. -sync-ddl = false # Disables it if you do not need to perform DDL replication. - -[syncer.to] -# 1 means SyncFullColumn and 2 means SyncPartialColumn. -# If set to SyncPartialColumn, Drainer allows the downstream table -# structure to have more or fewer columns than the data to be replicated -# And remove the STRICT_TRANS_TABLES of the SQL mode to allow fewer columns, and insert zero values to the downstream. -sync-mode = 2 - -# Ignores the checkpoint table. -[[syncer.ignore-table]] -db-name = "tidb_binlog" -tbl-name = "checkpoint" -``` diff --git a/tidb-binlog/binlog-consumer-client.md b/tidb-binlog/binlog-consumer-client.md deleted file mode 100644 index e77f63314cc24..0000000000000 --- a/tidb-binlog/binlog-consumer-client.md +++ /dev/null @@ -1,148 +0,0 @@ ---- -title: Binlog Consumer Client User Guide -summary: Use Binlog Consumer Client to consume TiDB secondary binlog data from Kafka and output the data in a specific format. -aliases: ['/tidb/dev/binlog-slave-client','/docs/dev/tidb-binlog/binlog-slave-client/','/docs/dev/reference/tidb-binlog/binlog-slave-client/','/docs/dev/reference/tools/tidb-binlog/binlog-slave-client/'] ---- - -# Binlog Consumer Client User Guide - -Binlog Consumer Client is used to consume TiDB secondary binlog data from Kafka and output the data in a specific format. Currently, Drainer supports multiple kinds of down streaming, including MySQL, TiDB, file and Kafka. But sometimes users have customized requirements for outputting data to other formats, for example, Elasticsearch and Hive, so this feature is introduced. 
- -## Configure Drainer - -Modify the configuration file of Drainer and set it to output the data to Kafka: - -``` -[syncer] -db-type = "kafka" - -[syncer.to] -# the Kafka address -kafka-addrs = "127.0.0.1:9092" -# the Kafka version -kafka-version = "2.4.0" -``` - -## Customized development - -### Data format - -Firstly, you need to obtain the format information of the data which is output to Kafka by Drainer: - -``` -// `Column` stores the column data in the corresponding variable based on the data type. -message Column { - // Indicates whether the data is null - optional bool is_null = 1 [ default = false ]; - // Stores `int` data - optional int64 int64_value = 2; - // Stores `uint`, `enum`, and `set` data - optional uint64 uint64_value = 3; - // Stores `float` and `double` data - optional double double_value = 4; - // Stores `bit`, `blob`, `binary` and `json` data - optional bytes bytes_value = 5; - // Stores `date`, `time`, `decimal`, `text`, `char` data - optional string string_value = 6; -} - -// `ColumnInfo` stores the column information, including the column name, type, and whether it is the primary key. -message ColumnInfo { - optional string name = 1 [ (gogoproto.nullable) = false ]; - // the lower case column field type in MySQL - // https://dev.mysql.com/doc/refman/8.0/en/data-types.html - // for the `numeric` type: int bigint smallint tinyint float double decimal bit - // for the `string` type: text longtext mediumtext char tinytext varchar - // blob longblob mediumblob binary tinyblob varbinary - // enum set - // for the `json` type: json - optional string mysql_type = 2 [ (gogoproto.nullable) = false ]; - optional bool is_primary_key = 3 [ (gogoproto.nullable) = false ]; -} - -// `Row` stores the actual data of a row. -message Row { repeated Column columns = 1; } - -// `MutationType` indicates the DML type. -enum MutationType { - Insert = 0; - Update = 1; - Delete = 2; -} - -// `Table` contains mutations in a table. -message Table { - optional string schema_name = 1; - optional string table_name = 2; - repeated ColumnInfo column_info = 3; - repeated TableMutation mutations = 4; -} - -// `TableMutation` stores mutations of a row. -message TableMutation { - required MutationType type = 1; - // data after modification - required Row row = 2; - // data before modification. It only takes effect for `Update MutationType`. - optional Row change_row = 3; -} - -// `DMLData` stores all the mutations caused by DML in a transaction. -message DMLData { - // `tables` contains all the table changes in the transaction. - repeated Table tables = 1; -} - -// `DDLData` stores the DDL information. -message DDLData { - // the database used currently - optional string schema_name = 1; - // the relates table - optional string table_name = 2; - // `ddl_query` is the original DDL statement query. - optional bytes ddl_query = 3; -} - -// `BinlogType` indicates the binlog type, including DML and DDL. -enum BinlogType { - DML = 0; // Has `dml_data` - DDL = 1; // Has `ddl_query` -} - -// `Binlog` stores all the changes in a transaction. Kafka stores the serialized result of the structure data. 
-message Binlog { - optional BinlogType type = 1 [ (gogoproto.nullable) = false ]; - optional int64 commit_ts = 2 [ (gogoproto.nullable) = false ]; - optional DMLData dml_data = 3; - optional DDLData ddl_data = 4; -} -``` - -For the definition of the data format, see [`secondary_binlog.proto`](https://github.com/pingcap/tidb/blob/master/pkg/tidb-binlog/proto/proto/secondary_binlog.proto) - -### Driver - -The [TiDB-Tools](https://github.com/pingcap/tidb-tools/) project provides [Driver](https://github.com/pingcap/tidb/tree/master/pkg/tidb-binlog/driver), which is used to read the binlog data in Kafka. It has the following features: - -* Read the Kafka data. -* Locate the binlog stored in Kafka based on `commit ts`. - -You need to configure the following information when using Driver: - -* `KafkaAddr`: the address of the Kafka cluster -* `CommitTS`: from which `commit ts` to start reading the binlog -* `Offset`: from which Kafka `offset` to start reading data. If `CommitTS` is set, you needn't configure this parameter. -* `ClusterID`: the cluster ID of the TiDB cluster -* `Topic`: the topic name of Kafka. If Topic is empty, use the default name in Drainer `_obinlog`. - -You can use Driver by quoting the Driver code in package and refer to the example code provided by Driver to learn how to use Driver and parse the binlog data. - -Currently, two examples are provided: - -* Using Driver to replicate data to MySQL. This example shows how to convert a binlog to SQL -* Using Driver to print data - -> **Note:** -> -> - The example code only shows how to use Driver. If you want to use Driver in the production environment, you need to optimize the code. -> - Currently, only the Golang version of Driver and example code are available. If you want to use other languages, you need to generate the code file in the corresponding language based on the binlog proto file and develop an application to read the binlog data in Kafka, parse the data, and output the data to the downstream. You are also welcome to optimize the example code and submit the example code of other languages to [TiDB-Tools](https://github.com/pingcap/tidb-tools). diff --git a/tidb-binlog/binlog-control.md b/tidb-binlog/binlog-control.md deleted file mode 100644 index c0aa292fbb577..0000000000000 --- a/tidb-binlog/binlog-control.md +++ /dev/null @@ -1,111 +0,0 @@ ---- -title: binlogctl -summary: Learns how to use `binlogctl`. -aliases: ['/docs/dev/tidb-binlog/binlog-control/'] ---- - -# binlogctl - -[Binlog Control](https://github.com/pingcap/tidb-binlog/tree/master/binlogctl) (`binlogctl` for short) is a command line tool for [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) (deprecated). You can use `binlogctl` to manage TiDB Binlog clusters. - -You can use `binlogctl` to: - -* Check the state of Pump or Drainer -* Pause or close Pump or Drainer -* Handle the abnormal state of Pump or Drainer - -The following are its usage scenarios: - -* An error occurs during data replication or you need to check the running state of Pump or Drainer. -* You need to pause or close Pump or Drainer when maintaining the cluster. -* A Pump or Drainer process exits abnormally, while the node state is not updated or is unexpected. This affects the data replication task. - -## Download `binlogctl` - -`binlogctl` is included in the TiDB Toolkit. To download the TiDB Toolkit, see [Download TiDB Tools](/download-ecosystem-tools.md). 
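As a quick sanity check after downloading the TiDB Toolkit, the following sketch extracts the package and confirms that the `binlogctl` binary runs; the package file name, version, and directory layout are assumptions, so adjust them to the release you actually download.

```bash
# Assumed toolkit package name and version; adjust to the release you downloaded.
tar -xzf tidb-community-toolkit-v8.3.0-linux-amd64.tar.gz
cd tidb-community-toolkit-v8.3.0-linux-amd64
./binlogctl -V    # -V prints the version and exits (see the usage listing below)
```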
- -## Descriptions - -Command line parameters: - -``` -Usage of binlogctl: - -V prints version and exit - -cmd string - operator: "generate_meta", "pumps", "drainers", "update-pump", "update-drainer", "pause-pump", "pause-drainer", "offline-pump", "offline-drainer", "encrypt" (default "pumps") - -data-dir string - meta directory path (default "binlog_position") - -node-id string - id of node, used to update some nodes with operations update-pump, update-drainer, pause-pump, pause-drainer, offline-pump and offline-drainer - -pd-urls string - a comma separated list of PD endpoints (default "http://127.0.0.1:2379") - -show-offline-nodes - include offline nodes when querying pumps/drainers - -ssl-ca string - Path of file that contains list of trusted SSL CAs for connection with cluster components. - -ssl-cert string - Path of file that contains X509 certificate in PEM format for connection with cluster components. - -ssl-key string - Path of file that contains X509 key in PEM format for connection with cluster components. - -state string - set node's state, can be set to online, pausing, paused, closing or offline. - -text string - text to be encrypted when using encrypt command - -time-zone Asia/Shanghai - set time zone if you want to save time info in savepoint file; for example, Asia/Shanghai for CST time, `Local` for local time -``` - -Command examples: - -- Check the state of all the Pump or Drainer nodes. - - Set `cmd` to `pumps` or `drainers`. For example: - - {{< copyable "shell-regular" >}} - - ```bash - bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pumps - ``` - - ``` - [2019/04/28 09:29:59.016 +00:00] [INFO] [nodes.go:48] ["query node"] [type=pump] [node="{NodeID: 1.1.1.1:8250, Addr: pump:8250, State: online, MaxCommitTS: 408012403141509121, UpdateTime: 2019-04-28 09:29:57 +0000 UTC}"] - ``` - - {{< copyable "shell-regular" >}} - - ```bash - bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd drainers - ``` - - ``` - [2019/04/28 09:29:59.016 +00:00] [INFO] [nodes.go:48] ["query node"] [type=drainer] [node="{NodeID: 1.1.1.1:8249, Addr: 1.1.1.1:8249, State: online, MaxCommitTS: 408012403141509121, UpdateTime: 2019-04-28 09:29:57 +0000 UTC}"] - ``` - -- Pause or close Pump or Drainer. - - You can use the following commands to pause or close services: - - | Command | Description | Example | - | :--------------- | :------------- | :------------------------------------------------------------------------------------------------| - | pause-pump | Pause Pump | `bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pause-pump -node-id ip-127-0-0-1:8250` | - | pause-drainer | Pause Drainer | `bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pause-drainer -node-id ip-127-0-0-1:8249` | - | offline-pump | Close Pump | `bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-pump -node-id ip-127-0-0-1:8250` | - | offline-drainer | Close Drainer | `bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd offline-drainer -node-id ip-127-0-0-1:8249` | - - `binlogctl` sends the HTTP request to the Pump or Drainer node. After receiving the request, the node executes the exiting procedures accordingly. - -- Modify the state of a Pump or Drainer node in abnormal states. - - When a Pump or Drainer node runs normally or when it is paused or closed in the normal process, it is in the normal state. In abnormal states, the Pump or Drainer node cannot correctly maintain its state. This affects data replication tasks. In this case, use `binlogctl` to repair the state information. 
- - To update the state of a Pump or Drainer node, set `cmd` to `update-pump` or `update-drainer`. The state can be `paused` or `offline`. For example: - - {{< copyable "shell-regular" >}} - - ```bash - bin/binlogctl -pd-urls=http://127.0.0.1:2379 -cmd update-pump -node-id ip-127-0-0-1:8250 -state paused - ``` - - > **Note:** - > - > When a Pump or Drainer node runs normally, it regularly updates its state to PD. The above command directly modifies the Pump or Drainer state saved in PD; therefore, do not use the command when the Pump or Drainer node runs normally. For more information, refer to [TiDB Binlog FAQ](/tidb-binlog/tidb-binlog-faq.md). diff --git a/tidb-binlog/deploy-tidb-binlog.md b/tidb-binlog/deploy-tidb-binlog.md deleted file mode 100644 index cfc17ee160f50..0000000000000 --- a/tidb-binlog/deploy-tidb-binlog.md +++ /dev/null @@ -1,391 +0,0 @@ ---- -title: TiDB Binlog Cluster Deployment -summary: Learn how to deploy TiDB Binlog cluster. -aliases: ['/docs/dev/tidb-binlog/deploy-tidb-binlog/','/docs/dev/reference/tidb-binlog/deploy/','/docs/dev/how-to/deploy/tidb-binlog/'] ---- - -# TiDB Binlog Cluster Deployment - -This document describes how to [deploy TiDB Binlog using a Binary package](#deploy-tidb-binlog-using-a-binary-package). - -> **Warning:** -> -> Starting from TiDB v8.3.0, TiDB Binlog is deprecated, and is planned to be removed in a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). - -## Hardware requirements - -Pump and Drainer are deployed and operate on 64-bit universal hardware server platforms with Intel x86-64 architecture. - -In environments of development, testing and production, the requirements on server hardware are as follows: - -| Service | The Number of Servers | CPU | Disk | Memory | -| :-------- | :-------- | :-------- | :--------------- | :------ | -| Pump | 3 | 8 core+ | SSD, 200 GB+ | 16G | -| Drainer | 1 | 8 core+ | SAS, 100 GB+ (If binlogs are output as local files, the disk size depends on how long these files are retained.) | 16G | - -## Deploy TiDB Binlog using TiUP - -It is recommended to deploy TiDB Binlog using TiUP. To do that, when deploying TiDB using TiUP, you need to add the node information of `drainer` and `pump` of TiDB Binlog in [TiDB Binlog Deployment Topology](/tidb-binlog-deployment-topology.md). For detailed deployment information, refer to [Deploy a TiDB Cluster Using TiUP](/production-deployment-using-tiup.md). - -## Deploy TiDB Binlog using a binary package - -### Download the official binary package - -The binary package of TiDB Binlog is included in the TiDB Toolkit. To download the TiDB Toolkit, see [Download TiDB Tools](/download-ecosystem-tools.md). - -### The usage example - -Assuming that you have three PD nodes, one TiDB node, two Pump nodes, and one Drainer node, the information of each node is as follows: - -| Node | IP | -| :---------|:------------ | -| TiDB | 192.168.0.10 | -| PD1 | 192.168.0.16 | -| PD2 | 192.168.0.15 | -| PD3 | 192.168.0.14 | -| Pump | 192.168.0.11 | -| Pump | 192.168.0.12 | -| Drainer | 192.168.0.13 | - -The following part shows how to use Pump and Drainer based on the nodes above. - -1. Deploy Pump using the binary. 
- - - To view the command line parameters of Pump, execute `./pump -help`: - - ```bash - Usage of Pump: - -L string - the output information level of logs: debug, info, warn, error, fatal ("info" by default) - -V - the print version information - -addr string - the RPC address through which Pump provides the service (-addr="192.168.0.11:8250") - -advertise-addr string - the RPC address through which Pump provides the external service (-advertise-addr="192.168.0.11:8250") - -config string - the path of the configuration file. If you specify the configuration file, Pump reads the configuration in the configuration file first. If the corresponding configuration also exits in the command line parameters, Pump uses the configuration of the command line parameters to cover that of the configuration file. - -data-dir string - the path where the Pump data is stored - -gc int - the number of days to retain the data in Pump ("7" by default) - -heartbeat-interval int - the interval of the heartbeats Pump sends to PD (in seconds) - -log-file string - the file path of logs - -log-rotate string - the switch frequency of logs (hour/day) - -metrics-addr string - the Prometheus Pushgateway address. If not set, it is forbidden to report the monitoring metrics. - -metrics-interval int - the report frequency of the monitoring metrics ("15" by default, in seconds) - -node-id string - the unique ID of a Pump node. If you do not specify this ID, the system automatically generates an ID based on the host name and listening port. - -pd-urls string - the address of the PD cluster nodes (-pd-urls="http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379") - -fake-binlog-interval int - the frequency at which a Pump node generates fake binlog ("3" by default, in seconds) - ``` - - - Taking deploying Pump on "192.168.0.11" as an example, the Pump configuration file is as follows: - - ```toml - # Pump Configuration - - # the bound address of Pump - addr = "192.168.0.11:8250" - - # the address through which Pump provides the service - advertise-addr = "192.168.0.11:8250" - - # the number of days to retain the data in Pump ("7" by default) - gc = 7 - - # the directory where the Pump data is stored - data-dir = "data.pump" - - # the interval of the heartbeats Pump sends to PD (in seconds) - heartbeat-interval = 2 - - # the address of the PD cluster nodes (each separated by a comma with no whitespace) - pd-urls = "http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379" - - # [security] - # This section is generally commented out if no special security settings are required. - # The file path containing a list of trusted SSL CAs connected to the cluster. - # ssl-ca = "/path/to/ca.pem" - # The path to the X509 certificate in PEM format that is connected to the cluster. - # ssl-cert = "/path/to/drainer.pem" - # The path to the X509 key in PEM format that is connected to the cluster. - # ssl-key = "/path/to/drainer-key.pem" - - # [storage] - # Set to true (by default) to guarantee reliability by ensuring binlog data is flushed to the disk - # sync-log = true - - # When the available disk capacity is less than the set value, Pump stops writing data. - # 42 MB -> 42000000, 42 mib -> 44040192 - # default: 10 gib - # stop-write-at-available-space = "10 gib" - # The LSM DB settings embedded in Pump. Unless you know this part well, it is usually commented out. 
- # [storage.kv] - # block-cache-capacity = 8388608 - # block-restart-interval = 16 - # block-size = 4096 - # compaction-L0-trigger = 8 - # compaction-table-size = 67108864 - # compaction-total-size = 536870912 - # compaction-total-size-multiplier = 8.0 - # write-buffer = 67108864 - # write-L0-pause-trigger = 24 - # write-L0-slowdown-trigger = 17 - ``` - - - The example of starting Pump: - - {{< copyable "shell-regular" >}} - - ```bash - ./pump -config pump.toml - ``` - - If the command line parameters is the same with the configuration file parameters, the values of command line parameters are used. - -2. Deploy Drainer using binary. - - - To view the command line parameters of Drainer, execute `./drainer -help`: - - ```bash - Usage of Drainer: - -L string - the output information level of logs: debug, info, warn, error, fatal ("info" by default) - -V - the print version information - -addr string - the address through which Drainer provides the service (-addr="192.168.0.13:8249") - -c int - the number of the concurrency of the downstream for replication. The bigger the value, the better throughput performance of the concurrency ("1" by default). - -cache-binlog-count int - the limit on the number of binlog items in the cache ("8" by default) - If a large single binlog item in the upstream causes OOM in Drainer, try to lower the value of this parameter to reduce memory usage. - -config string - the directory of the configuration file. Drainer reads the configuration file first. - If the corresponding configuration exists in the command line parameters, Drainer uses the configuration of the command line parameters to cover that of the configuration file. - -data-dir string - the directory where the Drainer data is stored ("data.drainer" by default) - -dest-db-type string - the downstream service type of Drainer - The value can be "mysql", "tidb", "kafka", and "file". ("mysql" by default) - -detect-interval int - the interval of checking the online Pump in PD ("10" by default, in seconds) - -disable-detect - whether to disable the conflict monitoring - -disable-dispatch - whether to disable the SQL feature of splitting a single binlog file. If it is set to "true", each binlog file is restored to a single transaction for replication based on the order of binlogs. - It is set to "False", when the downstream is MySQL. - -ignore-schemas string - the db filter list ("INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test" by default) - It does not support the Rename DDL operation on tables of `ignore schemas`. - -initial-commit-ts - If Drainer does not have the related breakpoint information, you can configure the related breakpoint information using this parameter. ("-1" by default) - If the value of this parameter is `-1`, Drainer automatically obtains the latest timestamp from PD. - -log-file string - the path of the log file - -log-rotate string - the switch frequency of log files, hour/day - -metrics-addr string - the Prometheus Pushgateway address - It it is not set, the monitoring metrics are not reported. - -metrics-interval int - the report frequency of the monitoring metrics ("15" by default, in seconds) - -node-id string - the unique ID of a Drainer node. If you do not specify this ID, the system automatically generates an ID based on the host name and listening port. 
- -pd-urls string - the address of the PD cluster nodes (-pd-urls="http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379") - -safe-mode - Whether to enable safe mode so that data can be written into the downstream MySQL/TiDB repeatedly. - This mode replaces the `INSERT` statement with the `REPLACE` statement and splits the `UPDATE` statement into `DELETE` plus `REPLACE`. - -txn-batch int - the number of SQL statements of a transaction which are output to the downstream database ("1" by default) - ``` - - - Taking deploying Drainer on "192.168.0.13" as an example, the Drainer configuration file is as follows: - - ```toml - # Drainer Configuration. - - # the address through which Drainer provides the service ("192.168.0.13:8249") - addr = "192.168.0.13:8249" - - # the address through which Drainer provides the external service - advertise-addr = "192.168.0.13:8249" - - # the interval of checking the online Pump in PD ("10" by default, in seconds) - detect-interval = 10 - - # the directory where the Drainer data is stored "data.drainer" by default) - data-dir = "data.drainer" - - # the address of the PD cluster nodes (each separated by a comma with no whitespace) - pd-urls = "http://192.168.0.16:2379,http://192.168.0.15:2379,http://192.168.0.14:2379" - - # the directory of the log file - log-file = "drainer.log" - - # Drainer compresses the data when it gets the binlog from Pump. The value can be "gzip". If it is not configured, it will not be compressed - # compressor = "gzip" - - # [security] - # This section is generally commented out if no special security settings are required. - # The file path containing a list of trusted SSL CAs connected to the cluster. - # ssl-ca = "/path/to/ca.pem" - # The path to the X509 certificate in PEM format that is connected to the cluster. - # ssl-cert = "/path/to/pump.pem" - # The path to the X509 key in PEM format that is connected to the cluster. - # ssl-key = "/path/to/pump-key.pem" - - # Syncer Configuration - [syncer] - # If the item is set, the sql-mode will be used to parse the DDL statement. - # If the downstream database is MySQL or TiDB, then the downstream sql-mode - # is also set to this value. - # sql-mode = "STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION" - - # the number of SQL statements of a transaction that are output to the downstream database ("20" by default) - txn-batch = 20 - - # the number of the concurrency of the downstream for replication. The bigger the value, - # the better throughput performance of the concurrency ("16" by default) - worker-count = 16 - - # whether to disable the SQL feature of splitting a single binlog file. If it is set to "true", - # each binlog file is restored to a single transaction for replication based on the order of binlogs. - # If the downstream service is MySQL, set it to "False". - disable-dispatch = false - - # In safe mode, data can be written into the downstream MySQL/TiDB repeatedly. - # This mode replaces the `INSERT` statement with the `REPLACE` statement and replaces the `UPDATE` statement with `DELETE` plus `REPLACE` statements. - safe-mode = false - - # the downstream service type of Drainer ("mysql" by default) - # Valid value: "mysql", "tidb", "file", and "kafka". - db-type = "mysql" - - # If `commit ts` of the transaction is in the list, the transaction is filtered and not replicated to the downstream. 
- ignore-txn-commit-ts = [] - - # the db filter list ("INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql,test" by default) - # Does not support the Rename DDL operation on tables of `ignore schemas`. - ignore-schemas = "INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql" - - # `replicate-do-db` has priority over `replicate-do-table`. When they have the same `db` name, - # regular expressions are supported for configuration. - # The regular expression should start with "~". - - # replicate-do-db = ["~^b.*","s1"] - - # [syncer.relay] - # It saves the directory of the relay log. The relay log is not enabled if the value is empty. - # The configuration only comes to effect if the downstream is TiDB or MySQL. - # log-dir = "" - # the maximum size of each file - # max-file-size = 10485760 - - # [[syncer.replicate-do-table]] - # db-name ="test" - # tbl-name = "log" - - # [[syncer.replicate-do-table]] - # db-name ="test" - # tbl-name = "~^a.*" - - # Ignore the replication of some tables - # [[syncer.ignore-table]] - # db-name = "test" - # tbl-name = "log" - - # the server parameters of the downstream database when `db-type` is set to "mysql" - [syncer.to] - host = "192.168.0.13" - user = "root" - # If you do not want to set a cleartext `password` in the configuration file, you can create `encrypted_password` using `./binlogctl -cmd encrypt -text string`. - # When you have created an `encrypted_password` that is not empty, the `password` above will be ignored, because `encrypted_password` and `password` cannot take effect at the same time. - password = "" - encrypted_password = "" - port = 3306 - - [syncer.to.checkpoint] - # When the checkpoint type is "mysql" or "tidb", this option can be - # enabled to change the database that saves the checkpoint - # schema = "tidb_binlog" - # Currently only the "mysql" and "tidb" checkpoint types are supported - # You can remove the comment tag to control where to save the checkpoint - # The default method of saving the checkpoint for the downstream db-type: - # mysql/tidb -> in the downstream MySQL or TiDB database - # file/kafka -> file in `data-dir` - # type = "mysql" - # host = "127.0.0.1" - # user = "root" - # password = "" - # `encrypted_password` is encrypted using `./binlogctl -cmd encrypt -text string`. - # When `encrypted_password` is not empty, the `password` above will be ignored. - # encrypted_password = "" - # port = 3306 - - # the directory where the binlog file is stored when `db-type` is set to `file` - # [syncer.to] - # dir = "data.drainer" - - # the Kafka configuration when `db-type` is set to "kafka" - # [syncer.to] - # only one of kafka-addrs and zookeeper-addrs is needed. If both are present, the program gives priority - # to the kafka address in zookeeper - # zookeeper-addrs = "127.0.0.1:2181" - # kafka-addrs = "127.0.0.1:9092" - # kafka-version = "0.8.2.0" - # The maximum number of messages (number of binlogs) in a broker request. If it is left blank or a value smaller than 0 is configured, the default value 1024 is used. - # kafka-max-messages = 1024 - # The maximum size of a broker request (unit: byte). The default value is 1 GiB and the maximum value is 2 GiB. - # kafka-max-message-size = 1073741824 - - # the topic name of the Kafka cluster that saves the binlog data. The default value is _obinlog. - # To run multiple Drainers to replicate data to the same Kafka cluster, you need to set different `topic-name`s for each Drainer. 
- # topic-name = "" - ``` - - - Starting Drainer: - - > **Note:** - > - > If the downstream is MySQL/TiDB, to guarantee the data integrity, you need to obtain the `initial-commit-ts` value and make a full backup of the data and restore the data before the initial start of Drainer. - - When Drainer is started for the first time, use the `initial-commit-ts` parameter. - - {{< copyable "shell-regular" >}} - - ```bash - ./drainer -config drainer.toml -initial-commit-ts {initial-commit-ts} - ``` - - If the command line parameter and the configuration file parameter are the same, the parameter value in the command line is used. - -3. Starting TiDB server: - - - After starting Pump and Drainer, start TiDB server with binlog enabled by adding this section to your config file for TiDB server: - - ``` - [binlog] - enable=true - ``` - - - TiDB server will obtain the addresses of registered Pumps from PD and will stream data to all of them. If there are no registered Pump instances, TiDB server will refuse to start or will block starting until a Pump instance comes online. - -> **Note:** -> -> - When TiDB is running, you need to guarantee that at least one Pump is running normally. -> - To enable the TiDB Binlog service in TiDB server, use the `-enable-binlog` startup parameter in TiDB, or add enable=true to the [binlog] section of the TiDB server configuration file. -> - Make sure that the TiDB Binlog service is enabled in all TiDB instances in a same cluster, otherwise upstream and downstream data inconsistency might occur during data replication. If you want to temporarily run a TiDB instance where the TiDB Binlog service is not enabled, set `run_ddl=false` in the TiDB configuration file. -> - Drainer does not support the `rename` DDL operation on the table of `ignore schemas` (the schemas in the filter list). -> - If you want to start Drainer in an existing TiDB cluster, generally you need to make a full backup of the cluster data, obtain **snapshot timestamp**, import the data to the target database, and then start Drainer to replicate the incremental data from the corresponding **snapshot timestamp**. -> - When the downstream database is TiDB or MySQL, ensure that the `sql_mode` in the upstream and downstream databases are consistent. In other words, the `sql_mode` should be the same when each SQL statement is executed in the upstream and replicated to the downstream. You can execute the `select @@sql_mode;` statement in the upstream and downstream respectively to compare `sql_mode`. -> - When a DDL statement is supported in the upstream but incompatible with the downstream, Drainer fails to replicate data. An example is to replicate the `CREATE TABLE t1(a INT) ROW_FORMAT=FIXED;` statement when the downstream database MySQL uses the InnoDB engine. In this case, you can configure [skipping transactions](/tidb-binlog/tidb-binlog-faq.md#what-can-i-do-when-some-ddl-statements-supported-by-the-upstream-database-cause-error-when-executed-in-the-downstream-database) in Drainer, and manually execute compatible statements in the downstream database. diff --git a/tidb-binlog/get-started-with-tidb-binlog.md b/tidb-binlog/get-started-with-tidb-binlog.md deleted file mode 100644 index 2a3c84faac804..0000000000000 --- a/tidb-binlog/get-started-with-tidb-binlog.md +++ /dev/null @@ -1,420 +0,0 @@ ---- -title: TiDB Binlog Tutorial -summary: Learn to deploy TiDB Binlog with a simple TiDB cluster. 
-aliases: ['/docs/dev/get-started-with-tidb-binlog/','/docs/dev/how-to/get-started/tidb-binlog/'] ---- - -# TiDB Binlog Tutorial - -This tutorial starts with a simple [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) (deprecated) deployment with a single node of each component (Placement Driver, TiKV Server, TiDB Server, Pump, and Drainer), set up to push data into a MariaDB Server instance. - -This tutorial is targeted toward users who have some familiarity with the [TiDB Architecture](/tidb-architecture.md), who may have already set up a TiDB cluster (not mandatory), and who wants to gain hands-on experience with TiDB Binlog. This tutorial is a good way to "kick the tires" of TiDB Binlog and to familiarize yourself with the concepts of its architecture. - -> **Warning:** -> -> The instructions to deploy TiDB in this tutorial should **not** be used to deploy TiDB in a production or development setting. - -This tutorial assumes you're using a modern Linux distribution on x86-64. A minimal CentOS 7 installation running in VMware is used in this tutorial for the examples. It's recommended that you start from a clean install, so that you aren't impacted by quirks of your existing environment. If you don't want to use local virtualization, you can easily start a CentOS 7 VM using your cloud service. - -## TiDB Binlog Overview - -TiDB Binlog is a solution to collect binary log data from TiDB and provide real-time data backup and replication. It pushes incremental data updates from a TiDB Server cluster into downstream platforms. - -You can use TiDB Binlog for incremental backups, to replicate data from one TiDB cluster to another, or to send TiDB updates through Kafka to a downstream platform of your choice. - -TiDB Binlog is particularly useful when you migrate data from MySQL or MariaDB to TiDB, in which case you may use the TiDB DM (Data Migration) platform to get data from a MySQL/MariaDB cluster into TiDB, and then use TiDB Binlog to keep a separate, downstream MySQL/MariaDB instance/cluster in sync with your TiDB cluster. TiDB Binlog enables application traffic to TiDB to be pushed to a downstream MySQL or MariaDB instance/cluster, which reduces the risk of a migration to TiDB because you can easily revert the application to MySQL or MariaDB without downtime or data loss. - -See [TiDB Binlog Cluster User Guide](/tidb-binlog/tidb-binlog-overview.md) for more information. - -## Architecture - -TiDB Binlog comprises two components: the **Pump** and the **Drainer**. Several Pump nodes make up a pump cluster. Each Pump node connects to TiDB Server instances and receives updates made to each of the TiDB Server instances in a cluster. A Drainer connects to the Pump cluster and transforms the received updates into the correct format for a particular downstream destination, for example, Kafka, another TiDB Cluster or a MySQL/MariaDB server. - -![TiDB-Binlog architecture](/media/tidb-binlog-cluster-architecture.png) - -The clustered architecture of Pump ensures that updates won't be lost as new TiDB Server instances join or leave the TiDB Cluster or Pump nodes join or leave the Pump cluster. - -## Installation - -We're using MariaDB Server in this case instead of MySQL Server because RHEL/CentOS 7 includes MariaDB Server in their default package repositories. We'll need the client as well as the server for later use. 
Let's install them now: - -```bash -sudo yum install -y mariadb-server -``` - -```bash -curl -L https://download.pingcap.org/tidb-community-server-v8.3.0-linux-amd64.tar.gz | tar xzf - -cd tidb-latest-linux-amd64 -``` - -Expected output: - -``` -[kolbe@localhost ~]$ curl -LO https://download.pingcap.org/tidb-latest-linux-amd64.tar.gz | tar xzf - - % Total % Received % Xferd Average Speed Time Time Time Current - Dload Upload Total Spent Left Speed -100 368M 100 368M 0 0 8394k 0 0:00:44 0:00:44 --:--:-- 11.1M -[kolbe@localhost ~]$ cd tidb-latest-linux-amd64 -[kolbe@localhost tidb-latest-linux-amd64]$ -``` - -## Configuration - -Now we'll start a simple TiDB cluster, with a single instance for each of `pd-server`, `tikv-server`, and `tidb-server`. - -Populate the config files using: - -```bash -printf > pd.toml %s\\n 'log-file="pd.log"' 'data-dir="pd.data"' -printf > tikv.toml %s\\n 'log-file="tikv.log"' '[storage]' 'data-dir="tikv.data"' '[pd]' 'endpoints=["127.0.0.1:2379"]' '[rocksdb]' max-open-files=1024 '[raftdb]' max-open-files=1024 -printf > pump.toml %s\\n 'log-file="pump.log"' 'data-dir="pump.data"' 'addr="127.0.0.1:8250"' 'advertise-addr="127.0.0.1:8250"' 'pd-urls="http://127.0.0.1:2379"' -printf > tidb.toml %s\\n 'store="tikv"' 'path="127.0.0.1:2379"' '[log.file]' 'filename="tidb.log"' '[binlog]' 'enable=true' -printf > drainer.toml %s\\n 'log-file="drainer.log"' '[syncer]' 'db-type="mysql"' '[syncer.to]' 'host="127.0.0.1"' 'user="root"' 'password=""' 'port=3306' -``` - -Use the following commands to see the configuration details: - -```bash -for f in *.toml; do echo "$f:"; cat "$f"; echo; done -``` - -Expected output: - -``` -drainer.toml: -log-file="drainer.log" -[syncer] -db-type="mysql" -[syncer.to] -host="127.0.0.1" -user="root" -password="" -port=3306 - -pd.toml: -log-file="pd.log" -data-dir="pd.data" - -pump.toml: -log-file="pump.log" -data-dir="pump.data" -addr="127.0.0.1:8250" -advertise-addr="127.0.0.1:8250" -pd-urls="http://127.0.0.1:2379" - -tidb.toml: -store="tikv" -path="127.0.0.1:2379" -[log.file] -filename="tidb.log" -[binlog] -enable=true - -tikv.toml: -log-file="tikv.log" -[storage] -data-dir="tikv.data" -[pd] -endpoints=["127.0.0.1:2379"] -[rocksdb] -max-open-files=1024 -[raftdb] -max-open-files=1024 -``` - -## Bootstrapping - -Now we can start each component. This is best done in a specific order - firstly the Placement Driver (PD), then TiKV Server, then Pump (because TiDB must connect to the Pump service to send the binary log), and finally the TiDB Server. 
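-
-If you would rather not rely only on a fixed `sleep` between components, you can optionally confirm that PD is accepting connections before launching the rest of the stack. The following check is an illustration only and is not required by this tutorial: it assumes PD listens on its default client port `2379` and exposes the `/pd/api/v1/members` HTTP endpoint. You could run it right after starting `pd-server` in the next step.
-
-```bash
-# Optional helper: poll PD's HTTP API until it answers, or give up after ~30 seconds.
-# Call it after starting pd-server and before starting tikv-server.
-wait_for_pd() {
-    for _ in $(seq 1 30); do
-        if curl -sf http://127.0.0.1:2379/pd/api/v1/members >/dev/null; then
-            echo "PD is ready"
-            return 0
-        fi
-        sleep 1
-    done
-    echo "PD did not become ready in time" >&2
-    return 1
-}
-```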
- -Start all the services using: - -```bash -./bin/pd-server --config=pd.toml &>pd.out & -./bin/tikv-server --config=tikv.toml &>tikv.out & -./pump --config=pump.toml &>pump.out & -sleep 3 -./bin/tidb-server --config=tidb.toml &>tidb.out & -``` - -Expected output: - -``` -[kolbe@localhost tidb-latest-linux-amd64]$ ./bin/pd-server --config=pd.toml &>pd.out & -[1] 20935 -[kolbe@localhost tidb-latest-linux-amd64]$ ./bin/tikv-server --config=tikv.toml &>tikv.out & -[2] 20944 -[kolbe@localhost tidb-latest-linux-amd64]$ ./pump --config=pump.toml &>pump.out & -[3] 21050 -[kolbe@localhost tidb-latest-linux-amd64]$ sleep 3 -[kolbe@localhost tidb-latest-linux-amd64]$ ./bin/tidb-server --config=tidb.toml &>tidb.out & -[4] 21058 -``` - -If you execute `jobs`, you should see a list of running daemons: - -``` -[kolbe@localhost tidb-latest-linux-amd64]$ jobs -[1] Running ./bin/pd-server --config=pd.toml &>pd.out & -[2] Running ./bin/tikv-server --config=tikv.toml &>tikv.out & -[3]- Running ./pump --config=pump.toml &>pump.out & -[4]+ Running ./bin/tidb-server --config=tidb.toml &>tidb.out & -``` - -If one of the services has failed to start (if you see "`Exit 1`" instead of "`Running`", for example), try to restart that individual service. - -## Connecting - -You should have all 4 components of our TiDB Cluster running now, and you can now connect to the TiDB Server on port 4000 using the MariaDB/MySQL command-line client: - -```bash -mysql -h 127.0.0.1 -P 4000 -u root -e 'select tidb_version()\G' -``` - -Expected output: - -``` -[kolbe@localhost tidb-latest-linux-amd64]$ mysql -h 127.0.0.1 -P 4000 -u root -e 'select tidb_version()\G' -*************************** 1. row *************************** -tidb_version(): Release Version: v3.0.0-beta.1-154-gd5afff70c -Git Commit Hash: d5afff70cdd825d5fab125c8e52e686cc5fb9a6e -Git Branch: master -UTC Build Time: 2019-04-24 03:10:00 -GoVersion: go version go1.12 linux/amd64 -Race Enabled: false -TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e -Check Table Before Drop: false -``` - -At this point we have a TiDB Cluster running, and we have `pump` reading binary logs from the cluster and storing them as relay logs in its data directory. The next step is to start a MariaDB server that `drainer` can write to. - -Start `drainer` using: - -```bash -sudo systemctl start mariadb -./drainer --config=drainer.toml &>drainer.out & -``` - -If you are using an operating system that makes it easier to install MySQL server, that's also OK. Just make sure it's listening on port 3306 and that you can either connect to it as user "root" with an empty password, or adjust drainer.toml as necessary. - -```bash -mysql -h 127.0.0.1 -P 3306 -u root -``` - -```sql -show databases; -``` - -Expected output: - -``` -[kolbe@localhost ~]$ mysql -h 127.0.0.1 -P 3306 -u root -Welcome to the MariaDB monitor. Commands end with ; or \g. -Your MariaDB connection id is 20 -Server version: 5.5.60-MariaDB MariaDB Server - -Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others. - -Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. 
- -MariaDB [(none)]> show databases; -+--------------------+ -| Database | -+--------------------+ -| information_schema | -| mysql | -| performance_schema | -| test | -| tidb_binlog | -+--------------------+ -5 rows in set (0.01 sec) -``` - -Here we can already see the `tidb_binlog` database, which contains the `checkpoint` table used by `drainer` to record up to what point binary logs from the TiDB cluster have been applied. - -```sql -MariaDB [tidb_binlog]> use tidb_binlog; -Database changed -MariaDB [tidb_binlog]> select * from checkpoint; -+---------------------+---------------------------------------------+ -| clusterID | checkPoint | -+---------------------+---------------------------------------------+ -| 6678715361817107733 | {"commitTS":407637466476445697,"ts-map":{}} | -+---------------------+---------------------------------------------+ -1 row in set (0.00 sec) -``` - -Now, let's open another client connection to the TiDB server, so that we can create a table and insert some rows into it. (It's recommended that you do this under a GNU screen so you can keep multiple clients open at the same time.) - -```bash -mysql -h 127.0.0.1 -P 4000 --prompt='TiDB [\d]> ' -u root -``` - -```sql -create database tidbtest; -use tidbtest; -create table t1 (id int unsigned not null AUTO_INCREMENT primary key); -insert into t1 () values (),(),(),(),(); -select * from t1; -``` - -Expected output: - -``` -TiDB [(none)]> create database tidbtest; -Query OK, 0 rows affected (0.12 sec) - -TiDB [(none)]> use tidbtest; -Database changed -TiDB [tidbtest]> create table t1 (id int unsigned not null AUTO_INCREMENT primary key); -Query OK, 0 rows affected (0.11 sec) - -TiDB [tidbtest]> insert into t1 () values (),(),(),(),(); -Query OK, 5 rows affected (0.01 sec) -Records: 5 Duplicates: 0 Warnings: 0 - -TiDB [tidbtest]> select * from t1; -+----+ -| id | -+----+ -| 1 | -| 2 | -| 3 | -| 4 | -| 5 | -+----+ -5 rows in set (0.00 sec) -``` - -Switching back to the MariaDB client, we should find the new database, new table, and the newly inserted rows: - -```sql -use tidbtest; -show tables; -select * from t1; -``` - -Expected output: - -``` -MariaDB [(none)]> use tidbtest; -Reading table information for completion of table and column names -You can turn off this feature to get a quicker startup with -A - -Database changed -MariaDB [tidbtest]> show tables; -+--------------------+ -| Tables_in_tidbtest | -+--------------------+ -| t1 | -+--------------------+ -1 row in set (0.00 sec) - -MariaDB [tidbtest]> select * from t1; -+----+ -| id | -+----+ -| 1 | -| 2 | -| 3 | -| 4 | -| 5 | -+----+ -5 rows in set (0.00 sec) -``` - -You should see the same rows that you inserted into TiDB when querying the MariaDB server. Congratulations! You've just set up TiDB Binlog! - -## binlogctl - -Information about Pumps and Drainers that have joined the cluster is stored in PD. You can use the binlogctl tool query and manipulate information about their states. See [binlogctl guide](/tidb-binlog/binlog-control.md) for more information. 
- -Use `binlogctl` to get a view of the current status of Pumps and Drainers in the cluster: - -```bash -./binlogctl -cmd drainers -./binlogctl -cmd pumps -``` - -Expected output: - -``` -[kolbe@localhost tidb-latest-linux-amd64]$ ./binlogctl -cmd drainers -[2019/04/11 17:44:10.861 -04:00] [INFO] [nodes.go:47] ["query node"] [type=drainer] [node="{NodeID: localhost.localdomain:8249, Addr: 192.168.236.128:8249, State: online, MaxCommitTS: 407638907719778305, UpdateTime: 2019-04-11 17:44:10 -0400 EDT}"] - -[kolbe@localhost tidb-latest-linux-amd64]$ ./binlogctl -cmd pumps -[2019/04/11 17:44:13.904 -04:00] [INFO] [nodes.go:47] ["query node"] [type=pump] [node="{NodeID: localhost.localdomain:8250, Addr: 192.168.236.128:8250, State: online, MaxCommitTS: 407638914024079361, UpdateTime: 2019-04-11 17:44:13 -0400 EDT}"] -``` - -If you kill a Drainer, the cluster puts it in the "paused" state, which means that the cluster expects it to rejoin: - -```bash -pkill drainer -./binlogctl -cmd drainers -``` - -Expected output: - -``` -[kolbe@localhost tidb-latest-linux-amd64]$ pkill drainer -[kolbe@localhost tidb-latest-linux-amd64]$ ./binlogctl -cmd drainers -[2019/04/11 17:44:22.640 -04:00] [INFO] [nodes.go:47] ["query node"] [type=drainer] [node="{NodeID: localhost.localdomain:8249, Addr: 192.168.236.128:8249, State: paused, MaxCommitTS: 407638915597467649, UpdateTime: 2019-04-11 17:44:18 -0400 EDT}"] -``` - -You can use "NodeIDs" with `binlogctl` to control individual nodes. In this case, the NodeID of the drainer is "localhost.localdomain:8249" and the NodeID of the Pump is "localhost.localdomain:8250". - -The main use of `binlogctl` in this tutorial is likely to be in the event of a cluster restart. If you end all processes in the TiDB cluster and try to restart them (not including the downstream MySQL/MariaDB server or Drainer), Pump will refuse to start because it cannot contact Drainer and believe that Drainer is still "online". - -There are 3 solutions to this issue: - -- Stop Drainer using `binlogctl` instead of killing the process: - - ``` - ./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=drainers - ./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=offline-drainer --node-id=localhost.localdomain:8249 - ``` - -- Start Drainer _before_ starting Pump. -- Use `binlogctl` after starting PD (but before starting Drainer and Pump) to update the state of the paused Drainer: - - ``` - ./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=update-drainer --node-id=localhost.localdomain:8249 --state=offline - ``` - -## Cleanup - -To stop the TiDB cluster and TiDB Binlog processes, you can execute `pkill -P $$` in the shell where you started all the processes that form the cluster (pd-server, tikv-server, pump, tidb-server, drainer). To give each component enough time to shut down cleanly, it's helpful to stop them in a particular order: - -```bash -for p in tidb-server drainer pump tikv-server pd-server; do pkill "$p"; sleep 1; done -``` - -Expected output: - -``` -kolbe@localhost tidb-latest-linux-amd64]$ for p in tidb-server drainer pump tikv-server pd-server; do pkill "$p"; sleep 1; done -[4]- Done ./bin/tidb-server --config=tidb.toml &>tidb.out -[5]+ Done ./drainer --config=drainer.toml &>drainer.out -[3]+ Done ./pump --config=pump.toml &>pump.out -[2]+ Done ./bin/tikv-server --config=tikv.toml &>tikv.out -[1]+ Done ./bin/pd-server --config=pd.toml &>pd.out -``` - -If you wish to restart the cluster after all services exit, use the same commands you ran originally to start the services. 
As discussed in the [`binlogctl`](#binlogctl) section above, you'll need to start `drainer` before `pump`, and `pump` before `tidb-server`. - -```bash -./bin/pd-server --config=pd.toml &>pd.out & -./bin/tikv-server --config=tikv.toml &>tikv.out & -./drainer --config=drainer.toml &>drainer.out & -sleep 3 -./pump --config=pump.toml &>pump.out & -sleep 3 -./bin/tidb-server --config=tidb.toml &>tidb.out & -``` - -If any of the components fail to start, try to restart the failed individual component(s). - -## Conclusion - -In this tutorial, we've set up TiDB Binlog to replicate from a TiDB cluster to a downstream MariaDB server, using a cluster with a single Pump and a single Drainer. As we've seen, TiDB Binlog is a comprehensive platform for capturing and processing changes to a TiDB cluster. - -In a more robust development, testing, or production deployment, you'd have multiple TiDB servers for high availability and scaling purposes, and you'd use multiple Pump instances to ensure that application traffic to TiDB server instances is unaffected by problems in the Pump cluster. You may also use additional Drainer instances to push updates to different downstream platforms or to implement incremental backups. diff --git a/tidb-binlog/handle-tidb-binlog-errors.md b/tidb-binlog/handle-tidb-binlog-errors.md deleted file mode 100644 index 5ba106db49415..0000000000000 --- a/tidb-binlog/handle-tidb-binlog-errors.md +++ /dev/null @@ -1,48 +0,0 @@ ---- -title: TiDB Binlog Error Handling -summary: Learn how to handle TiDB Binlog errors. -aliases: ['/docs/dev/tidb-binlog/handle-tidb-binlog-errors/','/docs/dev/reference/tidb-binlog/troubleshoot/error-handling/'] ---- - -# TiDB Binlog Error Handling - -This document introduces common errors that you might encounter and solutions to these errors when you use TiDB Binlog. - -## `kafka server: Message was too large, server rejected it to avoid allocation error` is returned when Drainer replicates data to Kafka - -Cause: Executing a large transaction in TiDB generates binlog data of a large size, which might exceed Kafka's limit on the message size. - -Solution: Adjust the configuration parameters of Kafka as shown below: - -{{< copyable "" >}} - -``` -message.max.bytes=1073741824 -replica.fetch.max.bytes=1073741824 -fetch.message.max.bytes=1073741824 -``` - -## Pump returns `no space left on device` error - -Cause: The local disk space is insufficient for Pump to write binlog data normally. - -Solution: Clean up the disk space and then restart Pump. - -## `fail to notify all living drainer` is returned when Pump is started - -Cause: When Pump is started, it notifies all Drainer nodes that are in the `online` state. If it fails to notify Drainer, this error log is printed. - -Solution: Use the [binlogctl tool](/tidb-binlog/binlog-control.md) to check whether each Drainer node is normal or not. This is to ensure that all Drainer nodes that are in the `online` state are working normally. If the state of a Drainer node is not consistent with its actual working status, use the binlogctl tool to change its state and then restart Pump. - -## Data loss occurs during the TiDB Binlog replication - -You need to confirm that TiDB Binlog is enabled on all TiDB instances and runs normally. If the cluster version is later than v3.0, use the `curl {TiDB_IP}:{STATUS_PORT}/info/all` command to confirm the TiDB Binlog status on all TiDB instances. 
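-
-As a rough sketch of that check, the loop below queries the status port of every TiDB instance and prints any binlog-related fields from the `/info/all` output. The instance addresses, the assumption that `10080` is the status port, and the use of `grep` to narrow the output are all illustrative; adjust them to your deployment and inspect the full JSON response if anything looks off.
-
-```bash
-# List the status addresses (host:status_port) of all TiDB instances in the cluster.
-# These addresses are placeholders; replace them with your own.
-TIDB_STATUS_ADDRS="192.168.0.11:10080 192.168.0.12:10080"
-
-for addr in ${TIDB_STATUS_ADDRS}; do
-    echo "== ${addr} =="
-    # Print the instance info and keep only the lines that mention binlog.
-    if ! curl -s "http://${addr}/info/all" | grep -i "binlog"; then
-        echo "No binlog field found; check the full /info/all output manually."
-    fi
-done
-```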
- -## When the upstream transaction is large, Pump reports an error `rpc error: code = ResourceExhausted desc = trying to send message larger than max (2191430008 vs. 2147483647)` - -This error occurs because the gRPC message sent by TiDB to Pump exceeds the size limit. You can adjust the maximum size of a gRPC message that Pump allows by specifying `max-message-size` when starting Pump. - -## Is there any cleaning mechanism for the incremental data of the file format output by Drainer? Will the data be deleted? - -- In Drainer v3.0.x, there is no cleaning mechanism for incremental data of the file format. -- In the v4.0.x version, there is a time-based data cleaning mechanism. For details, refer to [Drainer's `retention-time` configuration item](https://github.com/pingcap/tidb-binlog/blob/v4.0.9/cmd/drainer/drainer.toml#L153). diff --git a/tidb-binlog/maintain-tidb-binlog-cluster.md b/tidb-binlog/maintain-tidb-binlog-cluster.md deleted file mode 100644 index 5690e7dcf16eb..0000000000000 --- a/tidb-binlog/maintain-tidb-binlog-cluster.md +++ /dev/null @@ -1,140 +0,0 @@ ---- -title: TiDB Binlog Cluster Operations -summary: Learn how to operate the cluster version of TiDB Binlog. -aliases: ['/docs/dev/tidb-binlog/maintain-tidb-binlog-cluster/','/docs/dev/reference/tidb-binlog/maintain/','/docs/dev/how-to/maintain/tidb-binlog/','/docs/dev/reference/tools/tidb-binlog/maintain/'] ---- - -# TiDB Binlog Cluster Operations - -This document introduces the following TiDB Binlog cluster operations: - -+ The state of a Pump and Drainer nodes -+ Starting or exiting a Pump or Drainer process -+ Managing the TiDB Binlog cluster by using the binlogctl tool or by directly performing SQL operations in TiDB - -## Pump or Drainer state - -Pump or Drainer state description: - -* `online`: running normally -* `pausing`: in the pausing process -* `paused`: has been stopped -* `closing`: in the offline process -* `offline`: has been offline - -> **Note:** -> -> The state information of a Pump or Drainer node is maintained by the service itself and is regularly updated to the Placement Driver (PD). - -## Starting and exiting a Pump or Drainer process - -### Pump - -* Starting: When started, the Pump node notifies all Drainer nodes in the `online` state. If the notification is successful, the Pump node sets its state to `online`. Otherwise, the Pump node reports an error, sets its state to `paused` and exits the process. -* Exiting: The Pump node enters the `paused` or `offline` state before the process is exited normally; if the process is exited abnormally (caused by the `kill -9` command, process panic, crash), the node is still in the `online` state. - * Pause: You can pause a Pump process by using the `kill` command (not `kill -9`), pressing Ctrl+C or using the `pause-pump` command in the binlogctl tool. After receiving the pause instruction, the Pump node sets its state to `pausing`, stops receiving binlog write requests and stops providing binlog data to Drainer nodes. After all threads are safely exited, the Pump node updates its state to `paused` and exits the process. - * Offline: You can close a Pump process only by using the `offline-pump` command in the binlogctl tool. After receiving the offline instruction, the Pump node sets its state to `closing` and stops receiving the binlog write requests. The Pump node continues providing binlog to Drainer nodes until all binlog data is consumed by Drainer nodes. Then, the Pump node sets its state to `offline` and exits the process. 
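-
-To make the difference between pausing and taking a Pump node offline concrete, the following commands sketch both operations with binlogctl. The PD address and the node ID are placeholders for this example; list the registered nodes first to find the real NodeID, and remember that `offline-pump` removes the node from service permanently while `pause-pump` only stops it temporarily.
-
-```bash
-# List the registered Pump nodes to find the NodeID to operate on.
-./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=pumps
-
-# Pause a Pump node: it stops accepting binlog writes, exits,
-# and stays registered so that it can rejoin later.
-./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=pause-pump --node-id=pump-host:8250
-
-# Take a Pump node offline: it keeps serving Drainer until its binlog data
-# is fully consumed, then exits and no longer provides service.
-./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=offline-pump --node-id=pump-host:8250
-```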
-
-### Drainer
-
-* Starting: When started, the Drainer node sets its state to `online` and tries to pull binlogs from all Pump nodes which are not in the `offline` state. If it fails to get the binlogs, it keeps trying.
-* Exiting: The Drainer node enters the `paused` or `offline` state before the process is exited normally; if the process is exited abnormally (caused by `kill -9`, process panic, crash), the Drainer node is still in the `online` state.
-    * Pause: You can pause a Drainer process by using the `kill` command (not `kill -9`), pressing Ctrl+C or using the `pause-drainer` command in the binlogctl tool. After receiving the pause instruction, the Drainer node sets its state to `pausing` and stops pulling binlogs from Pump nodes. After all threads are safely exited, the Drainer node sets its state to `paused` and exits the process.
-    * Offline: You can close a Drainer process only by using the `offline-drainer` command in the binlogctl tool. After receiving the offline instruction, the Drainer node sets its state to `closing` and stops pulling binlogs from Pump nodes. After all threads are safely exited, the Drainer node updates its state to `offline` and exits the process.
-
-For how to pause, close, check, and modify the state of Drainer, see the [binlogctl guide](/tidb-binlog/binlog-control.md).
-
-## Use `binlogctl` to manage Pump/Drainer
-
-[`binlogctl`](https://github.com/pingcap/tidb-binlog/tree/master/binlogctl) is an operations tool for TiDB Binlog with the following features:
-
-* Checking the state of Pump or Drainer
-* Pausing or closing Pump or Drainer
-* Handling the abnormal state of Pump or Drainer
-
-For detailed usage of `binlogctl`, refer to [binlogctl overview](/tidb-binlog/binlog-control.md).
-
-## Use SQL statements to manage Pump or Drainer
-
-To view or modify binlog related states, execute corresponding SQL statements in TiDB.
-
-- Check whether binlog is enabled:
-
-    {{< copyable "sql" >}}
-
-    ```sql
-    show variables like "log_bin";
-    ```
-
-    ```
-    +---------------+-------+
-    | Variable_name | Value |
-    +---------------+-------+
-    | log_bin       | 0     |
-    +---------------+-------+
-    ```
-
-    When the Value is `0`, binlog is disabled. When the Value is `1`, binlog is enabled.
- -- Check the status of all the Pump or Drainer nodes: - - {{< copyable "sql" >}} - - ```sql - show pump status; - ``` - - ``` - +--------|----------------|--------|--------------------|---------------------| - | NodeID | Address | State | Max_Commit_Ts | Update_Time | - +--------|----------------|--------|--------------------|---------------------| - | pump1 | 127.0.0.1:8250 | Online | 408553768673342237 | 2019-05-01 00:00:01 | - +--------|----------------|--------|--------------------|---------------------| - | pump2 | 127.0.0.2:8250 | Online | 408553768673342335 | 2019-05-01 00:00:02 | - +--------|----------------|--------|--------------------|---------------------| - ``` - - {{< copyable "sql" >}} - - ```sql - show drainer status; - ``` - - ``` - +----------|----------------|--------|--------------------|---------------------| - | NodeID | Address | State | Max_Commit_Ts | Update_Time | - +----------|----------------|--------|--------------------|---------------------| - | drainer1 | 127.0.0.3:8249 | Online | 408553768673342532 | 2019-05-01 00:00:03 | - +----------|----------------|--------|--------------------|---------------------| - | drainer2 | 127.0.0.4:8249 | Online | 408553768673345531 | 2019-05-01 00:00:04 | - +----------|----------------|--------|--------------------|---------------------| - ``` - -- Modify the state of a Pump or Drainer node in abnormal situations - - {{< copyable "sql" >}} - - ```sql - change pump to node_state ='paused' for node_id 'pump1'; - ``` - - ``` - Query OK, 0 rows affected (0.01 sec) - ``` - - {{< copyable "sql" >}} - - ```sql - change drainer to node_state ='paused' for node_id 'drainer1'; - ``` - - ``` - Query OK, 0 rows affected (0.01 sec) - ``` - - Executing the above SQL statements works the same as the `update-pump` or `update-drainer` commands in binlogctl. Use the above SQL statements **only** when the Pump or Drainer node is in abnormal situations. - -> **Note:** -> -> - Checking whether binlog is enabled and the running status of Pump or Drainer is supported in TiDB v2.1.7 and later versions. -> - Modifying the status of Pump or Drainer is supported in TiDB v3.0.0-rc.1 and later versions. This feature only supports modifying the status of Pump or Drainer nodes stored in PD. To pause or close the node, use the `binlogctl` tool. diff --git a/tidb-binlog/monitor-tidb-binlog-cluster.md b/tidb-binlog/monitor-tidb-binlog-cluster.md deleted file mode 100644 index a5bf64f19482a..0000000000000 --- a/tidb-binlog/monitor-tidb-binlog-cluster.md +++ /dev/null @@ -1,171 +0,0 @@ ---- -title: TiDB Binlog Monitoring -summary: Learn how to monitor the cluster version of TiDB Binlog. -aliases: ['/docs/dev/tidb-binlog/monitor-tidb-binlog-cluster/','/docs/dev/reference/tidb-binlog/monitor/','/docs/dev/how-to/monitor/tidb-binlog/'] ---- - -# TiDB Binlog Monitoring - -After you have deployed TiDB Binlog successfully, you can go to the Grafana Web (default address: , default account: admin, password: admin) to check the state of Pump and Drainer. - -## Monitoring metrics - -TiDB Binlog consists of two components: Pump and Drainer. This section shows the monitoring metrics of Pump and Drainer. 
- -### Pump monitoring metrics - -To understand the Pump monitoring metrics, check the following table: - -| Pump monitoring metrics | Description | -| --- | --- | -| Storage Size | Records the total disk space (capacity) and the available disk space (available)| -| Metadata | Records the biggest TSO (`gc_tso`) of the binlog that each Pump node can delete, and the biggest commit TSO (`max_commit_tso`) of the saved binlog | -| Write Binlog QPS by Instance | Shows QPS of writing binlog requests received by each Pump node | -| Write Binlog Latency | Records the latency time of each Pump node writing binlog | -| Storage Write Binlog Size | Shows the size of the binlog data written by Pump | -| Storage Write Binlog Latency | Records the latency time of the Pump storage module writing binlog | -| Pump Storage Error By Type | Records the number of errors encountered by Pump, counted based on the type of error | -| Query TiKV | The number of times that Pump queries the transaction status through TiKV | - -### Drainer monitoring metrics - -To understand the Drainer monitoring metrics, check the following table: - -| Drainer monitoring metrics | Description | -| --- | --- | -| Checkpoint TSO | Shows the biggest TSO time of the binlog that Drainer has already replicated into the downstream. You can get the lag by using the current time to subtract the binlog timestamp. But be noted that the timestamp is allocated by PD of the master cluster and is determined by the time of PD.| -| Pump Handle TSO | Records the biggest TSO time among the binlog files that Drainer obtains from each Pump node | -| Pull Binlog QPS by Pump NodeID | Shows the QPS when Drainer obtains binlog from each Pump node | -| 95% Binlog Reach Duration By Pump | Records the delay from the time when binlog is written into Pump to the time when the binlog is obtained by Drainer | -| Error By Type | Shows the number of errors encountered by Drainer, counted based on the type of error | -| SQL Query Time | Records the time it takes Drainer to execute the SQL statement in the downstream | -| Drainer Event | Shows the number of various types of events, including "ddl", "insert", "delete", "update", "flush", and "savepoint" | -| Execute Time | Records the time it takes to write binlog into the downstream syncing module | -| 95% Binlog Size | Shows the size of the binlog data that Drainer obtains from each Pump node | -| DDL Job Count | Records the number of DDL statements handled by Drainer | -| Queue Size | Records the work queue size in Drainer | - -## Alert rules - -This section gives the alert rules for TiDB Binlog. According to the severity level, TiDB Binlog alert rules are divided into three categories (from high to low): emergency-level, critical-level and warning-level. - -### Emergency-level alerts - -Emergency-level alerts are often caused by a service or node failure. Manual intervention is required immediately. - -#### `binlog_pump_storage_error_count` - -* Alert rule: - - `changes(binlog_pump_storage_error_count[1m]) > 0` - -* Description: - - Pump fails to write the binlog data to the local storage. - -* Solution: - - Check whether an error exists in the `pump_storage_error` monitoring and check the Pump log to find the causes. - -### Critical-level alerts - -For the critical-level alerts, a close watch on the abnormal metrics is required. 
- -#### `binlog_drainer_checkpoint_high_delay` - -* Alert rule: - - `(time() - binlog_drainer_checkpoint_tso / 1000) > 3600` - -* Description: - - The delay of Drainer replication exceeds one hour. - -* Solution: - - - Check whether it is too slow to obtain the data from Pump: - - You can check `handle tso` of Pump to get the time for the latest message of each Pump. Check whether a high latency exists for Pump and make sure the corresponding Pump is running normally. - - - Check whether it is too slow to replicate data in the downstream based on Drainer `event` and Drainer `execute latency`: - - - If Drainer `execute time` is too large, check the network bandwidth and latency between the machine with Drainer deployed and the machine with the target database deployed, and the state of the target database. - - If Drainer `execute time` is not too large and Drainer `event` is too small, add `work count` and `batch` and retry. - - - If the two solutions above cannot work, [get support](/support.md) from PingCAP or the community. - -### Warning-level alerts - -Warning-level alerts are a reminder for an issue or error. - -#### `binlog_pump_write_binlog_rpc_duration_seconds_bucket` - -* Alert rule: - - `histogram_quantile(0.9, rate(binlog_pump_rpc_duration_seconds_bucket{method="WriteBinlog"}[5m])) > 1` - -* Description: - - It takes too much time for Pump to handle the TiDB request of writing binlog. - -* Solution: - - - Verify the disk performance pressure and check the disk performance monitoring via `node exported`. - - If both `disk latency` and `util` are low, [get support](/support.md) from PingCAP or the community. - -#### `binlog_pump_storage_write_binlog_duration_time_bucket` - -* Alert rule: - - `histogram_quantile(0.9, rate(binlog_pump_storage_write_binlog_duration_time_bucket{type="batch"}[5m])) > 1` - -* Description: - - The time it takes for Pump to write the local binlog to the local disk. - -* Solution: - - Check the state of the local disk of Pump and fix the problem. - -#### `binlog_pump_storage_available_size_less_than_20G` - -* Alert rule: - - `binlog_pump_storage_storage_size_bytes{type="available"} < 20 * 1024 * 1024 * 1024` - -* Description: - - The available disk space of Pump is less than 20 GB. - -* Solution: - - Check whether Pump `gc_tso` is normal. If not, adjust the GC time configuration of Pump or get the corresponding Pump offline. - -#### `binlog_drainer_checkpoint_tso_no_change_for_1m` - -* Alert rule: - - `changes(binlog_drainer_checkpoint_tso[1m]) < 1` - -* Description: - - Drainer `checkpoint` has not been updated for one minute. - -* Solution: - - Check whether all the Pumps that are not offline are running normally. - -#### `binlog_drainer_execute_duration_time_more_than_10s` - -* Alert rule: - - `histogram_quantile(0.9, rate(binlog_drainer_execute_duration_time_bucket[1m])) > 10` - -* Description: - - The transaction time it takes Drainer to replicate data to TiDB. If it is too large, the Drainer replication of data is affected. - -* Solution: - - - Check the TiDB cluster state. - - Check the Drainer log or monitor. If a DDL operation causes this problem, you can ignore it. diff --git a/tidb-binlog/tidb-binlog-configuration-file.md b/tidb-binlog/tidb-binlog-configuration-file.md deleted file mode 100644 index d825012ca9b4c..0000000000000 --- a/tidb-binlog/tidb-binlog-configuration-file.md +++ /dev/null @@ -1,353 +0,0 @@ ---- -title: TiDB Binlog Configuration File -summary: Learn the configuration items of TiDB Binlog. 
-aliases: ['/docs/dev/tidb-binlog/tidb-binlog-configuration-file/','/docs/dev/reference/tidb-binlog/config/'] ---- - -# TiDB Binlog Configuration File - -This document introduces the configuration items of TiDB Binlog. - -## Pump - -This section introduces the configuration items of Pump. For the example of a complete Pump configuration file, see [Pump Configuration](https://github.com/pingcap/tidb-binlog/blob/master/cmd/pump/pump.toml). - -### addr - -* Specifies the listening address of HTTP API in the format of `host:port`. -* Default value: `127.0.0.1:8250` - -### advertise-addr - -* Specifies the externally accessible HTTP API address. This address is registered in PD in the format of `host:port`. -* Default value: `127.0.0.1:8250` - -### socket - -* The Unix socket address that HTTP API listens to. -* Default value: "" - -### pd-urls - -* Specifies the comma-separated list of PD URLs. If multiple addresses are specified, when the PD client fails to connect to one address, it automatically tries to connect to another address. -* Default value: `http://127.0.0.1:2379` - -### data-dir - -* Specifies the directory where binlogs and their indexes are stored locally. -* Default value: `data.pump` - -### heartbeat-interval - -* Specifies the heartbeat interval (in seconds) at which the latest status is reported to PD. -* Default value: `2` - -### gen-binlog-interval - -* Specifies the interval (in seconds) at which data is written into fake binlog. -* Default value: `3` - -### gc - -* Specifies the number of days (integer) that binlogs can be stored locally. Binlogs stored longer than the specified number of days are automatically deleted. -* Default value: `7` - -### log-file - -* Specifies the path where log files are stored. If the parameter is set to an empty value, log files are not stored. -* Default value: "" - -### log-level - -* Specifies the log level. -* Default value: `info` - -### node-id - -* Specifies the Pump node ID. With this ID, this Pump process can be identified in the cluster. -* Default value: `hostname:port number`. For example, `node-1:8250`. - -### security - -This section introduces configuration items related to security. - -#### ssl-ca - -* Specifies the file path of the trusted SSL certificate list or CA list. For example, `/path/to/ca.pem`. -* Default value: "" - -#### ssl-cert - -* Specifies the path of the X509 certificate file encoded in the Privacy Enhanced Mail (PEM) format. For example, `/path/to/pump.pem`. -* Default value: "" - -#### ssl-key - -* Specifies the path of the X509 key file encoded in the PEM format. For example, `/path/to/pump-key.pem`. -* Default value: "" - -### storage - -This section introduces configuration items related to storage. - -#### sync-log - -* Specifies whether to use `fsync` after each **batch** write to binlog to ensure data safety. -* Default value: `true` - -#### kv_chan_cap - -* Specifies the number of write requests that the buffer can store before Pump receives these requests. -* Default value: `1048576` (that is, 2 to the power of 20) - -#### slow_write_threshold - -* The threshold (in seconds). If it takes longer to write a single binlog file than this specified threshold, the write is considered slow write and `"take a long time to write binlog"` is output in the log. -* Default value: `1` - -#### stop-write-at-available-space - -* Binlog write requests is no longer accepted when the available storage space is below this specified value. 
You can use the format such as `900 MB`, `5 GB`, and `12 GiB` to specify the storage space. If there is more than one Pump node in the cluster, when a Pump node refuses a write request because of the insufficient space, TiDB will automatically write binlogs to other Pump nodes. -* Default value: `10 GiB` - -#### kv - -Currently the storage of Pump is implemented based on [GoLevelDB](https://github.com/syndtr/goleveldb). Under `storage` there is also a `kv` subgroup that is used to adjust the GoLevel configuration. The supported configuration items are shown as below: - -* block-cache-capacity -* block-restart-interval -* block-size -* compaction-L0-trigger -* compaction-table-size -* compaction-total-size -* compaction-total-size-multiplier -* write-buffer -* write-L0-pause-trigger -* write-L0-slowdown-trigger - -For the detailed description of the above items, see [GoLevelDB Document](https://godoc.org/github.com/syndtr/goleveldb/leveldb/opt#Options). - -## Drainer - -This section introduces the configuration items of Drainer. For the example of a complete Drainer configuration file, see [Drainer Configuration](https://github.com/pingcap/tidb-binlog/blob/master/cmd/drainer/drainer.toml) - -### addr - -* Specifies the listening address of HTTP API in the format of `host:port`. -* Default value: `127.0.0.1:8249` - -### advertise-addr - -* Specifies the externally accessible HTTP API address. This address is registered in PD in the format of `host:port`. -* Default value: `127.0.0.1:8249` - -### log-file - -* Specifies the path where log files are stored. If the parameter is set to an empty value, log files are not stored. -* Default value: "" - -### log-level - -* Specifies the log level. -* Default value: `info` - -### node-id - -* Specifies the Drainer node ID. With this ID, this Drainer process can be identified in the cluster. -* Default value: `hostname:port number`. For example, `node-1:8249`. - -### data-dir - -* Specifies the directory used to store files that need to be saved during Drainer operation. -* Default value: `data.drainer` - -### detect-interval - -* Specifies the interval (in seconds) at which PD updates the Pump information. -* Default value: `5` - -### pd-urls - -* The comma-separated list of PD URLs. If multiple addresses are specified, the PD client will automatically attempt to connect to another address if an error occurs when connecting to one address. -* Default value: `http://127.0.0.1:2379` - -### initial-commit-ts - -* Specifies from which commit timestamp of the transaction the replication process starts. This configuration is applicable only to the Drainer node that is in the replication process for the first time. If a checkpoint already exists in the downstream, the replication will be performed according to the time recorded in the checkpoint. -* commit ts (commit timestamp) is a specific point in time for [transaction](/transaction-overview.md#transactions) commits in TiDB. It is a globally unique and increasing timestamp from PD as the unique ID of the current transaction. You can get the `initial-commit-ts` configuration in the following typical ways: - - If BR is used, you can get `initial-commit-ts` from the backup TS recorded in the metadata backed up by BR (backupmeta). - - If Dumpling is used, you can get `initial-commit-ts` from the Pos recorded in the metadata backed up by Dumpling (metadata), - - If PD Control is used, `initial-commit-ts` is in the output of the `tso` command. -* Default value: `-1`. 
Drainer will get a new timestamp from PD as the starting time, which means that the replication process starts from the current time. - -### synced-check-time - -* You can access the `/status` path through the HTTP API to query the status of Drainer replication. `synced-check-time` specifies how many minutes from the last successful replication is considered as `synced`, that is, the replication is complete. -* Default value: `5` - -### compressor - -* Specifies the compression algorithm used for data transfer between Pump and Drainer. Currently only the `gzip` algorithm is supported. -* Default value: "", which means no compression. - -### security - -This section introduces configuration items related to security. - -#### ssl-ca - -* Specifies the file path of the trusted SSL certificate list or CA list. For example, `/path/to/ca.pem`. -* Default value: "" - -#### ssl-cert - -* Specifies the path of the X509 certificate file encoded in the PEM format. For example, `/path/to/drainer.pem`. -* Default value: "" - -#### ssl-key - -* Specifies the path of the X509 key file encoded in the PEM format. For example, `/path/to/pump-key.pem`. -* Default value: "" - -### syncer - -The `syncer` section includes configuration items related to the downstream. - -#### db-type - -Currently, the following downstream types are supported: - -* `mysql` -* `tidb` -* `kafka` -* `file` - -Default value: `mysql` - -#### sql-mode - -* Specifies the SQL mode when the downstream is the `mysql` or `tidb` type. If there is more than one mode, use commas to separate them. -* Default value: "" - -#### ignore-txn-commit-ts - -* Specifies the commit timestamp at which the binlog is ignored, such as `[416815754209656834, 421349811963822081]`. -* Default value: `[]` - -#### ignore-schemas - -* Specifies the database to be ignored during replication. If there is more than one database to be ignored, use commas to separate them. If all changes in a binlog file are filtered, the whole binlog file is ignored. -* Default value: `INFORMATION_SCHEMA,PERFORMANCE_SCHEMA,mysql` - -#### ignore-table - -Ignores the specified table changes during replication. You can specify multiple tables to be ignored in the `toml` file. For example: - -{{< copyable "" >}} - -```toml -[[syncer.ignore-table]] -db-name = "test" -tbl-name = "log" - -[[syncer.ignore-table]] -db-name = "test" -tbl-name = "audit" -``` - -If all changes in a binlog file are filtered, the whole binlog file is ignored. - -Default value: `[]` - -#### replicate-do-db - -* Specifies the database to be replicated. For example, `[db1, db2]`. -* Default value: `[]` - -#### replicate-do-table - -Specifies the table to be replicated. For example: - -{{< copyable "" >}} - -```toml -[[syncer.replicate-do-table]] -db-name ="test" -tbl-name = "log" - -[[syncer.replicate-do-table]] -db-name ="test" -tbl-name = "~^a.*" -``` - -Default value: `[]` - -#### txn-batch - -* When the downstream is the `mysql` or `tidb` type, DML operations are executed in different batches. This parameter specifies how many DML operations can be included in each transaction. -* Default value: `20` - -#### worker-count - -* When the downstream is the `mysql` or `tidb` type, DML operations are executed concurrently. This parameter specifies the concurrency numbers of DML operations. -* Default value: `16` - -#### disable-dispatch - -* Disables the concurrency and forcibly set `worker-count` to `1`. 
-* Default value: `false` - -#### safe-mode - -If the safe mode is enabled, Drainer modifies the replication updates in the following way: - -* `Insert` is modified to `Replace Into` -* `Update` is modified to `Delete` plus `Replace Into` - -Default value: `false` - -### syncer.to - -The `syncer.to` section introduces different types of downstream configuration items according to configuration types. - -#### mysql/tidb - -The following configuration items are related to connection to downstream databases: - -* `host`: If this item is not set, TiDB Binlog tries to check the `MYSQL_HOST` environment variable which is `localhost` by default. -* `port`: If this item is not set, TiDB Binlog tries to check the `MYSQL_PORT` environment variable which is `3306` by default. -* `user`: If this item is not set, TiDB Binlog tries to check the `MYSQL_USER` environment variable which is `root` by default. -* `password`: If this item is not set, TiDB Binlog tries to check the `MYSQL_PSWD` environment variable which is `""` by default. -* `read-timeout`: Specifies the I/O read timeout of the downstream database connection. The default value is `1m`. If Drainer keeps failing on some DDLs that take a long time, you can set this configuration to a larger value. - -#### file - -* `dir`: Specifies the directory where binlog files are stored. If this item is not set, `data-dir` is used. - -#### kafka - -When the downstream is Kafka, the valid configuration items are as follows: - -* `zookeeper-addrs` -* `kafka-addrs` -* `kafka-version` -* `kafka-max-messages` -* `kafka-max-message-size` -* `topic-name` - -### syncer.to.checkpoint - -* `type`: Specifies in what way the replication progress is saved. Currently, the available options are `mysql`, `tidb`, and `file`. - - This configuration item is the same as the downstream type by default. For example, when the downstream is `file`, the checkpoint progress is saved in the local file `/savepoint`; when the downstream is `mysql`, the progress is saved in the downstream database. If you need to explicitly specify using `mysql` or `tidb` to store the progress, make the following configuration: - -* `schema`: `"tidb_binlog"` by default. - - > **Note:** - > - > When deploying multiple Drainer nodes in the same TiDB cluster, you need to specify a different checkpoint schema for each node. Otherwise, the replication progress of two instances will overwrite each other. - -* `host` -* `user` -* `password` -* `port` \ No newline at end of file diff --git a/tidb-binlog/tidb-binlog-faq.md b/tidb-binlog/tidb-binlog-faq.md deleted file mode 100644 index 5218386ca2e54..0000000000000 --- a/tidb-binlog/tidb-binlog-faq.md +++ /dev/null @@ -1,273 +0,0 @@ ---- -title: TiDB Binlog FAQs -summary: Learn about the frequently asked questions (FAQs) and answers about TiDB Binlog. -aliases: ['/docs/dev/tidb-binlog/tidb-binlog-faq/','/docs/dev/reference/tidb-binlog/faq/','/docs/dev/reference/tools/tidb-binlog/faq/'] ---- - -# TiDB Binlog FAQs - -This document collects the frequently asked questions (FAQs) about TiDB Binlog. - -## What is the impact of enabling TiDB Binlog on the performance of TiDB? - -- There is no impact on the query. - -- There is a slight performance impact on `INSERT`, `DELETE` and `UPDATE` transactions. In latency, a p-binlog is written concurrently in the TiKV prewrite stage before the transactions are committed. Generally, writing binlog is faster than TiKV prewrite, so it does not increase latency. 
You can check the response time of writing binlog in Pump's monitoring panel. - -## How high is the replication latency of TiDB Binlog? - -The latency of TiDB Binlog replication is measured in seconds, which is generally about 3 seconds during off-peak hours. - -## What privileges does Drainer need to replicate data to the downstream MySQL or TiDB cluster? - -To replicate data to the downstream MySQL or TiDB cluster, Drainer must have the following privileges: - -* Insert -* Update -* Delete -* Create -* Drop -* Alter -* Execute -* Index -* Select -* Create View - -## What can I do if the Pump disk is almost full? - -1. Check whether Pump's GC works well: - - - Check whether the **gc_tso** time in Pump's monitoring panel is identical with that of the configuration file. - -2. If GC works well, perform the following steps to reduce the amount of space required for a single Pump: - - - Modify the **GC** parameter of Pump to reduce the number of days to retain data. - - - Add pump instances. - -## What can I do if Drainer replication is interrupted? - -Execute the following command to check whether the status of Pump is normal and whether all the Pump instances that are not in the `offline` state are running. - -{{< copyable "shell-regular" >}} - -```bash -binlogctl -cmd pumps -``` - -Then, check whether the Drainer monitor or log outputs corresponding errors. If so, resolve them accordingly. - -## What can I do if Drainer is slow to replicate data to the downstream MySQL or TiDB cluster? - -Check the following monitoring items: - -- For the **Drainer Event** monitoring metric, check the speed of Drainer replicating `INSERT`, `UPDATE` and `DELETE` transactions to the downstream per second. - -- For the **SQL Query Time** monitoring metric, check the time Drainer takes to execute SQL statements in the downstream. - -Possible causes and solutions for slow replication: - -- If the replicated database contains a table without a primary key or unique index, add a primary key to the table. - -- If the latency between Drainer and the downstream is high, increase the value of the `worker-count` parameter of Drainer. For cross-datacenter replication, it is recommended to deploy Drainer in the downstream. - -- If the load in the downstream is not high, increase the value of the `worker-count` parameter of Drainer. - -## What can I do if a Pump instance crashes? - -If a Pump instance crashes, Drainer cannot replicate data to the downstream because it cannot obtain the data of this instance. If this Pump instance can recover to the normal state, Drainer resumes replication; if not, perform the following steps: - -1. Use [binlogctl to change the state of this Pump instance to `offline`](/tidb-binlog/maintain-tidb-binlog-cluster.md) to discard the data of this Pump instance. - -2. Because Drainer cannot obtain the data of this pump instance, the data in the downstream and upstream is inconsistent. In this situation, perform full and incremental backups again. The steps are as follows: - - 1. Stop the Drainer. - - 2. Perform a full backup in the upstream. - - 3. Clear the data in the downstream including the `tidb_binlog.checkpoint` table. - - 4. Restore the full backup to the downstream. - - 5. Deploy Drainer and use `initialCommitTs` (set `initialCommitTs` as the snapshot timestamp of the full backup) as the start point of initial replication. - -## What is checkpoint? - -Checkpoint records the `commit-ts` that Drainer replicates to the downstream. 
When Drainer restarts, it reads the checkpoint and then replicates data to the downstream starting from the corresponding `commit-ts`. The `["write save point"] [ts=411222863322546177]` Drainer log means saving the checkpoint with the corresponding timestamp. - -Checkpoint is saved in different ways for different types of downstream platforms: - -- For MySQL/TiDB, it is saved in the `tidb_binlog.checkpoint` table. - -- For Kafka/file, it is saved in the file of the corresponding configuration directory. - -The data of kafka/file contains `commit-ts`, so if the checkpoint is lost, you can check the latest `commit-ts` of the downstream data by consuming the latest data in the downstream . - -Drainer reads the checkpoint when it starts. If Drainer cannot read the checkpoint, it uses the configured `initialCommitTs` as the start point of the initial replication. - -## How to redeploy Drainer on the new machine when Drainer fails and the data in the downstream remains? - -If the data in the downstream is not affected, you can redeploy Drainer on the new machine as long as the data can be replicated from the corresponding checkpoint. - -- If the checkpoint is not lost, perform the following steps: - - 1. Deploy and start a new Drainer (Drainer can read checkpoint and resumes replication). - - 2. Use [binlogctl to change the state of the old Drainer to `offline`](/tidb-binlog/maintain-tidb-binlog-cluster.md). - -- If the checkpoint is lost, perform the following steps: - - 1. To deploy a new Drainer, obtain the `commit-ts` of the old Drainer as the `initialCommitTs` of the new Drainer. - - 2. Use [binlogctl to change the state of the old Drainer to `offline`](/tidb-binlog/maintain-tidb-binlog-cluster.md). - -## How to restore the data of a cluster using a full backup and a binlog backup file? - -1. Clean up the cluster and restore a full backup. - -2. To restore the latest data of the backup file, use Reparo to set `start-tso` = {snapshot timestamp of the full backup + 1} and `end-ts` = 0 (or you can specify a point in time). - -## How to redeploy Drainer when enabling `ignore-error` in Primary-Secondary replication triggers a critical error? - -If a critical error is triggered when TiDB fails to write binlog after enabling `ignore-error`, TiDB stops writing binlog and binlog data loss occurs. To resume replication, perform the following steps: - -1. Stop the Drainer instance. - -2. Restart the `tidb-server` instance that triggers critical error and resume writing binlog (TiDB does not write binlog to Pump after critical error is triggered). - -3. Perform a full backup in the upstream. - -4. Clear the data in the downstream including the `tidb_binlog.checkpoint` table. - -5. Restore the full backup to the downstream. - -6. Deploy Drainer and use `initialCommitTs` (set `initialCommitTs` as the snapshot timestamp of the full backup) as the start point of initial replication. - -## When can I pause or close a Pump or Drainer node? - -Refer to [TiDB Binlog Cluster Operations](/tidb-binlog/maintain-tidb-binlog-cluster.md) to learn the description of the Pump or Drainer state and how to start and exit the process. - -Pause a Pump or Drainer node when you need to temporarily stop the service. For example: - -- Version upgrade - - Use the new binary to restart the service after the process is stopped. - -- Server maintenance - - When the server needs a downtime maintenance, exit the process and restart the service after the maintenance is finished. 
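-
-As an illustration of the maintenance case, a temporary pause might look like the following. The PD address and node ID are placeholders, and the restart command assumes the manual deployment style used elsewhere in these docs; with other deployment methods, restart the process with your usual tooling instead.
-
-```bash
-# Pause a Drainer node before maintenance; its state goes through pausing to paused
-# and the process exits.
-./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=pause-drainer --node-id=drainer-host:8249
-
-# Confirm that the node is now reported as paused.
-./binlogctl --pd-urls=http://127.0.0.1:2379 --cmd=drainers
-
-# After the maintenance window, start the drainer process again;
-# it re-registers itself and returns to the online state.
-./drainer --config=drainer.toml &>drainer.out &
-```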
- -Close a Pump or Drainer node when you no longer need the service. For example: - -- Pump scale-in - - If you do not need too many Pump services, close some of them. - -- Cancelling replication tasks - - If you no longer need to replicate data to a downstream database, close the corresponding Drainer node. - -- Service migration - - If you need to migrate the service to another server, close the service and re-deploy it on the new server. - -## How can I pause a Pump or Drainer process? - -- Directly kill the process. - - > **Note:** - > - > Do not use the `kill -9` command. Otherwise, the Pump or Drainer node cannot process signals. - -- If the Pump or Drainer node runs in the foreground, pause it by pressing Ctrl+C. -- Use the `pause-pump` or `pause-drainer` command in binlogctl. - -## Can I use the `update-pump` or `update-drainer` command in binlogctl to pause the Pump or Drainer service? - -No. The `update-pump` or `update-drainer` command directly modifies the state information saved in PD without notifying Pump or Drainer to perform the corresponding operation. Misusing the two commands can interrupt data replication and might even cause data loss. - -## Can I use the `update-pump` or `update-drainer` command in binlogctl to close the Pump or Drainer service? - -No. The `update-pump` or `update-drainer` command directly modifies the state information saved in PD without notifying Pump or Drainer to perform the corresponding operation. Misusing the two commands interrupts data replication and might even cause data inconsistency. For example: - -- When a Pump node runs normally or is in the `paused` state, if you use the `update-pump` command to set the Pump state to `offline`, the Drainer node stops pulling the binlog data from the `offline` Pump. In this situation, the newest binlog cannot be replicated to the Drainer node, causing data inconsistency between upstream and downstream. -- When a Drainer node runs normally, if you use the `update-drainer` command to set the Drainer state to `offline`, the newly started Pump node only notifies Drainer nodes in the `online` state. In this situation, the `offline` Drainer fails to pull the binlog data from the Pump node in time, causing data inconsistency between upstream and downstream. - -## When can I use the `update-pump` command in binlogctl to set the Pump state to `paused`? - -In some abnormal situations, Pump fails to correctly maintain its state. Then, use the `update-pump` command to modify the state. - -For example, when a Pump process is exited abnormally (caused by directly exiting the process when a panic occurs or mistakenly using the `kill -9` command to kill the process), the Pump state information saved in PD is still `online`. In this situation, if you do not need to restart Pump to recover the service at the moment, use the `update-pump` command to update the Pump state to `paused`. Then, interruptions can be avoided when TiDB writes binlogs and Drainer pulls binlogs. - -## When can I use the `update-drainer` command in binlogctl to set the Drainer state to `paused`? - -In some abnormal situations, the Drainer node fails to correctly maintain its state, which has influenced the replication task. Then, use the `update-drainer` command to modify the state. - -For example, when a Drainer process is exited abnormally (caused by directly exiting the process when a panic occurs or mistakenly using the `kill -9` command to kill the process), the Drainer state information saved in PD is still `online`. 
When a Pump node is started, it fails to notify the exited Drainer node (the `notify drainer ...` error), which causes the Pump node to fail. In this situation, use the `update-drainer` command to update the Drainer state to `paused` and restart the Pump node. - -## How can I close a Pump or Drainer node? - -Currently, you can only use the `offline-pump` or `offline-drainer` command in binlogctl to close a Pump or Drainer node. - -## When can I use the `update-pump` command in binlogctl to set the Pump state to `offline`? - -You can use the `update-pump` command to set the Pump state to `offline` in the following situations: - -- When a Pump process exits abnormally and the service cannot be recovered, the replication task is interrupted. If you want to recover the replication and accept some loss of binlog data, use the `update-pump` command to set the Pump state to `offline`. Then, the Drainer node stops pulling binlog from the Pump node and continues replicating data. -- Some stale Pump nodes are left over from historical tasks. Their processes have exited and their services are no longer needed. Then, use the `update-pump` command to set their state to `offline`. - -For other situations, use the `offline-pump` command to close the Pump service, which is the regular process. - -> **Warning:** -> -> Do not use the `update-pump` command unless you can tolerate binlog data loss and data inconsistency between upstream and downstream, or you no longer need the binlog data stored in the Pump node. - -## Can I use the `update-pump` command in binlogctl to set the Pump state to `offline` if I want to close a Pump node that has exited and is set to `paused`? - -When a Pump process has exited and the node is in the `paused` state, not all the binlog data in the node is consumed in its downstream Drainer node. Therefore, doing so might risk data inconsistency between upstream and downstream. In this situation, restart the Pump and use the `offline-pump` command to close the Pump node. - -## When can I use the `update-drainer` command in binlogctl to set the Drainer state to `offline`? - -Some stale Drainer nodes are left over from historical tasks. Their processes have exited and their services are no longer needed. Then, use the `update-drainer` command to set their state to `offline`. - -## Can I use SQL operations such as `change pump` and `change drainer` to pause or close the Pump or Drainer service? - -No. For more details on these SQL operations, refer to [Use SQL statements to manage Pump or Drainer](/tidb-binlog/maintain-tidb-binlog-cluster.md#use-sql-statements-to-manage-pump-or-drainer). - -These SQL operations directly modify the state information saved in PD and are functionally equivalent to the `update-pump` and `update-drainer` commands in binlogctl. To pause or close the Pump or Drainer service, use the binlogctl tool. - -## What can I do when some DDL statements supported by the upstream database cause errors when executed in the downstream database? - -To solve the problem, follow these steps: - -1. Check `drainer.log`. Search `exec failed` for the last failed DDL operation before the Drainer process exited. -2. Change the DDL statement to a version that is compatible with the downstream. Perform this step manually in the downstream database. -3. Check `drainer.log`. Search for the failed DDL operation and find the `commit-ts` of this operation.
For example: - - ``` - [2020/05/21 09:51:58.019 +08:00] [INFO] [syncer.go:398] ["add ddl item to syncer, you can add this commit ts to `ignore-txn-commit-ts` to skip this ddl if needed"] [sql="ALTER TABLE `test` ADD INDEX (`index1`)"] ["commit ts"=416815754209656834]. - ``` - -4. Modify the `drainer.toml` configuration file. Add the `commit-ts` in the `ignore-txn-commit-ts` item and restart the Drainer node. - -## TiDB fails to write to binlog and gets stuck, and `listener stopped, waiting for manual stop` appears in the log - -In TiDB v3.0.12 and earlier versions, the binlog write failure causes TiDB to report the fatal error. TiDB does not automatically exit but only stops the service, which seems like getting stuck. You can see the `listener stopped, waiting for manual stop` error in the log. - -You need to determine the specific causes of the binlog write failure. If the failure occurs because binlog is slowly written into the downstream, you can consider scaling out Pump or increasing the timeout time for writing binlog. - -Since v3.0.13, the error-reporting logic is optimized. The binlog write failure causes transaction execution to fail and TiDB Binlog will return an error but will not get TiDB stuck. - -## TiDB writes duplicate binlogs to Pump - -This issue does not affect the downstream and replication logic. - -When the binlog write fails or becomes timeout, TiDB retries writing binlog to the next available Pump node until the write succeeds. Therefore, if the binlog write to a Pump node is slow and causes TiDB timeout (default 15s), then TiDB determines that the write fails and tries to write to the next Pump node. If binlog is actually successfully written to the timeout-causing Pump node, the same binlog is written to multiple Pump nodes. When Drainer processes the binlog, it automatically de-duplicates binlogs with the same TSO, so this duplicate write does not affect the downstream and replication logic. - -## Reparo is interrupted during the full and incremental restore process. Can I use the last TSO in the log to resume replication? - -Yes. Reparo does not automatically enable the safe-mode when you start it. You need to perform the following steps manually: - -1. After Reparo is interrupted, record the last TSO in the log as `checkpoint-tso`. -2. Modify the Reparo configuration file, set the configuration item `start-tso` to `checkpoint-tso + 1`, set `stop-tso` to `checkpoint-tso + 80,000,000,000` (approximately five minutes after the `checkpoint-tso`), and set `safe-mode` to `true`. Start Reparo, and Reparo replicates data to `stop-tso` and then stops automatically. -3. After Reparo stops automatically, set `start-tso` to `checkpoint tso + 80,000,000,001`, set `stop-tso` to `0`, and set `safe-mode` to `false`. Start Reparo to resume replication. diff --git a/tidb-binlog/tidb-binlog-glossary.md b/tidb-binlog/tidb-binlog-glossary.md deleted file mode 100644 index 4c9ae8b8d25e4..0000000000000 --- a/tidb-binlog/tidb-binlog-glossary.md +++ /dev/null @@ -1,27 +0,0 @@ ---- -title: TiDB Binlog Glossary -summary: Learn the terms used in TiDB Binlog. -aliases: ['/docs/dev/tidb-binlog/tidb-binlog-glossary/','/docs/dev/reference/tidb-binlog/glossary/'] ---- - -# TiDB Binlog Glossary - -This document lists the terms used in the logs, monitoring, configurations, and documentation of TiDB Binlog. - -## Binlog - -In TiDB Binlog, binlogs refer to the binary log data from TiDB. They also refer to the binary log data that Drainer writes to Kafka or files. 
The former and the latter are in different formats. In addition, binlogs in TiDB and binlogs in MySQL are also in different formats. - -## Binlog event - -The DML binlogs from TiDB have three types of event: `INSERT`, `UPDATE`, and `DELETE`. In the monitoring dashboard of Drainer, you can see the number of different events that correspond to the replication data. - -## Checkpoint - -A checkpoint indicates the position from which a replication task is paused and resumed, or is stopped and restarted. It records the commit-ts that Drainer replicates to the downstream. When restarted, Drainer reads the checkpoint and starts replicating data from the corresponding commit-ts. - -## Safe mode - -Safe mode refers to the mode that supports the idempotent import of DML when a primary key or unique index exists in the table schema in the incremental replication task. - -In this mode, the `INSERT` statement is re-written as `REPLACE`, and the `UPDATE` statement is re-written as `DELETE` and `REPLACE`. Then the re-written statement is executed to the downstream. Safe mode is automatically enabled within 5 minutes after Drainer is started. You can manually enable the mode by modifying the `safe-mode` parameter in the configuration file, but this configuration is valid only when the downstream is MySQL or TiDB. diff --git a/tidb-binlog/tidb-binlog-overview.md b/tidb-binlog/tidb-binlog-overview.md deleted file mode 100644 index f6909668c4ad6..0000000000000 --- a/tidb-binlog/tidb-binlog-overview.md +++ /dev/null @@ -1,76 +0,0 @@ ---- -title: TiDB Binlog Overview -summary: Learn overview of the cluster version of TiDB Binlog. -aliases: ['/docs/dev/tidb-binlog/tidb-binlog-overview/','/docs/dev/reference/tidb-binlog/overview/','/docs/dev/reference/tidb-binlog-overview/','/docs/dev/reference/tools/tidb-binlog/overview/'] ---- - -# TiDB Binlog Cluster Overview - -TiDB Binlog is a tool used to collect binlog data from TiDB and provide near real-time backup and replication to downstream platforms. This document introduces the architecture and the deployment of the cluster version of TiDB Binlog. - -> **Warning:** -> -> - Starting from v7.5.0, TiDB Binlog replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). -> - TiDB Binlog is not compatible with some features introduced in TiDB v5.0 and they cannot be used together. For details, see [Notes](#notes). - -TiDB Binlog has the following features: - -* **Data replication:** replicate the data in the TiDB cluster to other databases -* **Real-time backup and restoration:** back up the data in the TiDB cluster and restore the TiDB cluster when the cluster fails - -## TiDB Binlog architecture - -The TiDB Binlog architecture is as follows: - -![TiDB Binlog architecture](/media/tidb-binlog-cluster-architecture.png) - -The TiDB Binlog cluster is composed of Pump and Drainer. - -### Pump - -[Pump](https://github.com/pingcap/tidb-binlog/blob/master/pump) is used to record the binlogs generated in TiDB, sort the binlogs based on the commit time of the transaction, and send binlogs to Drainer for consumption. - -### Drainer - -[Drainer](https://github.com/pingcap/tidb-binlog/tree/master/drainer) collects and merges binlogs from each Pump, converts the binlog to SQL or data of a specific format, and replicates the data to a specific downstream platform. 
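As a quick way to see which Pump and Drainer nodes a cluster is currently running, you can list the registered nodes with `binlogctl`. The following is a minimal sketch; it assumes that the `binlogctl` binary sits in the current directory and that PD listens on `127.0.0.1:2379`, so adjust both for your environment:

```bash
# List the Pump nodes registered in PD, including their state (for example, online, paused, or offline).
./binlogctl -pd-urls=http://127.0.0.1:2379 -cmd pumps

# List the Drainer nodes registered in PD and their state.
./binlogctl -pd-urls=http://127.0.0.1:2379 -cmd drainers
```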
- -### `binlogctl` guide - -[`binlogctl`](https://github.com/pingcap/tidb-binlog/tree/master/binlogctl) is an operations tool for TiDB Binlog with the following features: - -* Obtaining the current `tso` of TiDB cluster -* Checking the Pump/Drainer state -* Modifying the Pump/Drainer state -* Pausing or closing Pump/Drainer - -## Main features - -* Multiple Pumps form a cluster which can scale out horizontally -* TiDB uses the built-in Pump Client to send the binlog to each Pump -* Pump stores binlogs and sends the binlogs to Drainer in order -* Drainer reads binlogs of each Pump, merges and sorts the binlogs, and sends the binlogs downstream -* Drainer supports [relay log](/tidb-binlog/tidb-binlog-relay-log.md). By the relay log, Drainer ensures that the downstream clusters are in a consistent state. - -## Notes - -* In v5.1, the incompatibility between the clustered index feature introduced in v5.0 and TiDB Binlog has been resolved. After you upgrade TiDB Binlog and TiDB Server to v5.1 and enable TiDB Binlog, TiDB will support creating tables with clustered indexes; data insertion, deletion, and update on the created tables with clustered indexes will be replicated to the downstream via TiDB Binlog. When you use TiDB Binlog to replicate the tables with clustered indexes, pay attention to the following: - - - If you have upgraded the cluster to v5.1 from v5.0 by manually controlling the upgrade sequence, make sure that TiDB binlog is upgraded to v5.1 before upgrading the TiDB server to v5.1. - - It is recommended to configure the system variable [`tidb_enable_clustered_index`](/system-variables.md#tidb_enable_clustered_index-new-in-v50) to a same value to ensure that the structure of TiDB clustered index tables between the upstream and downstream is consistent. - -* TiDB Binlog is incompatible with the following features introduced in TiDB v5.0 and they cannot be used together. - - - [TiDB Clustered Index](/clustered-indexes.md#limitations): After TiDB Binlog is enabled, TiDB does not allow creating clustered indexes with non-single integer columns as primary keys; data insertion, deletion, and update of the created clustered index tables will not be replicated downstream via TiDB Binlog. If you need to replicate tables with clustered indexes, upgrade your cluster to v5.1 or use [TiCDC](/ticdc/ticdc-overview.md) instead. - - TiDB system variable [tidb_enable_async_commit](/system-variables.md#tidb_enable_async_commit-new-in-v50): After TiDB Binlog is enabled, performance cannot be improved by enabling this option. It is recommended to use [TiCDC](/ticdc/ticdc-overview.md) instead of TiDB Binlog. - - TiDB system variable [tidb_enable_1pc](/system-variables.md#tidb_enable_1pc-new-in-v50): After TiDB Binlog is enabled, performance cannot be improved by enabling this option. It is recommended to use [TiCDC](/ticdc/ticdc-overview.md) instead of TiDB Binlog. - -* Drainer supports replicating binlogs to MySQL, TiDB, Kafka or local files. If you need to replicate binlogs to other Drainer unsupported destinations, you can set Drainer to replicate the binlog to Kafka and read the data in Kafka for customized processing according to binlog consumer protocol. See [Binlog Consumer Client User Guide](/tidb-binlog/binlog-consumer-client.md). - -* To use TiDB Binlog for recovering incremental data, set the config `db-type` to `file` (local files in the proto buffer format). 
Drainer converts the binlog to data in the specified [proto buffer format](https://github.com/pingcap/tidb-binlog/blob/master/proto/pb_binlog.proto) and writes the data to local files. In this way, you can use [Reparo](/tidb-binlog/tidb-binlog-reparo.md) to recover data incrementally. - - Pay attention to the value of `db-type`: - - - If your TiDB version is earlier than 2.1.9, set `db-type="pb"`. - - If your TiDB version is 2.1.9 or later, set `db-type="file"` or `db-type="pb"`. - -* If the downstream is MySQL, MariaDB, or another TiDB cluster, you can use [sync-diff-inspector](/sync-diff-inspector/sync-diff-inspector-overview.md) to verify the data after data replication. diff --git a/tidb-binlog/tidb-binlog-relay-log.md b/tidb-binlog/tidb-binlog-relay-log.md deleted file mode 100644 index 68229df5b7463..0000000000000 --- a/tidb-binlog/tidb-binlog-relay-log.md +++ /dev/null @@ -1,67 +0,0 @@ ---- -title: TiDB Binlog Relay Log -summary: Learn how to use relay log to maintain data consistency in extreme cases. -aliases: ['/docs/dev/tidb-binlog/tidb-binlog-relay-log/','/docs/dev/reference/tidb-binlog/relay-log/'] ---- - -# TiDB Binlog Relay Log - -When replicating binlogs, Drainer splits transactions from the upstream and replicates the split transactions concurrently to the downstream. - -In extreme cases where the upstream clusters are not available and Drainer exits abnormally, the downstream clusters (MySQL or TiDB) might be in the intermediate states with inconsistent data. In such cases, Drainer can use the relay log to ensure that the downstream clusters are in a consistent state. - -## Consistent state during Drainer replication - -The downstream clusters reaching a consistent state means the data of the downstream clusters are the same as the snapshot of the upstream which sets `tidb_snapshot = ts`. - -The checkpoint consistency means Drainer checkpoint saves the consistent state of replication in `consistent`. When Drainer runs, `consistent` is `false`. After Drainer exits normally, `consistent` is set to `true`. - -You can query the downstream checkpoint table as follows: - -{{< copyable "sql" >}} - -```sql -select * from tidb_binlog.checkpoint; -``` - -``` -+---------------------+----------------------------------------------------------------+ -| clusterID | checkPoint | -+---------------------+----------------------------------------------------------------+ -| 6791641053252586769 | {"consistent":false,"commitTS":414529105591271429,"ts-map":{}} | -+---------------------+----------------------------------------------------------------+ -``` - -## Implementation principles - -After Drainer enables the relay log, it first writes the binlog events to the disks and then replicates the events to the downstream clusters. - -If the upstream clusters are not available, Drainer can restore the downstream clusters to a consistent state by reading the relay log. - -> **Note:** -> -> If the relay log data is lost at the same time, this method does not work, but its incidence is very low. In addition, you can use the Network File System to ensure data safety of the relay log. - -### Trigger scenarios where Drainer consumes binlogs from the relay log - -When Drainer is started, if it fails to connect to the Placement Driver (PD) of the upstream clusters, and it detects that `consistent = false` in the checkpoint, Drainer will try to read the relay log, and restore the downstream clusters to a consistent state. 
After that, the Drainer process sets the checkpoint `consistent` to `true` and then exits. - -### GC mechanism of relay log - -Before data is replicated to the downstream, Drainer writes data to the relay log file. If the size of a relay log file reaches 10 MB (by default) and the binlog data of the current transaction is completely written, Drainer starts to write data to the next relay log file. After Drainer successfully replicates data to the downstream, it automatically cleans up the relay log files whose data has been replicated. The relay log into which data is currently being written will not be cleaned up. - -## Configuration - -To enable the relay log, add the following configuration in Drainer: - -{{< copyable "" >}} - -``` -[syncer.relay] -# It saves the directory of the relay log. The relay log is not enabled if the value is empty. -# The configuration only comes to effect if the downstream is TiDB or MySQL. -log-dir = "/dir/to/save/log" -# The size limit of a single relay log file (unit: byte). -# When the size of a relay log file reaches this limit, data is written to the next relay log file. -max-file-size = 10485760 -``` diff --git a/tidb-binlog/tidb-binlog-reparo.md b/tidb-binlog/tidb-binlog-reparo.md deleted file mode 100644 index c4f80a5015f5b..0000000000000 --- a/tidb-binlog/tidb-binlog-reparo.md +++ /dev/null @@ -1,134 +0,0 @@ ---- -title: Reparo User Guide -summary: Learn to use Reparo. -aliases: ['/docs/dev/tidb-binlog/tidb-binlog-reparo/','/docs/dev/reference/tidb-binlog/reparo/'] ---- - -# Reparo User Guide - -Reparo is a TiDB Binlog tool, used to recover the incremental data. To back up the incremental data, you can use Drainer of TiDB Binlog to output the binlog data in the protobuf format to files. To restore the incremental data, you can use Reparo to parse the binlog data in the files and apply the binlog in TiDB/MySQL. - -The Reparo installation package (`reparo`) is included in the TiDB Toolkit. To download the TiDB Toolkit, see [Download TiDB Tools](/download-ecosystem-tools.md). - -## Reparo usage - -### Description of command line parameters - -``` -Usage of Reparo: --L string - The level of the output information of logs - Value: "debug"/"info"/"warn"/"error"/"fatal" ("info" by default) --V Prints the version. --c int - The number of concurrencies in the downstream for the replication process (`16` by default). A higher value indicates a better throughput for the replication. --config string - The path of the configuration file - If the configuration file is specified, Reparo reads the configuration data in this file. - If the configuration data also exists in the command line parameters, Reparo uses the configuration data in the command line parameters to cover that in the configuration file. --data-dir string - The storage directory for the binlog file in the protobuf format that Drainer outputs ("data.drainer" by default) --dest-type string - The downstream service type - Value: "print"/"mysql" ("print" by default) - If it is set to "print", the data is parsed and printed to standard output while the SQL statement is not executed. - If it is set to "mysql", you need to configure the "host", "port", "user" and "password" information in the configuration file. --log-file string - The path of the log file --log-rotate string - The switch frequency of log files - Value: "hour"/"day" --start-datetime string - Specifies the time point for starting recovery. 
- Format: "2006-01-02 15:04:05" - If it is not set, the recovery process starts from the earliest binlog file. --stop-datetime string - Specifies the time point of finishing the recovery process. - Format: "2006-01-02 15:04:05" - If it is not set, the recovery process ends up with the last binlog file. --safe-mode bool - Specifies whether to enable safe mode. When enabled, it supports repeated replication. --txn-batch int - The number of SQL statements in a transaction that is output to the downstream database (`20` by default). -``` - -### Description of the configuration file - -```toml -# The storage directory for the binlog file in the protobuf format that Drainer outputs -data-dir = "./data.drainer" - -# The level of the output information of logs -# Value: "debug"/"info"/"warn"/"error"/"fatal" ("info" by default) -log-level = "info" - -# Uses `start-datetime` and `stop-datetime` to specify the time range in which -# the binlog files are to be recovered. -# Format: "2006-01-02 15:04:05" -# start-datetime = "" -# stop-datetime = "" - -# Correspond to `start-datetime` and `stop-datetime` respectively. -# They are used to specify the time range in which the binlog files are to be recovered. -# If `start-datetime` and `stop-datetime` are set, there is no need to set `start-tso` and `stop-tso`. -# When you perform a full recovery or resume an incremental recovery, set start-tso to tso + 1 or stop-tso + 1, respectively. -# start-tso = 0 -# stop-tso = 0 - -# The downstream service type -# Value: "print"/"mysql" ("print" by default) -# If it is set to "print", the data is parsed and printed to standard output -# while the SQL statement is not executed. -# If it is set to "mysql", you need to configure `host`, `port`, `user` and `password` in [dest-db]. -dest-type = "mysql" - -# The number of SQL statements in a transaction that is output to the downstream database (`20` by default). -txn-batch = 20 - -# The number of concurrencies in the downstream for the replication process (`16` by default). A higher value indicates a better throughput for the replication. -worker-count = 16 - -# Safe-mode configuration -# Value: "true"/"false" ("false" by default) -# If it is set to "true", Reparo splits the `UPDATE` statement into a `DELETE` statement plus a `REPLACE` statement. -safe-mode = false - -# `replicate-do-db` and `replicate-do-table` specify the database and table to be recovered. -# `replicate-do-db` has priority over `replicate-do-table`. -# You can use a regular expression for configuration. The regular expression should start with "~". -# The configuration method for `replicate-do-db` and `replicate-do-table` is -# the same with that for `replicate-do-db` and `replicate-do-table` of Drainer. -# replicate-do-db = ["~^b.*","s1"] -# [[replicate-do-table]] -# db-name ="test" -# tbl-name = "log" -# [[replicate-do-table]] -# db-name ="test" -# tbl-name = "~^a.*" - -# If `dest-type` is set to `mysql`, `dest-db` needs to be configured. -[dest-db] -host = "127.0.0.1" -port = 3309 -user = "root" -password = "" -``` - -### Start example - -``` -./reparo -config reparo.toml -``` - -> **Note:** -> -> * `data-dir` specifies the directory for the binlog file that Drainer outputs. -> * Both `start-datatime` and `start-tso` are used to specify the time point for starting recovery, but they are different in the time format. If they are not set, the recovery process starts from the earliest binlog file by default. 
-> * Both `stop-datetime` and `stop-tso` are used to specify the time point for finishing recovery, but they are different in the time format. If they are not set, the recovery process ends up with the last binlog file by default. -> * `dest-type` specifies the destination type. Its value can be "mysql" and "print." -> -> * When it is set to `mysql`, the data can be recovered to MySQL or TiDB that uses or is compatible with the MySQL protocol. In this case, you need to specify the database information in `[dest-db]` of the configuration information. -> * When it is set to `print`, only the binlog information is printed. It is generally used for debugging and checking the binlog information. In this case, there is no need to specify `[dest-db]`. -> -> * `replicate-do-db` specifies the database for recovery. If it is not set, all the databases are to be recovered. -> * `replicate-do-table` specifies the table for recovery. If it is not set, all the tables are to be recovered. diff --git a/tidb-binlog/troubleshoot-tidb-binlog.md b/tidb-binlog/troubleshoot-tidb-binlog.md deleted file mode 100644 index decd1c675c2c3..0000000000000 --- a/tidb-binlog/troubleshoot-tidb-binlog.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -title: TiDB Binlog Troubleshooting -summary: Learn the troubleshooting process of TiDB Binlog. -aliases: ['/docs/dev/tidb-binlog/troubleshoot-tidb-binlog/','/docs/dev/reference/tidb-binlog/troubleshoot/binlog/','/docs/dev/how-to/troubleshoot/tidb-binlog/'] ---- - -# TiDB Binlog Troubleshooting - -This document describes how to troubleshoot TiDB Binlog to find the problem. - -If you encounter errors while running TiDB Binlog, take the following steps to troubleshoot: - -1. Check whether each monitoring metric is normal or not. Refer to [TiDB Binlog Monitoring](/tidb-binlog/monitor-tidb-binlog-cluster.md) for details. - -2. Use the [binlogctl tool](/tidb-binlog/binlog-control.md) to check whether the state of each Pump or Drainer node is normal or not. - -3. Check whether `ERROR` or `WARN` exists in the Pump log or Drainer log. - -After finding out the problem by the above steps, refer to [FAQ](/tidb-binlog/tidb-binlog-faq.md) and [TiDB Binlog Error Handling](/tidb-binlog/handle-tidb-binlog-errors.md) for the solution. If you fail to find the solution or the solution provided does not help, submit an [issue](https://github.com/pingcap/tidb-binlog/issues) for help. diff --git a/tidb-binlog/upgrade-tidb-binlog.md b/tidb-binlog/upgrade-tidb-binlog.md deleted file mode 100644 index c86b385aab11e..0000000000000 --- a/tidb-binlog/upgrade-tidb-binlog.md +++ /dev/null @@ -1,75 +0,0 @@ ---- -title: Upgrade TiDB Binlog -summary: Learn how to upgrade TiDB Binlog to the latest cluster version. -aliases: ['/docs/dev/tidb-binlog/upgrade-tidb-binlog/','/docs/dev/reference/tidb-binlog/upgrade/','/docs/dev/how-to/upgrade/tidb-binlog/'] ---- - -# Upgrade TiDB Binlog - -This document introduces how to upgrade TiDB Binlog that is deployed manually to the latest [cluster](/tidb-binlog/tidb-binlog-overview.md) version. There is also a section on how to upgrade TiDB Binlog from an earlier incompatible version (Kafka/Local version) to the latest version. - -> **Warning:** -> -> - Starting from v7.5.0, [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. 
For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). -> - TiDB Binlog is not compatible with some features introduced in TiDB v5.0 and they cannot be used together. For details, see [Notes](/tidb-binlog/tidb-binlog-overview.md#notes). - -## Upgrade TiDB Binlog deployed manually - -Follow the steps in this section if you deploy TiDB Binlog manually. - -### Upgrade Pump - -First, upgrade each Pump instance in the cluster one by one. This ensures that there are always Pump instances in the cluster that can receive binlogs from TiDB. The steps are as below: - -1. Replace the original file with the new version of `pump`. -2. Restart the Pump process. - -### Upgrade Drainer - -Second, upgrade the Drainer component: - -1. Replace the original file with the new version of `drainer`. -2. Restart the Drainer process. - -## Upgrade TiDB Binlog from Kafka/Local version to the cluster version - -The new TiDB versions (v2.0.8-binlog, v2.1.0-rc.5 or later) are not compatible with the Kafka version or Local version of TiDB Binlog. If TiDB is upgraded to one of the new versions, it is required to use the cluster version of TiDB Binlog. If the Kafka or local version of TiDB Binlog is used before upgrading, you need to upgrade your TiDB Binlog to the cluster version. - -The corresponding relationship between TiDB Binlog versions and TiDB versions is shown in the following table: - -| TiDB Binlog version | TiDB version | Note | -|---------------------|-------------------------------------------|--------------------------------------------------------------------------------------------| -| Local | TiDB 1.0 or earlier | | -| Kafka | TiDB 1.0 ~ TiDB 2.1 RC5 | TiDB 1.0 supports both the local and Kafka versions of TiDB Binlog. | -| Cluster | TiDB v2.0.8-binlog, TiDB 2.1 RC5 or later | TiDB v2.0.8-binlog is a special 2.0 version supporting the cluster version of TiDB Binlog. | - -### Upgrade process - -> **Note:** -> -> If importing the full data is acceptable, you can abandon the old version and deploy TiDB Binlog following [TiDB Binlog Cluster Deployment](/tidb-binlog/deploy-tidb-binlog.md). - -If you want to resume replication from the original checkpoint, perform the following steps to upgrade TiDB Binlog: - -1. Deploy the new version of Pump. -2. Stop the TiDB cluster service. -3. Upgrade TiDB and the configuration, and write the binlog data to the new Pump cluster. -4. Reconnect the TiDB cluster to the service. -5. Make sure that the old version of Drainer has replicated the data in the old version of Pump to the downstream completely; - - Query the `status` interface of Drainer, command as below: - - {{< copyable "shell-regular" >}} - - ```bash - curl 'http://172.16.10.49:8249/status' - ``` - - ``` - {"PumpPos":{"172.16.10.49:8250":{"offset":32686}},"Synced": true ,"DepositWindow":{"Upper":398907800202772481,"Lower":398907799455662081}} - ``` - - If the return value of `Synced` is True, it means Drainer has replicated the data in the old version of Pump to the downstream completely. - -6. Start the new version of Drainer. -7. Close the Pump and Drainer of the old versions and the dependent Kafka and ZooKeeper. diff --git a/tidb-configuration-file.md b/tidb-configuration-file.md index c1c47cd3874fa..a8d1f99caa43d 100644 --- a/tidb-configuration-file.md +++ b/tidb-configuration-file.md @@ -519,7 +519,7 @@ Configuration items related to performance. - The size limit of a single transaction in TiDB. 
- Default value: `104857600` (in bytes) -- In a single transaction, the total size of key-value records cannot exceed this value. The maximum value of this parameter is `1099511627776` (1 TB). Note that if you have used the binlog to serve the downstream consumer Kafka (such as the `arbiter` cluster), the value of this parameter must be no more than `1073741824` (1 GB). This is because 1 GB is the upper limit of a single message size that Kafka can process. Otherwise, an error is returned if this limit is exceeded. +- In a single transaction, the total size of key-value records cannot exceed this value. The maximum value of this parameter is `1099511627776` (1 TB). - In TiDB v6.5.0 and later versions, this configuration is no longer recommended. The memory size of a transaction will be accumulated into the memory usage of the session, and the [`tidb_mem_quota_query`](/system-variables.md#tidb_mem_quota_query) variable will take effect when the session memory threshold is exceeded. To be compatible with previous versions, this configuration works as follows when you upgrade from an earlier version to TiDB v6.5.0 or later: - If this configuration is not set or is set to the default value (`104857600`), after an upgrade, the memory size of a transaction will be accumulated into the memory usage of the session, and the `tidb_mem_quota_query` variable will take effect. - If this configuration is not defaulted (`104857600`), it still takes effect and its behavior on controlling the size of a single transaction remains unchanged before and after the upgrade. This means that the memory size of the transaction is not controlled by the `tidb_mem_quota_query` variable. @@ -806,37 +806,6 @@ Configuration items related to the transaction latch. These configuration items - The number of slots corresponding to Hash, which automatically adjusts upward to an exponential multiple of 2. Each slot occupies 32 Bytes of memory. If set too small, it might result in slower running speed and poor performance in the scenario where data writing covers a relatively large range (such as importing data). - Default value: `2048000` -## binlog - -Configurations related to TiDB Binlog. - -### `enable` - -- Enables or disables binlog. -- Default value: `false` - -### `write-timeout` - -- The timeout of writing binlog into Pump. It is not recommended to modify this value. -- Default: `15s` -- unit: second - -### `ignore-error` - -- Determines whether to ignore errors occurred in the process of writing binlog into Pump. It is not recommended to modify this value. -- Default value: `false` -- When the value is set to `true` and an error occurs, TiDB stops writing binlog and add `1` to the count of the `tidb_server_critical_error_total` monitoring item. When the value is set to `false`, the binlog writing fails and the entire TiDB service is stopped. - -### `binlog-socket` - -- The network address to which binlog is exported. -- Default value: "" - -### `strategy` - -- The strategy of Pump selection when binlog is exported. Currently, only the `hash` and `range` methods are supported. -- Default value: `range` - ## status Configuration related to the status of TiDB service. diff --git a/tidb-monitoring-framework.md b/tidb-monitoring-framework.md index d178b1bbb5599..ea91beabaeaa0 100644 --- a/tidb-monitoring-framework.md +++ b/tidb-monitoring-framework.md @@ -29,7 +29,6 @@ Grafana is an open source project for analyzing and visualizing metrics. 
TiDB us ![Grafana monitored_groups](/media/grafana-monitored-groups.png) - {TiDB_Cluster_name}-Backup-Restore: Monitoring metrics related to backup and restore. -- {TiDB_Cluster_name}-Binlog: Monitoring metrics related to TiDB Binlog. - {TiDB_Cluster_name}-Blackbox_exporter: Monitoring metrics related to network probe. - {TiDB_Cluster_name}-Disk-Performance: Monitoring metrics related to disk performance. - {TiDB_Cluster_name}-Kafka-Overview: Monitoring metrics related to Kafka. diff --git a/tidb-troubleshooting-map.md b/tidb-troubleshooting-map.md index bf5d459c29abb..85ff150f0dcda 100644 --- a/tidb-troubleshooting-map.md +++ b/tidb-troubleshooting-map.md @@ -398,101 +398,31 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV** ## 6. Ecosystem tools -### 6.1 TiDB Binlog +### 6.1 Data Migration -- 6.1.1 [TiDB Binlog](/tidb-binlog/tidb-binlog-overview.md) (deprecated) is a tool that collects changes from TiDB and provides backup and replication to downstream TiDB or MySQL platforms. For details, see [TiDB Binlog on GitHub](https://github.com/pingcap/tidb-binlog). +- 6.1.1 TiDB Data Migration (DM) is a migration tool that supports data migration from MySQL/MariaDB into TiDB. For details, see [DM overview](/dm/dm-overview.md). -- 6.1.2 The `Update Time` in Pump/Drainer Status is updated normally, and no anomaly shows in the log, but no data is written to the downstream. - - - Binlog is not enabled in the TiDB configuration. Modify the `[binlog]` configuration in TiDB. - -- 6.1.3 `sarama` in Drainer reports the `EOF` error. - - - The Kafka client version in Drainer is inconsistent with the version of Kafka. You need to modify the `[syncer.to] kafka-version` configuration. - -- 6.1.4 Drainer fails to write to Kafka and panics, and Kafka reports the `Message was too large` error. - - - The binlog data is too large, so the single message written to Kafka is too large. You need to modify the following configuration of Kafka: - - ```properties - message.max.bytes=1073741824 - replica.fetch.max.bytes=1073741824 - fetch.message.max.bytes=1073741824 - ``` - - For details, see [case-789](https://github.com/pingcap/tidb-map/blob/master/maps/diagnose-case-study/case789.md) in Chinese. - -- 6.1.5 Inconsistent data in upstream and downstream - - - Some TiDB nodes do not enable binlog. For v3.0.6 or later versions, you can check the binlog status of all the nodes by accessing the interface. For versions earlier than v3.0.6, you can check the binlog status by viewing the configuration file. - - - Some TiDB nodes go into the `ignore binlog` status. For v3.0.6 or later versions, you can check the binlog status of all the nodes by accessing the interface. For versions earlier than v3.0.6, check the TiDB log to see whether it contains the `ignore binlog` keyword. - - - The value of the timestamp column is inconsistent in upstream and downstream. - - - This is caused by different time zones. You need to ensure that Drainer is in the same time zone as the upstream and downstream databases. Drainer obtains its time zone from `/etc/localtime` and does not support the `TZ` environment variable. See [case-826](https://github.com/pingcap/tidb-map/blob/master/maps/diagnose-case-study/case826.md) in Chinese. - - - In TiDB, the default value of timestamp is `null`, but the same default value in MySQL 5.7 (not including MySQL 8) is the current time. Therefore, when the timestamp in upstream TiDB is `null` and the downstream is MySQL 5.7, the data in the timestamp column is inconsistent. 
You need to run `set @@global.explicit_defaults_for_timestamp=on;` in the upstream before enabling binlog. - - - For other situations, [report a bug](https://github.com/pingcap/tidb-binlog/issues/new?labels=bug&template=bug-report.md). - -- 6.1.6 Slow replication - - - The downstream is TiDB/MySQL, and the upstream performs frequent DDL operations. See [case-1023](https://github.com/pingcap/tidb-map/blob/master/maps/diagnose-case-study/case1023.md) in Chinese. - - - The downstream is TiDB/MySQL, and the table to be replicated has no primary key and no unique index, which causes reduced performance in binlog. It is recommended to add the primary key or unique index. - - - If the downstream outputs to files, check whether the output disk or network disk is slow. - - - For other situations, [report a bug](https://github.com/pingcap/tidb-binlog/issues/new?labels=bug&template=bug-report.md). - -- 6.1.7 Pump cannot write binlog and reports the `no space left on device` error. - - - The local disk space is insufficient for Pump to write binlog data normally. You need to clean up the disk space and then restart Pump. - -- 6.1.8 Pump reports the `fail to notify all living drainer` error when it is started. - - - Cause: When Pump is started, it notifies all Drainer nodes that are in the `online` state. If it fails to notify Drainer, this error log is printed. - - - Solution: Use the binlogctl tool to check whether each Drainer node is normal or not. This is to ensure that all Drainer nodes in the `online` state are working normally. If the state of a Drainer node is not consistent with its actual working status, use the binlogctl tool to change its state and then restart Pump. See the case [fail-to-notify-all-living-drainer](/tidb-binlog/handle-tidb-binlog-errors.md#fail-to-notify-all-living-drainer-is-returned-when-pump-is-started). - -- 6.1.9 Drainer reports the `gen update sqls failed: table xxx: row data is corruption []` error. - - - Trigger: The upstream performs DML operations on this table while performing `DROP COLUMN` DDL. This issue has been fixed in v3.0.6. See [case-820](https://github.com/pingcap/tidb-map/blob/master/maps/diagnose-case-study/case820.md) in Chinese. - -- 6.1.10 Drainer replication is hung. The process remains active but the checkpoint is not updated. - - - This issues has been fixed in v3.0.4. See [case-741](https://github.com/pingcap/tidb-map/blob/master/maps/diagnose-case-study/case741.md) in Chinese. - -- 6.1.11 Any component panics. - - - [Report a bug](https://github.com/pingcap/tidb-binlog/issues/new?labels=bug&template=bug-report.md). - -### 6.2 Data Migration - -- 6.2.1 TiDB Data Migration (DM) is a migration tool that supports data migration from MySQL/MariaDB into TiDB. For details, see [DM overview](/dm/dm-overview.md). - -- 6.2.2 `Access denied for user 'root'@'172.31.43.27' (using password: YES)` shows when you run `query status` or check the log. +- 6.1.2 `Access denied for user 'root'@'172.31.43.27' (using password: YES)` shows when you run `query status` or check the log. - The database related passwords in all the DM configuration files should be encrypted by `dmctl`. If a database password is empty, it is unnecessary to encrypt the password. Cleartext passwords can be used since v1.0.6. - During DM operation, the user of the upstream and downstream databases must have the corresponding read and write privileges. Data Migration also [prechecks the corresponding privileges](/dm/dm-precheck.md) automatically while starting the data replication task. 
- To deploy different versions of DM-worker/DM-master/dmctl in a DM cluster, see the [case study on AskTUG](https://asktug.com/t/dm1-0-0-ga-access-denied-for-user/1049/5) in Chinese. -- 6.2.3 A replication task is interrupted with the `driver: bad connection` error returned. +- 6.1.3 A replication task is interrupted with the `driver: bad connection` error returned. - The `driver: bad connection` error indicates that an anomaly has occurred in the connection between DM and the downstream TiDB database (such as network failure and TiDB restart), and that the data of the current request has not yet been sent to TiDB. - For versions earlier than DM 1.0.0 GA, stop the task by running `stop-task` and then restart the task by running `start-task`. - For DM 1.0.0 GA or later versions, an automatic retry mechanism for this type of error is added. See [#265](https://github.com/pingcap/dm/pull/265). -- 6.2.4 A replication task is interrupted with the `invalid connection` error. +- 6.1.4 A replication task is interrupted with the `invalid connection` error. - The `invalid connection` error indicates that an anomaly has occurred in the connection between DM and the downstream TiDB database (such as network failure, TiDB restart, and TiKV busy), and that a part of the data for the current request has been sent to TiDB. Because DM has the feature of concurrently replicating data to the downstream in replication tasks, several errors might occur when a task is interrupted. You can check these errors by running `query-status` or `query-error`. - If only the `invalid connection` error occurs during the incremental replication process, DM retries the task automatically. - If DM does not retry or fails to retry automatically because of version problems (automatic retry is introduced in v1.0.0-rc.1), use `stop-task` to stop the task and then use `start-task` to restart the task. -- 6.2.5 The relay unit reports the error `event from * in * diff from passed-in event *`, or a replication task is interrupted with an error that fails to get or parse binlog, such as `get binlog error ERROR 1236 (HY000) and binlog checksum mismatch, data may be corrupted returned` +- 6.1.5 The relay unit reports the error `event from * in * diff from passed-in event *`, or a replication task is interrupted with an error that fails to get or parse binlog, such as `get binlog error ERROR 1236 (HY000) and binlog checksum mismatch, data may be corrupted returned` - When DM pulls the relay log or performs incremental replication, these two errors might occur if the size of the upstream binlog file exceeds 4 GB.
-- 6.2.6 The DM replication is interrupted, and the log returns `ERROR 1236 (HY000) The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.` +- 6.1.6 The DM replication is interrupted, and the log returns `ERROR 1236 (HY000) The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.` - Check whether the master binlog is purged. - Check the position information recorded in `relay.meta`. @@ -512,15 +442,15 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV** - The binlog event recorded in `relay.meta` triggers the incomplete recover process and records the wrong GTID information. This issue is fixed in v1.0.2, and might occur in earlier versions. -- 6.2.7 The DM replication process returns an error `Error 1366: incorrect utf8 value eda0bdedb29d(\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd)`. +- 6.1.7 The DM replication process returns an error `Error 1366: incorrect utf8 value eda0bdedb29d(\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd)`. - This value cannot be successfully written into MySQL 8.0 or TiDB, but can be written into MySQL 5.7. You can skip the data format check by enabling the `tidb_skip_utf8_check` parameter. -### 6.3 TiDB Lightning +### 6.2 TiDB Lightning -- 6.3.1 TiDB Lightning is a tool for fast full import of large amounts of data into a TiDB cluster. See [TiDB Lightning on GitHub](https://github.com/pingcap/tidb/tree/master/lightning). +- 6.2.1 TiDB Lightning is a tool for fast full import of large amounts of data into a TiDB cluster. See [TiDB Lightning on GitHub](https://github.com/pingcap/tidb/tree/master/lightning). -- 6.3.2 Import speed is too slow. +- 6.2.2 Import speed is too slow. - `region-concurrency` is set too high, which causes thread contention and reduces performance. Three ways to troubleshoot: @@ -531,7 +461,7 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV** - Every additional index introduces a new KV pair for each row. If there are N indices, the actual size to be imported would be approximately (N+1) times the size of the [Dumpling](/dumpling-overview.md) output. If the indices are negligible, you may first remove them from the schema, and add them back via `CREATE INDEX` after the import is complete. - The version of TiDB Lightning is old. Try the latest version, which might improve the import speed. -- 6.3.3 `checksum failed: checksum mismatched remote vs local`. +- 6.2.3 `checksum failed: checksum mismatched remote vs local`. - Cause 1: The table might already have data. These old data can affect the final checksum. @@ -544,19 +474,19 @@ Check the specific cause for busy by viewing the monitor **Grafana** -> **TiKV** - Solution: See [Troubleshooting Solution](/tidb-lightning/troubleshoot-tidb-lightning.md#checksum-failed-checksum-mismatched-remote-vs-local). -- 6.3.4 `Checkpoint for … has invalid status:(error code)` +- 6.2.4 `Checkpoint for … has invalid status:(error code)` - Cause: Checkpoint is enabled, and Lightning/Importer has previously abnormally exited. To prevent accidental data corruption, TiDB Lightning will not start until the error is addressed. The error code is an integer less than 25, with possible values as `0, 3, 6, 9, 12, 14, 15, 17, 18, 20 and 21`. The integer indicates the step where the unexpected exit occurs in the import process. The larger the integer is, the later the exit occurs. 
- Solution: See [Troubleshooting Solution](/tidb-lightning/troubleshoot-tidb-lightning.md#checkpoint-for--has-invalid-status-error-code). -- 6.3.5 `cannot guess encoding for input file, please convert to UTF-8 manually` +- 6.2.5 `cannot guess encoding for input file, please convert to UTF-8 manually` - Cause: TiDB Lightning only supports the UTF-8 and GB-18030 encodings. This error means the file is not in any of these encodings. It is also possible that the file has mixed encoding, such as containing a string in UTF-8 and another string in GB-18030, due to historical ALTER TABLE executions. - Solution: See [Troubleshooting Solution](/tidb-lightning/troubleshoot-tidb-lightning.md#cannot-guess-encoding-for-input-file-please-convert-to-utf-8-manually). -- 6.3.6 `[sql2kv] sql encode error = [types:1292]invalid time format: '{1970 1 1 0 45 0 0}'` +- 6.2.6 `[sql2kv] sql encode error = [types:1292]invalid time format: '{1970 1 1 0 45 0 0}'` - Cause: A timestamp type entry has a time value that does not exist. This is either because of DST changes or because the time value has exceeded the supported range (from Jan 1, 1970 to Jan 19, 2038). diff --git a/tiup/tiup-cluster-topology-reference.md b/tiup/tiup-cluster-topology-reference.md index 7933b2d40a3b6..c62bb83df8f1b 100644 --- a/tiup/tiup-cluster-topology-reference.md +++ b/tiup/tiup-cluster-topology-reference.md @@ -24,8 +24,6 @@ A topology configuration file for TiDB deployment using TiUP might contain the f - [tikv_servers](#tikv_servers): The configuration of the TiKV instance. This configuration specifies the machines to which the TiKV component is deployed. - [tiflash_servers](#tiflash_servers): The configuration of the TiFlash instance. This configuration specifies the machines to which the TiFlash component is deployed. - [tiproxy_servers](#tiproxy_servers): The configuration of the TiProxy instance. This configuration specifies the machines to which the TiProxy component is deployed. -- [pump_servers](#pump_servers): The configuration of the Pump instance. This configuration specifies the machines to which the Pump component is deployed. -- [drainer_servers](#drainer_servers): The configuration of the Drainer instance. This configuration specifies the machines to which the Drainer component is deployed. - [cdc_servers](#cdc_servers): The configuration of the TiCDC instance. This configuration specifies the machines to which the TiCDC component is deployed. - [tispark_masters](#tispark_masters): The configuration of the TiSpark master instance. This configuration specifies the machines to which the TiSpark master component is deployed. Only one node of TiSpark master can be deployed. - [tispark_workers](#tispark_workers): The configuration of the TiSpark worker instance. This configuration specifies the machines to which the TiSpark worker component is deployed. @@ -144,10 +142,6 @@ The above configuration specifies that `node_exporter` uses the `9100` port and - `tiproxy`: TiProxy service-related configuration. For the complete configuration, see [TiProxy configuration file](/tiproxy/tiproxy-configuration.md). -- `pump`: Pump service-related configuration. For the complete configuration, see [TiDB Binlog configuration file](/tidb-binlog/tidb-binlog-configuration-file.md#pump). - -- `drainer`: Drainer service-related configuration. For the complete configuration, see [TiDB Binlog configuration file](/tidb-binlog/tidb-binlog-configuration-file.md#drainer). - - `cdc`: TiCDC service-related configuration. 
For the complete configuration, see [Deploy TiCDC](/ticdc/deploy-ticdc.md). - `tso`: `tso` microservice-related configuration. For the complete configuration, see [TSO configuration file](/tso-configuration-file.md). @@ -189,8 +183,6 @@ Make sure you only configure it when you need to use a specific version of a com - `tiflash`: The version of the TiFlash component - `pd`: The version of the PD component - `tidb_dashboard`: The version of the standalone TiDB Dashboard component -- `pump`: The version of the Pump component -- `drainer`: The version of the Drainer component - `cdc`: The version of the CDC component - `kvcdc`: The version of the TiKV-CDC component - `tiproxy`: The version of the TiProxy component @@ -471,111 +463,6 @@ tiproxy_servers: - host: 10.0.1.22 ``` -### `pump_servers` - -`pump_servers` specifies the machines to which the Pump services of TiDB Binlog are deployed. It also specifies the service configuration on each machine. `pump_servers` is an array, and each element of the array contains the following fields: - -- `host`: Specifies the machine to which the Pump services are deployed. The field value is an IP address and is mandatory. - -- `ssh_port`: Specifies the SSH port to connect to the target machine for operations. If it is not specified, the `ssh_port` of the `global` section is used. - -- `port`: The listening port of the Pump services. The default value is `8250`. - -- `deploy_dir`: Specifies the deployment directory. If it is not specified or specified as a relative directory, the directory is generated according to the `deploy_dir` directory configured in `global`. - -- `data_dir`: Specifies the data directory. If it is not specified or specified as a relative directory, the directory is generated according to the `data_dir` directory configured in `global`. - -- `log_dir`: Specifies the log directory. If it is not specified or specified as a relative directory, the log is generated according to the `log_dir` directory configured in `global`. - -- `numa_node`: Allocates the NUMA policy to the instance. Before specifying this field, you need to make sure that the target machine has [numactl](https://linux.die.net/man/8/numactl) installed. If this field is specified, cpubind and membind policies are allocated using [numactl](https://linux.die.net/man/8/numactl). This field is the string type. The field value is the ID of the NUMA node, such as "0,1". - -- `config`: The configuration rule of this field is the same as the `pump` configuration rule in `server_configs`. If this field is configured, the field content is merged with the `pump` content in `server_configs` (if the two fields overlap, the content of this field takes effect). Then, a configuration file is generated and sent to the machine specified in `host`. - -- `os`: The operating system of the machine specified in `host`. If this field is not specified, the default value is the `os` value in `global`. - -- `arch`: The architecture of the machine specified in `host`. If this field is not specified, the default value is the `arch` value in `global`. - -- `resource_control`: Resource control for the service. If this field is configured, the field content is merged with the `resource_control` content in `global` (if the two fields overlap, the content of this field takes effect). Then, a systemd configuration file is generated and sent to the machine specified in `host`. The configuration rules of `resource_control` are the same as the `resource_control` content in `global`. 
- -For the above fields, you cannot modify these configured fields after the deployment: - -- `host` -- `port` -- `deploy_dir` -- `data_dir` -- `log_dir` -- `arch` -- `os` - -A `pump_servers` configuration example is as follows: - -```yaml -pump_servers: - - host: 10.0.1.21 - config: - gc: 7 - - host: 10.0.1.22 -``` - -### `drainer_servers` - -`drainer_servers` specifies the machines to which the Drainer services of TiDB Binlog are deployed. It also specifies the service configuration on each machine. `drainer_servers` is an array. Each array element contains the following fields: - -- `host`: Specifies the machine to which the Drainer services are deployed. The field value is an IP address and is mandatory. - -- `ssh_port`: Specifies the SSH port to connect to the target machine for operations. If it is not specified, the `ssh_port` of the `global` section is used. - -- `port`: The listening port of Drainer services. The default value is `8249`. - -- `deploy_dir`: Specifies the deployment directory. If it is not specified or specified as a relative directory, the directory is generated according to the `deploy_dir` directory configured in `global`. - -- `data_dir`: Specifies the data directory. If it is not specified or specified as a relative directory, the directory is generated according to the `data_dir` directory configured in `global`. - -- `log_dir`: Specifies the log directory. If it is not specified or specified as a relative directory, the log is generated according to the `log_dir` directory configured in `global`. - -- `commit_ts` (deprecated): When Drainer starts, it reads the checkpoint. If Drainer gets no checkpoint, it uses this field as the replication time point for the initial startup. This field defaults to `-1` (Drainer always gets the latest timestamp from the PD as the commit_ts). - -- `numa_node`: Allocates the NUMA policy to the instance. Before specifying this field, you need to make sure that the target machine has [numactl](https://linux.die.net/man/8/numactl) installed. If this field is specified, cpubind and membind policies are allocated using [numactl](https://linux.die.net/man/8/numactl). This field is the string type. The field value is the ID of the NUMA node, such as "0,1". - -- `config`: The configuration rule of this field is the same as the `drainer` configuration rule in `server_configs`. If this field is configured, the field content is merged with the `drainer` content in `server_configs` (if the two fields overlap, the content of this field takes effect). Then, a configuration file is generated and sent to the machine specified in `host`. - -- `os`: The operating system of the machine specified in `host`. If this field is not specified, the default value is the `os` value in `global`. - -- `arch`: The architecture of the machine specified in `host`. If this field is not specified, the default value is the `arch` value in `global`. - -- `resource_control`: Resource control for the service. If this field is configured, the field content is merged with the `resource_control` content in `global` (if the two fields overlap, the content of this field takes effect). Then, a systemd configuration file is generated and sent to the machine specified in `host`. The configuration rules of `resource_control` are the same as the `resource_control` content in `global`. 
-
-For the above fields, you cannot modify these configured fields after the deployment:
-
-- `host`
-- `port`
-- `deploy_dir`
-- `data_dir`
-- `log_dir`
-- `arch`
-- `os`
-
-The `commit_ts` field is deprecated since TiUP v1.9.2 and is not recorded in the starting script of Drainer. If you still need to use this field, refer to the following example to configure the `initial-commit-ts` field in `config`.
-
-A `drainer_servers` configuration example is as follows:
-
-```yaml
-drainer_servers:
-  - host: 10.0.1.21
-    config:
-      initial-commit-ts: -1
-      syncer.db-type: "mysql"
-      syncer.to.host: "127.0.0.1"
-      syncer.to.user: "root"
-      syncer.to.password: ""
-      syncer.to.port: 3306
-      syncer.ignore-table:
-        - db-name: test
-          tbl-name: log
-        - db-name: test
-          tbl-name: audit
-```
-
### `cdc_servers`

`cdc_servers` specifies the machines to which the TiCDC services are deployed. It also specifies the service configuration on each machine. `cdc_servers` is an array. Each array element contains the following fields:
diff --git a/tiup/tiup-cluster.md b/tiup/tiup-cluster.md
index fae4e1bd2dcc7..bb9897ea8275a 100644
--- a/tiup/tiup-cluster.md
+++ b/tiup/tiup-cluster.md
@@ -241,12 +241,12 @@ For the PD component, `|L` or `|UI` might be appended to `Up` or `Down`. `|L` in

Scaling in a cluster means making some node(s) offline. This operation removes the specific node(s) from the cluster and deletes the remaining files.

-Because the offline process of the TiKV, TiFlash, and TiDB Binlog components is asynchronous (which requires removing the node through API), and the process takes a long time (which requires continuous observation on whether the node is successfully taken offline), special treatment is given to the TiKV, TiFlash, and TiDB Binlog components.
+Because the offline process of the TiKV and TiFlash components is asynchronous (which requires removing the node through API), and the process takes a long time (which requires continuous observation on whether the node is successfully taken offline), special treatment is given to the TiKV and TiFlash components.

-- For TiKV, TiFlash, and Binlog:
+- For TiKV and TiFlash:

    - TiUP cluster takes the node offline through API and directly exits without waiting for the process to be completed.
-    - Afterwards, when a command related to the cluster operation is executed, TiUP cluster examines whether there is a TiKV, TiFlash, or Binlog node that has been taken offline. If not, TiUP cluster continues with the specified operation; If there is, TiUP cluster takes the following steps:
+    - Afterwards, when a command related to the cluster operation is executed, TiUP cluster examines whether there is a TiKV or TiFlash node that has been taken offline. If not, TiUP cluster continues with the specified operation; if there is, TiUP cluster takes the following steps:

        1. Stop the service of the node that has been taken offline.
        2. Clean up the data files related to the node.
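The asynchronous offline flow described in the hunk above can be walked through from the command line. The following is a minimal sketch, assuming a cluster named `test-cluster`, a TiKV node at `10.0.1.5:20160`, and a TiUP release that provides the `tiup cluster prune` subcommand for cleaning up nodes that have reached the `Tombstone` state:

```bash
# Take one TiKV node offline. TiUP sends the request through the API and
# returns immediately; the node keeps migrating its data in the background.
tiup cluster scale-in test-cluster --node 10.0.1.5:20160

# Watch the offline progress. Repeat until the node status becomes Tombstone.
tiup cluster display test-cluster

# Once the node shows Tombstone, remove its leftover data files and services.
tiup cluster prune test-cluster
```

This mirrors the two-phase behavior described above: `scale-in` only initiates the offline process, and the remaining files are cleaned up after the node has fully gone offline.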
@@ -635,7 +635,7 @@ Before TiUP is released, you can control the cluster using `tidb-ctl`, `tikv-ctl

```bash
Usage:
-  tiup ctl:v {tidb/pd/tikv/binlog/etcd} [flags]
+  tiup ctl:v {tidb/pd/tikv/etcd} [flags]

Flags:
  -h, --help          help for tiup
@@ -647,7 +647,6 @@ This command has a corresponding relationship with those of the previous tools:
tidb-ctl [args] = tiup ctl tidb [args]
pd-ctl [args] = tiup ctl pd [args]
tikv-ctl [args] = tiup ctl tikv [args]
-binlogctl [args] = tiup ctl bindlog [args]
etcdctl [args] = tiup ctl etcd [args]
```

diff --git a/tiup/tiup-component-cluster-patch.md b/tiup/tiup-component-cluster-patch.md
index a1c3c99466247..7c5b079608987 100644
--- a/tiup/tiup-component-cluster-patch.md
+++ b/tiup/tiup-component-cluster-patch.md
@@ -8,7 +8,7 @@ summary: The `tiup cluster patch` command allows for dynamic replacement of bina
If you need to dynamically replace the binaries of a service while the cluster is running (namely, keep the cluster available during the replacement process), you can use the `tiup cluster patch` command. After the command is executed, TiUP does the following things:

- Uploads the binary package for replacement to the target machine.
-- If the target service is a storage service such as TiKV, TiFlash, or TiDB Binlog, TiUP first takes the related nodes offline via the API.
+- If the target service is a storage service such as TiKV or TiFlash, TiUP first takes the related nodes offline via the API.
- Stops the target service.
- Unpacks the binary package and replace the service.
- Starts the target service.
diff --git a/tiup/tiup-component-cluster-scale-in.md b/tiup/tiup-component-cluster-scale-in.md
index bbd56b4ce8665..dbec4dba67a9d 100644
--- a/tiup/tiup-component-cluster-scale-in.md
+++ b/tiup/tiup-component-cluster-scale-in.md
@@ -1,6 +1,6 @@
---
title: tiup cluster scale-in
-summary: The `tiup cluster scale-in` command is used to scale in the cluster by taking specified nodes offline, removing them from the cluster, and deleting remaining files. Components like TiKV, TiFlash, and TiDB Binlog are handled asynchronously and require additional steps to check and clean up. The command also includes options for node specification, forceful removal, transfer timeout, and help information.
+summary: The `tiup cluster scale-in` command is used to scale in the cluster by taking specified nodes offline, removing them from the cluster, and deleting remaining files. Components like TiKV and TiFlash are handled asynchronously and require additional steps to check and clean up. The command also includes options for node specification, forceful removal, transfer timeout, and help information.
---

# tiup cluster scale-in
@@ -9,9 +9,9 @@ The `tiup cluster scale-in` command is used to scale in the cluster, which takes

## Particular handling of components' offline process

-Because the TiKV, TiFlash, and TiDB Binlog components are taken offline asynchronously (which requires TiUP to remove the node through API first) and the stopping process takes a long time (which requires TiUP to continuously check whether the node is successfully taken offline), the TiKV, TiFlash, and TiDB Binlog components are handled particularly as follows:
+Because the TiKV and TiFlash components are taken offline asynchronously (which requires TiUP to remove the node through API first) and the stopping process takes a long time (which requires TiUP to continuously check whether the node is successfully taken offline), the TiKV and TiFlash components are handled particularly as follows:

-- For TiKV, TiFlash, and TiDB Binlog components:
+- For TiKV and TiFlash components:

    1. TiUP Cluster takes the node offline through API and directly exits without waiting for the process to be completed.
    2. To check the status of the nodes being scaled in, you need to execute the `tiup cluster display` command and wait for the status to become `Tombstone`.
diff --git a/tiup/tiup-playground.md b/tiup/tiup-playground.md
index b88596275e21a..d3d7c6a9fe7d1 100644
--- a/tiup/tiup-playground.md
+++ b/tiup/tiup-playground.md
@@ -36,9 +36,6 @@ Flags:
      --db.binpath string          Specify the TiDB instance binary path (optional, for debugging)
      --db.config string           Specify the TiDB instance configuration file (optional, for debugging)
      --db.timeout int             Specify TiDB maximum wait time in seconds for starting. 0 means no limit
-      --drainer int                Specify Drainer data of the cluster
-      --drainer.binpath string     Specify the location of the Drainer binary files (optional, for debugging)
-      --drainer.config string      Specify the Drainer configuration file
  -h, --help                       Display help information for TiUP
      --host string                Specify the listening address of each component (default: `127.0.0.1`). Set it to `0.0.0.0` if provided for access of other machines
      --kv int                     Specify the number of TiKV instances (default: 1)
@@ -50,9 +47,6 @@ Flags:
      --pd.binpath string          Specify the PD instance binary path (optional, for debugging)
      --pd.config string           Specify the PD instance configuration file (optional, for debugging)
      --pd.mode string             Specify the PD working mode. The optional value is 'ms'. Specifying this flag means enabling PD microservice mode.
-      --pump int                   Specify the number of Pump instances. If the value is not `0`, TiDB Binlog is enabled.
-      --pump.binpath string        Specify the location of the Pump binary files (optional, for debugging)
-      --pump.config string         Specify the Pump configuration file (optional, for debugging)
      --scheduling int             Specify the number of Scheduling instances (default: 1),which can be set only when `pd.mode` is 'ms'
      --scheduling.host host       Specify the listening address of the Scheduling instance
      --scheduling.binpath string  Specify the Scheduling instance binary path (optional, for debugging)
@@ -160,10 +154,8 @@ Pid Role Uptime
---    ----     ------
84518  pd       35m22.929404512s
84519  tikv     35m22.927757153s
-84520  pump     35m22.92618275s
86189  tidb     exited
86526  tidb     34m28.293148663s
-86190  drainer  35m19.91349249s
```

## Scale out a cluster
diff --git a/transaction-overview.md b/transaction-overview.md
index 7467137eb7d5a..d6c6c8fe5eefc 100644
--- a/transaction-overview.md
+++ b/transaction-overview.md
@@ -297,12 +297,6 @@ By default, TiDB sets the total size of a single transaction to no more than 100

TiDB previously limited the total number of key-value pairs for a single transaction to 300,000. This restriction was removed in TiDB v4.0.

-> **Note:**
->
-> Usually, TiDB Binlog (deprecated) is enabled to replicate data to the downstream. In some scenarios, message middleware such as Kafka is used to consume binlogs that are replicated to the downstream.
->
-> Taking Kafka as an example, the upper limit of Kafka's single message processing capability is 1 GB. Therefore, when `txn-total-size-limit` is set to more than 1 GB, it might happen that the transaction is successfully executed in TiDB, but the downstream Kafka reports an error. To avoid this situation, you need to decide the actual value of `txn-total-size-limit` according to the limit of the end consumer. For example, if Kafka is used downstream, `txn-total-size-limit` must not exceed 1 GB.
-
## Causal consistency

> **Note:**
diff --git a/upgrade-tidb-using-tiup.md b/upgrade-tidb-using-tiup.md
index 0974edaf4d5df..cb47c50a7e2b4 100644
--- a/upgrade-tidb-using-tiup.md
+++ b/upgrade-tidb-using-tiup.md
@@ -63,7 +63,7 @@ This document is targeted for the following upgrade paths:
    2. Use TiUP (`tiup cluster`) to import the TiDB Ansible configuration.
    3. Update the 3.0 version to 4.0 according to [Upgrade TiDB Using TiUP (v4.0)](https://docs.pingcap.com/tidb/v4.0/upgrade-tidb-using-tiup#import-tidb-ansible-and-the-inventoryini-configuration-to-tiup).
    4. Upgrade the cluster to v8.3.0 according to this document.
-- Support upgrading the versions of TiDB Binlog (deprecated), TiCDC, TiFlash, and other components.
+- Support upgrading the versions of TiCDC, TiFlash, and other components.
- When upgrading TiFlash from versions earlier than v6.3.0 to v6.3.0 and later versions, note that the CPU must support the AVX2 instruction set under the Linux AMD64 architecture and the ARMv8 instruction set architecture under the Linux ARM64 architecture. For details, see the description in [v6.3.0 Release Notes](/releases/release-6.3.0.md#others).
- For detailed compatibility changes of different versions, see the [Release Notes](/releases/release-notes.md) of each version. Modify your cluster configuration according to the "Compatibility Changes" section of the corresponding release notes.
- When updating clusters from versions earlier than v5.3 to v5.3 or later versions, note that there is a time format change in the alerts generated by the default deployed Prometheus. This format change is introduced starting from Prometheus v2.27.1. For more information, see [Prometheus commit](https://github.com/prometheus/prometheus/commit/7646cbca328278585be15fa615e22f2a50b47d06).
@@ -230,8 +230,6 @@ tiup cluster upgrade v8.3.0
> + To keep a stable performance, make sure that all leaders in a TiKV instance are evicted before stopping the instance. You can set `--transfer-timeout` to a larger value, for example, `--transfer-timeout 3600` (unit: second).
>
> + To upgrade TiFlash from versions earlier than v5.3.0 to v5.3.0 or later, you must stop TiFlash and then upgrade it, and the TiUP version must be earlier than v1.12.0. For more information, see [Upgrade TiFlash using TiUP](/tiflash-upgrade-guide.md#upgrade-tiflash-using-tiup).
->
-> + Try to avoid creating a new clustered index table when you apply rolling updates to the clusters using TiDB Binlog.

#### Specify the component version during upgrade