tiproxy: add traffic replay doc (#18948)
Oreoxmt authored Oct 16, 2024
1 parent 21b15e0 commit 9f5aa49
Showing 4 changed files with 282 additions and 9 deletions.
1 change: 1 addition & 0 deletions TOC.md
@@ -626,6 +626,7 @@
- TiProxy
  - [Overview](/tiproxy/tiproxy-overview.md)
  - [Load Balancing Policies](/tiproxy/tiproxy-load-balance.md)
  - [Traffic Replay](/tiproxy/tiproxy-traffic-replay.md)
  - [Configuration](/tiproxy/tiproxy-configuration.md)
  - [Command Line Parameters](/tiproxy/tiproxy-command-line-flags.md)
  - [Monitoring Metrics](/tiproxy/tiproxy-grafana.md)
Binary file added media/tiproxy/tiproxy-traffic-replay.png
137 changes: 128 additions & 9 deletions tiproxy/tiproxy-command-line-flags.md
@@ -27,12 +27,42 @@ This section lists the flags of the server program `tiproxy`.

## TiProxy Control

This section introduces the syntax, options, and commands of the client program `tiproxyctl`.
This section introduces the installation methods, syntax, options, and commands of the client program `tiproxyctl`.

### Install TiProxy Control

You can install TiProxy Control using one of the following two methods.

> **Note:**
>
> TiProxy Control is specifically designed for debugging purposes and might not be fully compatible with future capabilities introduced in TiProxy. Do not rely on this tool to obtain information in application or utility development.

#### Install using TiUP

After installing [TiUP](/tiup/tiup-overview.md), you can use the `tiup install tiproxy` command to download and install the binary programs for TiProxy and TiProxy Control. After installation, you can use `tiup --binary tiproxy` to view the installation path of TiProxy. TiProxy Control is located in the same directory as TiProxy.

For example:

```shell
tiup install tiproxy
# download https://tiup-mirrors.pingcap.com/tiproxy-v1.3.0-linux-amd64.tar.gz 22.51 MiB / 22.51 MiB 100.00% 13.99 MiB/s
ls `tiup --binary tiproxy`ctl
# /root/.tiup/components/tiproxy/v1.3.0/tiproxyctl
```
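
Because TiProxy Control sits in the same directory as TiProxy, its path can also be derived with `dirname` instead of string concatenation. The following is a minimal sketch; the function name `tiproxyctl_path` is hypothetical and not part of TiUP or TiProxy.

```shell
# Hypothetical helper: derive the tiproxyctl path from the tiproxy binary path,
# assuming both binaries are installed in the same directory.
tiproxyctl_path() {
    printf '%s/tiproxyctl\n' "$(dirname "$1")"
}

# Usage (requires TiUP with the tiproxy component installed):
#   "$(tiproxyctl_path "$(tiup --binary tiproxy)")" --host 127.0.0.1 --port 3080 config get
```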

#### Compile from source code

Compilation environment requirement: [Go](https://golang.org/) 1.21 or later

Compilation procedures: Go to the root directory of the [TiProxy project](https://github.com/pingcap/tiproxy) and run the `make` command to compile and generate `tiproxyctl`.

```shell
git clone https://github.com/pingcap/tiproxy.git
cd tiproxy
make
ls bin/tiproxyctl
```

### Syntax

```
tiproxyctl [flags] [command]
```

@@ -42,11 +72,23 @@ tiproxyctl [flags] [command]
For example:

```
tiproxyctl --curls 127.0.0.1:3080 config get
tiproxyctl --host 127.0.0.1 --port 3080 config get
```

### Options

#### `--host`

+ Specifies the TiProxy server address.
+ Type: `string`
+ Default: `localhost`

#### `--port`

+ Specifies the port number of the TiProxy API gateway.
+ Type: `int`
+ Default: `3080`

#### `--log_encoder`

+ Specifies the log format of `tiproxyctl`.
@@ -64,13 +106,6 @@ tiproxyctl --curls 127.0.0.1:3080 config get
+ Default: `"warn"`
+ You can specify `debug`, `info`, `warn`, `error`, `panic`.

#### `--curls`

+ Specifies the server addresses. You can add multiple listening addresses.
+ Type: `comma separated lists of ip:port`
+ Default: `localhost:3080`
+ Server API gateway addresses.

#### `-k, --insecure`

+ Specifies whether to skip TLS CA verification when dialing to the server.
@@ -121,3 +156,87 @@ Example output:
```json
{"config_checksum":3006078629}
```

#### `traffic capture`

The `tiproxyctl traffic capture` command is used to capture traffic.

Options:

- `--output`: (required) specifies the directory to store traffic files.
- `--duration`: (required) specifies the duration of capture. The unit is one of `m` (minutes), `h` (hours), or `d` (days). For example, `--duration=1h` captures traffic for one hour.

Example:

The following command connects to the TiProxy instance at `10.0.1.10:3080`, captures traffic for one hour, and saves it to the `/tmp/traffic` directory on the TiProxy instance:

```shell
tiproxyctl traffic capture --host 10.0.1.10 --port 3080 --output="/tmp/traffic" --duration=1h
```
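
A script that starts captures might check the `--duration` value before calling `tiproxyctl`. The sketch below accepts only a positive integer followed by one of the units listed above. The function name `valid_capture_duration` is hypothetical, and the check is stricter than TiProxy itself might be (for example, it rejects compound values such as `1h30m`).

```shell
# Hypothetical pre-check: accept only <positive integer><m|h|d>.
# Assumption: compound or fractional durations are rejected here,
# even if TiProxy accepts them.
valid_capture_duration() {
    printf '%s\n' "$1" | grep -Eq '^[1-9][0-9]*[mhd]$'
}

# Usage:
#   valid_capture_duration "1h" && \
#       tiproxyctl traffic capture --host 10.0.1.10 --port 3080 --output="/tmp/traffic" --duration=1h
```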

#### `traffic replay`

The `tiproxyctl traffic replay` command is used to replay captured traffic.

Options:

- `--username`: (required) specifies the database username for replay.
- `--password`: (optional) specifies the password for the username. The default value is an empty string `""`.
- `--input`: (required) specifies the directory containing traffic files.
- `--speed`: (optional) specifies the replay speed multiplier. The range is `[0.1, 10]`. The default value is `1`, indicating replay at the original speed.

Example:

The following command connects to the TiProxy instance at `10.0.1.10:3080` using username `u1` and password `123456`, reads traffic files from the `/tmp/traffic` directory on the TiProxy instance, and replays the traffic at twice the original speed:

```shell
tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic" --speed=2
```
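
The documented range for `--speed` is `[0.1, 10]`, so a wrapper script could reject out-of-range values before starting a replay. This is a minimal sketch; `valid_replay_speed` is a hypothetical name, and the check only accepts plain decimal numbers such as `2` or `0.5`.

```shell
# Hypothetical pre-check for the --speed multiplier.
valid_replay_speed() {
    # Reject anything that is not a plain decimal number before comparing.
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?$' || return 1
    # Exit status 0 only when the value lies within the documented range [0.1, 10].
    awk -v s="$1" 'BEGIN { exit !(s >= 0.1 && s <= 10) }'
}

# Usage:
#   valid_replay_speed 2 && \
#       tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic" --speed=2
```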

#### `traffic cancel`

The `tiproxyctl traffic cancel` command is used to cancel the current capture or replay task.

#### `traffic show`

The `tiproxyctl traffic show` command is used to display historical capture and replay tasks.

The `status` field in the output indicates the task status, with the following possible values:

- `done`: the task completed normally.
- `canceled`: the task was canceled. You can check the `error` field for the reason.
- `running`: the task is running. You can check the `progress` field for the completion percentage.

Example output:

```json
[
  {
    "type": "capture",
    "start_time": "2024-09-01T14:30:40.99096+08:00",
    "end_time": "2024-09-01T16:30:40.99096+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "100%",
    "status": "done"
  },
  {
    "type": "capture",
    "start_time": "2024-09-02T18:30:40.99096+08:00",
    "end_time": "2024-09-02T19:00:40.99096+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "25%",
    "status": "canceled",
    "error": "canceled manually"
  },
  {
    "type": "capture",
    "start_time": "2024-09-03T13:31:40.99096+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "45%",
    "status": "running"
  }
]
```
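
To check the latest task from a script, the `status` field can be extracted from this output. The following sketch uses only `grep`, `tail`, and `sed`. It assumes that the output has the shape shown in the example and that the last entry is the most recent task. The function name `last_task_status` is hypothetical, and a real JSON parser would be more robust.

```shell
# Hypothetical helper: read `tiproxyctl traffic show` output from stdin and
# print the "status" value of the last task in the list.
last_task_status() {
    grep -o '"status": *"[^"]*"' | tail -n 1 | sed 's/.*"\([^"]*\)"$/\1/'
}

# Usage:
#   tiproxyctl traffic show --host 10.0.1.10 --port 3080 | last_task_status
```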
153 changes: 153 additions & 0 deletions tiproxy/tiproxy-traffic-replay.md
@@ -0,0 +1,153 @@
---
title: TiProxy Traffic Replay
summary: Introduce the use cases and steps for the TiProxy traffic replay feature.
---

# TiProxy Traffic Replay

> **Warning:**
>
> Currently, the TiProxy traffic replay feature is experimental. It is not recommended that you use it in production environments. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tiproxy/issues) on GitHub.

Starting from TiProxy v1.3.0, you can use TiProxy to capture access traffic in a TiDB production cluster and replay it in a test cluster at a specified rate. This feature enables you to reproduce actual workloads from the production cluster in a test environment, verifying SQL statement execution results and performance.

<img src="https://download.pingcap.com/images/docs/tiproxy/tiproxy-traffic-replay.png" alt="TiProxy traffic replay" width="800" />

## Use cases

Traffic replay is suitable for the following scenarios:

- **Verify TiDB version upgrades**: Replay production traffic on a test cluster with a new TiDB version to verify that the new TiDB version can successfully execute all SQL statements.
- **Assess change impact**: Simulate production traffic on a test cluster to verify the impact of changes on the cluster. For example, verify the effects before modifying configuration items or system variables, altering table schemas, or enabling new TiDB features.
- **Validate performance before TiDB scaling**: Replay traffic at corresponding rates on a test cluster with a new scale to validate whether the performance meets requirements. For example, if you plan to downscale the cluster by 50% to save costs, replay traffic at half speed to validate whether SQL latency meets requirements after scaling.
- **Test performance limits**: Replay traffic multiple times on a test cluster of the same scale, increasing the replay rate each time to test the throughput limit of that scale and assess whether performance meets future business growth needs.

Traffic replay is not suitable for the following scenarios:

- Verify SQL compatibility between TiDB and MySQL: TiProxy only supports reading traffic files it generates and cannot capture traffic from MySQL for replay on TiDB.
- Compare SQL execution results between TiDB versions: TiProxy only verifies whether SQL statements execute successfully and does not compare their results.

## Usage

1. Prepare the test environment:

    1. Create a test cluster. For more information, see [Deploy a TiDB Cluster Using TiUP](/production-deployment-using-tiup.md).
    2. Install `tiproxyctl` and ensure the host with `tiproxyctl` can connect to TiProxy instances in both production and test clusters. For more information, see [Install TiProxy Control](/tiproxy/tiproxy-command-line-flags.md#install-tiproxy-control).
    3. Replicate data from the production cluster to the test cluster. For more information, see [Data Migration Overview](/migration-overview.md).
    4. Run the [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) statement in the test cluster to update statistics.

2. Use the [`tiproxyctl traffic capture`](/tiproxy/tiproxy-command-line-flags.md#traffic-capture) command to connect to the production cluster's TiProxy instance and start capturing traffic.

    > **Note:**
    >
    > - TiProxy captures traffic on all connections, including existing and newly created ones.
    > - In TiProxy primary-secondary mode, connect to the primary TiProxy instance.
    > - If TiProxy is configured with a virtual IP, it is recommended to connect to the virtual IP address.

    For example, the following command connects to the TiProxy instance at `10.0.1.10:3080`, captures traffic for one hour, and saves it to the `/tmp/traffic` directory on the TiProxy instance:

    ```shell
    tiproxyctl traffic capture --host 10.0.1.10 --port 3080 --output="/tmp/traffic" --duration=1h
    ```

    Traffic files are automatically rotated and compressed. Example files in the `/tmp/traffic` directory:

    ```shell
    ls /tmp/traffic
    # meta traffic-2024-08-29T17-37-12.477.log.gz traffic-2024-08-29T17-43-11.166.log.gz traffic.log
    ```

    For more information, see [`tiproxyctl traffic capture`](/tiproxy/tiproxy-command-line-flags.md#traffic-capture).

3. Copy the traffic file directory to the test cluster's TiProxy instance.
4. Use [`tiproxyctl traffic replay`](/tiproxy/tiproxy-command-line-flags.md#traffic-replay) to connect to the test cluster's TiProxy instance and start replaying traffic.

    By default, SQL statements are executed at the same rate as in the production cluster, and each database connection corresponds to a connection in the production cluster to simulate the production load and ensure consistent transaction execution order.

    For example, the following command connects to the TiProxy instance at `10.0.1.10:3080` using username `u1` and password `123456`, reads traffic files from the `/tmp/traffic` directory on the TiProxy instance, and replays the traffic:

    ```shell
    tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic"
    ```

    Because all traffic runs under user `u1`, ensure `u1` can access all databases and tables. If no such user exists, create one.

    For more information, see [`tiproxyctl traffic replay`](/tiproxy/tiproxy-command-line-flags.md#traffic-replay).

5. View the replay report.

    After replay completion, the report is stored in the `tiproxy_traffic_report` database on the test cluster. This database contains two tables: `fail` and `other_errors`.

    The `fail` table stores failed SQL statements, with the following fields:

    - `cmd_type`: the type of a failed command, such as `Query` (execute an ordinary statement), `Prepare` (prepare a statement), and `Execute` (execute a prepared statement).
    - `digest`: the digest of the failed SQL statement.
    - `sample_stmt`: the SQL text when the statement first failed.
    - `sample_err_msg`: the error message when the SQL statement failed.
    - `sample_conn_id`: the connection ID recorded in the traffic file for the SQL statement. You can use this to view the execution context in the traffic file.
    - `sample_capture_time`: the execution time recorded in the traffic file for the SQL statement. You can use this to view the execution context in the traffic file.
    - `sample_replay_time`: the time when the SQL statement failed during replay. You can use this to view error information in the TiDB log file.
    - `count`: the number of times the SQL statement failed.

    The `other_errors` table stores unexpected errors, such as network errors or database connection errors, with the following fields:

    - `err_type`: the type of error, presented as a brief error message. For example, `i/o timeout`.
    - `sample_err_msg`: the complete error message when the error first occurred.
    - `sample_replay_time`: the time when the error occurred during replay. You can use this to view error information in the TiDB log file.
    - `count`: the number of occurrences for this error.

    > **Note:**
    >
    > The table schema of `tiproxy_traffic_report` might change in future versions. It is not recommended to directly read data from `tiproxy_traffic_report` in your application or tool development.
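
For ad hoc inspection of the replay report described in the last step, you might list the most frequent failures first. The following sketch only prints a query, built from the `fail` table fields listed above. The function name `top_failures_sql` is hypothetical, and the connection details in the usage comment are placeholders. Because the schema might change, treat this as a manual check rather than something to build tooling on.

```shell
# Hypothetical helper: print a query that lists the most frequent failures.
# The first argument is the maximum number of rows (default: 10).
# `count` is quoted with backticks defensively because it is also a function name.
top_failures_sql() {
    limit=${1:-10}
    printf 'SELECT cmd_type, digest, sample_err_msg, `count` FROM tiproxy_traffic_report.fail ORDER BY `count` DESC LIMIT %d;\n' "$limit"
}

# Usage (assumes a MySQL-compatible client; host and port are placeholders
# for the SQL endpoint of the test cluster):
#   mysql -h <tidb-host> -P <tidb-port> -u u1 -p -e "$(top_failures_sql 5)"
```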

## Test throughput

To test cluster throughput, use the `--speed` option to adjust the replay rate.

For example, `--speed=2` executes SQL statements at twice the rate, reducing the total replay time by half:

```shell
tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic" --speed=2
```

Increasing the replay rate only reduces idle time between SQL statements and does not increase the number of connections. When session idle time is already short, increasing the speed might not effectively improve throughput. In such cases, you can deploy multiple TiProxy instances to replay the same traffic files simultaneously, increasing concurrency to improve throughput.
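
When replaying through several TiProxy instances, it can help to generate the commands first and review them before starting them together. The sketch below prints one replay command per host and executes nothing. It assumes that every instance listens on port `3080` and already has a copy of the traffic files at the same path. The function name `print_replay_commands` and the second host address are hypothetical.

```shell
# Hypothetical helper: print one replay command per TiProxy host (dry run).
# First argument: traffic file directory; remaining arguments: TiProxy hosts.
print_replay_commands() {
    input_dir=$1
    shift
    for host in "$@"; do
        printf 'tiproxyctl traffic replay --host %s --port 3080 --username="u1" --password="123456" --input="%s"\n' "$host" "$input_dir"
    done
}

# Usage:
#   print_replay_commands /tmp/traffic 10.0.1.10 10.0.1.11
```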

## View and manage tasks

During capture and replay, tasks automatically stop if unknown errors occur. To view the current task progress or error information from the last task, use the [`tiproxyctl traffic show`](/tiproxy/tiproxy-command-line-flags.md#traffic-show) command:

```shell
tiproxyctl traffic show --host 10.0.1.10 --port 3080
```

For example, the following output indicates a running capture task:

```json
[
  {
    "type": "capture",
    "start_time": "2024-09-03T09:10:58.220644+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "45%",
    "status": "running"
  }
]
```

For more information, see [`tiproxyctl traffic show`](/tiproxy/tiproxy-command-line-flags.md#traffic-show).

To cancel the current capture or replay task, use the [`tiproxyctl traffic cancel`](/tiproxy/tiproxy-command-line-flags.md#traffic-cancel) command:

```shell
tiproxyctl traffic cancel --host 10.0.1.10 --port 3080
```

For more information, see [`tiproxyctl traffic cancel`](/tiproxy/tiproxy-command-line-flags.md#traffic-cancel).

## Limitations

- TiProxy only supports replaying traffic files captured by TiProxy and does not support other file formats. Therefore, make sure to capture traffic from the production cluster using TiProxy first.
- TiProxy traffic replay does not support filtering by SQL type, so DML and DDL statements are also replayed. Therefore, you need to restore the cluster data to its pre-replay state before replaying again.
- TiProxy traffic replay does not support testing [Resource Control](/tidb-resource-control.md) and [privilege management](/privilege-management.md) because TiProxy uses the same username to replay traffic.
- TiProxy does not support replaying [`LOAD DATA`](/sql-statements/sql-statement-load-data.md) statements.
