tiproxy: add traffic replay doc #18948

1 change: 1 addition & 0 deletions TOC.md
Expand Up @@ -639,6 +639,7 @@
- TiProxy
- [Overview](/tiproxy/tiproxy-overview.md)
- [Load Balancing Policies](/tiproxy/tiproxy-load-balance.md)
- [Traffic Replay](/tiproxy/tiproxy-traffic-replay.md)
- [Configuration](/tiproxy/tiproxy-configuration.md)
- [Command Line Parameters](/tiproxy/tiproxy-command-line-flags.md)
- [Monitoring Metrics](/tiproxy/tiproxy-grafana.md)
Binary file added media/tiproxy/tiproxy-traffic-replay.png
137 changes: 128 additions & 9 deletions tiproxy/tiproxy-command-line-flags.md
Expand Up @@ -27,12 +27,42 @@ This section lists the flags of the server program `tiproxy`.

## TiProxy Control

This section introduces the syntax, options, and commands of the client program `tiproxyctl`.
This section introduces the installation methods, syntax, options, and commands of the client program `tiproxyctl`.

### Install TiProxy Control

You can install TiProxy Control using one of the following two methods.

> **Note:**
>
> TiProxy Control is specifically designed for debugging purposes and might not be fully compatible with future capabilities introduced in TiProxy. It is not recommended to rely on this tool to retrieve information in application or utility development.

#### Install using TiUP

After installing [TiUP](/tiup/tiup-overview.md), you can use the `tiup install tiproxy` command to download and install the binary programs for TiProxy and TiProxy Control. After installation, you can use `tiup --binary tiproxy` to view the installation path of TiProxy. TiProxy Control is located in the same directory as TiProxy.

For example:

```shell
tiup install tiproxy
# download https://tiup-mirrors.pingcap.com/tiproxy-v1.3.0-linux-amd64.tar.gz 22.51 MiB / 22.51 MiB 100.00% 13.99 MiB/s
ls `tiup --binary tiproxy`ctl
# /root/.tiup/components/tiproxy/v1.3.0/tiproxyctl
```

#### Compile from source code

Compilation environment requirement: [Go](https://golang.org/) 1.21 or later

Compilation procedure: In the root directory of the [TiProxy project](https://github.com/pingcap/tiproxy), run the `make` command to compile the project and generate `tiproxyctl` in the `bin` directory.

```shell
git clone https://github.com/pingcap/tiproxy.git
cd tiproxy
make
ls bin/tiproxyctl
```

### Syntax

```
tiproxyctl [flags] [command]
```

For example:

```
tiproxyctl --curls 127.0.0.1:3080 config get
tiproxyctl --host 127.0.0.1 --port 3080 config get
```

### Options

#### `--host`

+ Specifies the TiProxy server address.
+ Type: `string`
+ Default: `localhost`

#### `--port`

+ Specifies the port number of the TiProxy API gateway.
+ Type: `int`
+ Default: `3080`

#### `--log_encoder`

+ Specifies the log format of `tiproxyctl`.
Expand All @@ -64,13 +106,6 @@ tiproxyctl --curls 127.0.0.1:3080 config get
+ Default: `"warn"`
+ You can specify `debug`, `info`, `warn`, `error`, `panic`.

#### `--curls`

+ Specifies the server addresses. You can add multiple listening addresses.
+ Type: `comma separated lists of ip:port`
+ Default: `localhost:3080`
+ Server API gateway addresses.

#### `-k, --insecure`

+ Specifies whether to skip TLS CA verification when dialing to the server.
Expand Down Expand Up @@ -121,3 +156,87 @@ Example output:
```json
{"config_checksum":3006078629}
```

#### `traffic capture`

The `tiproxyctl traffic capture` command is used to capture traffic.

Options:

- `--output`: (required) specifies the directory to store traffic files.
- `--duration`: (required) specifies the duration of capture. The unit is one of `m` (minutes), `h` (hours), or `d` (days). For example, `--duration=1h` captures traffic for one hour.

Example:

The following command connects to the TiProxy instance at `10.0.1.10:3080`, captures traffic for one hour, and saves it to the `/tmp/traffic` directory on the TiProxy instance:

```shell
tiproxyctl traffic capture --host 10.0.1.10 --port 3080 --output="/tmp/traffic" --duration=1h
```

#### `traffic replay`

The `tiproxyctl traffic replay` command is used to replay captured traffic.

Options:

- `--username`: (required) specifies the database username for replay.
- `--password`: (optional) specifies the password for the username. The default value is an empty string `""`.
- `--input`: (required) specifies the directory containing traffic files.
- `--speed`: (optional) specifies the replay speed multiplier. The range is `[0.1, 10]`. The default value is `1`, which replays traffic at the original speed.

Example:

The following command connects to the TiProxy instance at `10.0.1.10:3080` using username `u1` and password `123456`, reads traffic files from the `/tmp/traffic` directory on the TiProxy instance, and replays traffic at twice the original speed:

```shell
tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic" --speed=2
```
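
Because `--speed` only accepts values in `[0.1, 10]`, a wrapper script might check the value before calling `tiproxyctl`. The following is a minimal sketch and not part of TiProxy: `valid_speed` is a hypothetical helper name, and the final command is only printed with `echo` so that the sketch is safe to run as-is.

```shell
# Hypothetical helper: succeed only if $1 is a number within [0.1, 10].
valid_speed() {
  # Reject anything that is not a plain decimal number.
  printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?$' || return 1
  # awk exits with status 0 when the value is inside the documented range.
  awk -v s="$1" 'BEGIN { exit !(s >= 0.1 && s <= 10) }'
}

speed=2
if valid_speed "$speed"; then
  # Remove "echo" to run the replay for real.
  echo tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic" --speed="$speed"
else
  echo "speed must be between 0.1 and 10: $speed" >&2
fi
```

Checking the range locally gives an immediate error message without contacting the TiProxy instance.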

#### `traffic cancel`

The `tiproxyctl traffic cancel` command is used to cancel the current capture or replay task.

#### `traffic show`

The `tiproxyctl traffic show` command is used to display historical capture and replay tasks.

The `status` field in the output indicates the task status, with the following possible values:

- `done`: task completed normally.
- `canceled`: task was canceled. You can check the `error` field for the reason.
- `running`: task is running. You can check the `progress` field for completion percentage.

Example output:

```json
[
  {
    "type": "capture",
    "start_time": "2024-09-01T14:30:40.99096+08:00",
    "end_time": "2024-09-01T16:30:40.99096+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "100%",
    "status": "done"
  },
  {
    "type": "capture",
    "start_time": "2024-09-02T18:30:40.99096+08:00",
    "end_time": "2024-09-02T19:00:40.99096+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "25%",
    "status": "canceled",
    "error": "canceled manually"
  },
  {
    "type": "capture",
    "start_time": "2024-09-03T13:31:40.99096+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "45%",
    "status": "running"
  }
]
```
153 changes: 153 additions & 0 deletions tiproxy/tiproxy-traffic-replay.md
@@ -0,0 +1,153 @@
---
title: TiProxy Traffic Replay
summary: Introduce the use cases and steps for the TiProxy traffic replay feature.
---

# TiProxy Traffic Replay

> **Warning:**
>
> Currently, the TiProxy traffic replay feature is experimental. It is not recommended that you use it in production environments. This feature might be changed or removed without prior notice. If you find a bug, you can report an [issue](https://github.com/pingcap/tiproxy/issues) on GitHub.

Starting from TiProxy v1.3.0, you can use TiProxy to capture access traffic in a TiDB production cluster and replay it in a test cluster at a specified rate. This feature enables you to reproduce actual workloads from the production cluster in a test environment, verifying SQL statement execution results and performance.

<img src="https://download.pingcap.com/images/docs/tiproxy/tiproxy-traffic-replay.png" alt="TiProxy traffic replay" width="800" />

## Use cases

Traffic replay is suitable for the following scenarios:

- **Validate TiDB version upgrades**: Replay production traffic on a test cluster with the new version to verify if the new TiDB version can successfully execute all SQL statements.
- **Assess change impact**: Simulate production traffic on a test cluster to validate the impact of changes on the cluster. For example, verify the effects before changing configuration items or system variables, altering table structures, or enabling new TiDB features.
- **Validate performance before TiDB scaling**: Replay traffic at corresponding rates on a test cluster with the new scale to verify if the performance meets requirements. For example, to plan a 50% cluster downscale for cost savings, replay traffic at half speed to verify if SQL latency meets requirements after scaling.
- **Test performance limits**: Replay traffic multiple times on a test cluster of the same scale, increasing the replay rate each time to test the throughput limit of that scale and assess whether performance meets future business growth needs.

Traffic replay is not suitable for the following scenarios:

- Verify SQL compatibility between TiDB and MySQL: TiProxy only supports reading traffic files it generates and cannot capture traffic from MySQL for replay on TiDB.
- Compare SQL execution results between TiDB versions: TiProxy only verifies if SQL statements execute successfully and does not compare results.

## Usage

1. Prepare the test environment:

    1. Create a test cluster. For more information, see [Deploy a TiDB Cluster Using TiUP](/production-deployment-using-tiup.md).
    2. Install `tiproxyctl` and ensure the host with `tiproxyctl` can connect to TiProxy instances in both production and test clusters. For more information, see [Install TiProxy Control](/tiproxy/tiproxy-command-line-flags.md#install-tiproxy-control).
    3. Replicate data from the production cluster to the test cluster. For more information, see [Data Migration Overview](/migration-overview.md).
    4. Run the [`ANALYZE`](/sql-statements/sql-statement-analyze-table.md) statement in the test cluster to update statistics.

2. Use the [`tiproxyctl traffic capture`](/tiproxy/tiproxy-command-line-flags.md#traffic-capture) command to connect to the production cluster's TiProxy instance and start capturing traffic.

> **Note:**
>
> - TiProxy captures traffic on all connections, including existing and newly created ones.
> - In TiProxy primary-secondary mode, connect to the primary TiProxy instance.
> - If TiProxy is configured with a virtual IP, it is recommended to connect to the virtual IP address.

For example, the following command connects to the TiProxy instance at `10.0.1.10:3080`, captures traffic for one hour, and saves it to the `/tmp/traffic` directory on the TiProxy instance:

```shell
tiproxyctl traffic capture --host 10.0.1.10 --port 3080 --output="/tmp/traffic" --duration=1h
```

Traffic files are automatically rotated and compressed. Example files in the `/tmp/traffic` directory:

```shell
ls /tmp/traffic
# meta traffic-2024-08-29T17-37-12.477.log.gz traffic-2024-08-29T17-43-11.166.log.gz traffic.log
```

For more information, see [`tiproxyctl traffic capture`](/tiproxy/tiproxy-command-line-flags.md#traffic-capture).

3. Copy the traffic file directory to the test cluster's TiProxy instance.
4. Use [`tiproxyctl traffic replay`](/tiproxy/tiproxy-command-line-flags.md#traffic-replay) to connect to the test cluster's TiProxy instance and start replaying traffic.

By default, SQL statements are executed at the same rate as in the production cluster, and database connections correspond one-to-one with the production cluster to simulate the production load and ensure consistent transaction execution order.

For example, the following command connects to the TiProxy instance at `10.0.1.10:3080` using username `u1` and password `123456`, reads traffic files from the `/tmp/traffic` directory on the TiProxy instance, and replays the traffic:

```shell
tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic"
```

Since all traffic runs under user `u1`, ensure `u1` can access all databases and tables. If no such user exists, create one.

For more information, see [`tiproxyctl traffic replay`](/tiproxy/tiproxy-command-line-flags.md#traffic-replay).

5. View the replay report.

After the replay completes, the report is stored in the `tiproxy_traffic_report` database on the test cluster. This database contains two tables: `fail` and `other_errors`.

The `fail` table stores failed SQL statements, with the following fields:

- `cmd_type`: the type of failed SQL statement, such as `Query` (execute an ordinary statement), `Prepare` (prepare a statement), and `Execute` (execute a prepared statement).
- `digest`: the digest of the failed SQL statement.
- `sample_stmt`: the SQL text when the statement first failed.
- `sample_err_msg`: the error message when the SQL statement failed.
- `sample_conn_id`: the connection ID recorded in the traffic file for the SQL statement. You can use this to view the execution context in the traffic file.
- `sample_capture_time`: the execution time recorded in the traffic file for the SQL statement. You can use this to view the execution context in the traffic file.
- `sample_replay_time`: the time when the SQL statement failed during replay. You can use this to view error information in the TiDB log file.
- `count`: the number of times the SQL statement failed.

The `other_errors` table stores unexpected errors, such as network errors or database connection errors, with the following fields:

- `err_type`: the type of error, represented by a brief error message, such as `i/o timeout`.
- `sample_err_msg`: the complete error message when the error first occurred.
- `sample_replay_time`: the time when the error occurred during replay. You can use this to view error information in the TiDB log file.
- `count`: the number of occurrences for this error.

> **Note:**
>
> The table structure of `tiproxy_traffic_report` might change in future versions. It is not recommended to directly read data from `tiproxy_traffic_report` in your application or tool development.
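
To review the report from the command line, you could query the `fail` table with any MySQL-compatible client. The following sketch rests on several assumptions: `top_failures_sql` is a hypothetical helper, `10.0.1.20:4000` stands in for a TiDB server in the test cluster, and the column list reflects the fields described above, which might change in future versions.

```shell
# Hypothetical helper: print a query for the N most frequent failed statements.
top_failures_sql() {
  limit="${1:-10}"
  printf 'SELECT cmd_type, digest, sample_stmt, sample_err_msg, `count` FROM tiproxy_traffic_report.fail ORDER BY `count` DESC LIMIT %s;\n' "$limit"
}

# Print the query for the five most frequent failures.
top_failures_sql 5

# To run the query, pipe it to a client, for example (the address is assumed):
# top_failures_sql 5 | mysql -h 10.0.1.20 -P 4000 -u u1 -p
```

Sorting by `count` surfaces the statements that failed most often, which are usually the first ones worth investigating with `sample_err_msg` and `sample_replay_time`.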

## Test throughput

To test cluster throughput, use the `--speed` option to adjust the replay rate.

For example, `--speed=2` executes SQL statements at twice the rate, halving the total replay time:

```shell
tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic" --speed=2
```

Increasing the replay rate only reduces idle time between SQL statements and does not increase the number of connections. When session idle time is already short, increasing the speed might not effectively improve throughput. In such cases, you can deploy multiple TiProxy instances to replay the same traffic files simultaneously, increasing concurrency to improve throughput.
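
When planning a series of throughput tests, it can help to list the replay commands and the ideal replay time for each speed in advance. The following sketch assumes a one-hour capture and a hypothetical speed ladder of 1, 2, and 4, and it only prints the commands. The estimate is simply the capture duration divided by the speed, so the real replay can take longer when sessions have little idle time.

```shell
capture_minutes=60   # assumed length of the captured traffic

# Hypothetical helper: ideal replay time in minutes for a given speed.
replay_minutes() {
  awk -v d="$1" -v s="$2" 'BEGIN { printf "%.1f\n", d / s }'
}

for speed in 1 2 4; do
  echo "# speed=$speed, ideal replay time: $(replay_minutes "$capture_minutes" "$speed") min"
  # Restore the test cluster data and wait for the previous replay to finish
  # before running the next command.
  echo tiproxyctl traffic replay --host 10.0.1.10 --port 3080 --username="u1" --password="123456" --input="/tmp/traffic" --speed="$speed"
done
```

If the measured replay time stops shrinking as the speed increases, the cluster or the idle-time limit described above is likely the bottleneck.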

## View and manage tasks

During capture and replay, tasks automatically stop if unknown errors occur. To view current task progress or error information from the last task, use the [`tiproxyctl traffic show`](/tiproxy/tiproxy-command-line-flags.md#traffic-show) command:

```shell
tiproxyctl traffic show --host 10.0.1.10 --port 3080
```

For example, the following output indicates a running capture task:

```json
[
  {
    "type": "capture",
    "start_time": "2024-09-03T09:10:58.220644+08:00",
    "duration": "2h",
    "output": "/tmp/traffic",
    "progress": "45%",
    "status": "running"
  }
]
```

For more information, see [`tiproxyctl traffic show`](/tiproxy/tiproxy-command-line-flags.md#traffic-show).
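
To wait for a task to finish in a script, one option is to poll `tiproxyctl traffic show` until no task reports `running`. The following sketch assumes the output format shown above. The `has_running_task` and `wait_for_traffic_task` helpers and the 30-second interval are illustrative choices, and the `status` field is matched with `grep` instead of a JSON parser, so a JSON-aware tool is a better fit if one is available.

```shell
# Hypothetical helper: read `traffic show` output from stdin and succeed
# if any task has "status": "running".
has_running_task() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"running"'
}

# Hypothetical helper: block until no capture or replay task is running.
# This only defines the function; call wait_for_traffic_task when needed.
wait_for_traffic_task() {
  while tiproxyctl traffic show --host 10.0.1.10 --port 3080 | has_running_task; do
    sleep 30
  done
}
```

After the loop exits, run `tiproxyctl traffic show` once more to check whether the task ended as `done` or `canceled`.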

To cancel the current capture or replay task, use the [`tiproxyctl traffic cancel`](/tiproxy/tiproxy-command-line-flags.md#traffic-cancel) command:

```shell
tiproxyctl traffic cancel --host 10.0.1.10 --port 3080
```

For more information, see [`tiproxyctl traffic cancel`](/tiproxy/tiproxy-command-line-flags.md#traffic-cancel).

## Limitations

- TiProxy only supports replaying traffic files captured by TiProxy and does not support other file formats. Therefore, the production cluster must first use TiProxy to capture traffic.
- TiProxy traffic replay does not support filtering by SQL type, so both DML and DDL statements are replayed. Therefore, you need to restore the cluster data to its pre-replay state before replaying the traffic again.
- TiProxy traffic replay does not support testing [Resource Control](/tidb-resource-control.md) and [privilege management](/privilege-management.md) because TiProxy replays all traffic using the same username.
- TiProxy does not support replaying [`LOAD DATA`](/sql-statements/sql-statement-load-data.md) statements.