diff --git a/tiproxy/tiproxy-grafana.md b/tiproxy/tiproxy-grafana.md index 80200ff7bb82b..ec4375a0e080f 100644 --- a/tiproxy/tiproxy-grafana.md +++ b/tiproxy/tiproxy-grafana.md @@ -21,27 +21,42 @@ TiProxy has four panel groups. The metrics on these panels indicate the current ## Server -- CPU Usage: The CPU utilization per TiProxy instance. -- Memory Usage: The memory usage per TiProxy instance. -- Uptime: The runtime of TiProxy since last restart. -- Goroutine Count: Running goroutine count of TiProxy instance. -- Connection Count: SQL connections that TiProxy instance serve. +- CPU Usage: the CPU utilization of each TiProxy instance +- Memory Usage: the memory usage of each TiProxy instance +- Uptime: the runtime of each TiProxy instance since last restart +- Connection Count: the number of clients connected to each TiProxy instance +- Create Connection OPM: the number of creating connections on each TiProxy instance every minute +- Disconnection OPM: the number of disconnections for each reason every minute. Reasons include: + - success: the client disconnects normally + - client network break: the client does not send a `QUIT` command before it disconnects. It may also be caused by a network problem or the client shutting down + - client handshake fail: the client fails to handshake with TiProxy + - auth fail: the access is denied by TiDB + - SQL error: TiDB returns other SQL errors + - proxy shutdown: TiProxy is shutting down + - malformed packet: TiProxy fails to parse the MySQL packet + - get backend fail: TiProxy fails to find an available backend for the connection + - proxy error: other TiProxy errors + - backend network break: fails to read from or write to the TiDB. This may be caused by a network problem or the TiDB server shutting down + - backend handshake fail: TiProxy fails to handshake with the TiDB server +- Goroutine Count: the number of Goroutines on each TiProxy instance ## Query-Summary -- Duration: average, p95, p99 SQL statement execution duration. -- CPS by Instance: command per second of all TiProxy instances. -- CPS by Backend: command per second of all TiDB instances. -- CPS by CMD: command per second grouped by SQL command type. +- Duration: average, p95, p99 SQL statement execution duration. It includes the duration of SQL statement execution on TiDB servers, so it is higher than the duration on the TiDB Grafana panel +- P99 Duration By Instance: p99 statement execution duration of each TiProxy instance +- P99 Duration By Backend: p99 statement execution duration of the statements that are executed on each TiDB instance +- CPS by Instance: command per second of each TiProxy instance +- CPS by Backend: command per second of each TiDB instance +- CPS by CMD: command per second grouped by SQL command type ## Balance -- Backend Connections: connection counts between each TiDB instance and each TiProxy instance. -- Session Migrations: session migrations happened on all TiProxy instances, recording sessions on which TiDB instance migrated to the other. +- Backend Connections: connection counts between each TiDB instance and each TiProxy instance. For example, `10.24.31.1:6000 | 10.24.31.2:4000` indicates the connections between TiProxy instance `10.24.31.1:6000` and TiDB instance `10.24.31.2:4000` +- Session Migration OPM: the number of session migrations that happened every minute, recording sessions on which TiDB instance migrated to the other. For example, `succeed: 10.24.31.2:4000 => 10.24.31.3:4000` indicates the number of sessions that are successfully migrated from TiDB instance `10.24.31.2:4000` to TiDB instance `10.24.31.3:4000` - Session Migration Duration: average, p95, p99 session migration duration. ## Backend -- Get Backend Count: how many times did TiProxy instances try to connect the backend. -- Get Backend Duration: average, p95, p99 duration of connecting backend, useful for debugging why TiProxy can not establish healthy connections. -- Ping Backend Duration: latencies between each TiDB instance and each TiProxy instance. +- Get Backend Duration: the average, p95, p99 duration of TiProxy connecting to a TiDB instance +- Ping Backend Duration: the network latency between each TiProxy instance and each TiProxy instance. For example, `10.24.31.1:6000 | 10.24.31.2:4000` indicates the network latency between TiProxy instance `10.24.31.1:6000` and TiDB instance `10.24.31.2:4000` +- Health Check Cycle: the duration of a cycle of the health check between a TiProxy instance and all TiDB instances. For example, `10.24.31.1:6000` indicates the duration of the latest health check that TiProxy instance `10.24.31.1:6000` executes on all the TiDB instances. If this duration is higher than 3 seconds, TiProxy may not be timely to refresh the backend TiDB list