Add documents for information_schema.tidb_index_usage and sys.schema_unused_index. #16511

Merged 6 commits on Mar 19, 2024
2 changes: 2 additions & 0 deletions TOC.md
@@ -961,6 +961,7 @@
- [`TIDB_HOT_REGIONS`](/information-schema/information-schema-tidb-hot-regions.md)
- [`TIDB_HOT_REGIONS_HISTORY`](/information-schema/information-schema-tidb-hot-regions-history.md)
- [`TIDB_INDEXES`](/information-schema/information-schema-tidb-indexes.md)
- [`TIDB_INDEX_USAGE`](/information-schema/information-schema-tidb-index-usage.md)
- [`TIDB_SERVERS_INFO`](/information-schema/information-schema-tidb-servers-info.md)
- [`TIDB_TRX`](/information-schema/information-schema-tidb-trx.md)
- [`TIFLASH_REPLICA`](/information-schema/information-schema-tiflash-replica.md)
@@ -977,6 +978,7 @@
- PERFORMANCE_SCHEMA
- [Overview](/performance-schema/performance-schema.md)
- [`SESSION_CONNECT_ATTRS`](/performance-schema/performance-schema-session-connect-attrs.md)
- [`SYS`](/sys-schema.md)
- [Metadata Lock](/metadata-lock.md)
- [TiDB DDL V2](/ddl-v2.md)
- UI
93 changes: 93 additions & 0 deletions information-schema/information-schema-tidb-index-usage.md
@@ -0,0 +1,93 @@
---
title: TIDB_INDEX_USAGE
summary: Learn about the `TIDB_INDEX_USAGE` INFORMATION_SCHEMA table.
---

# TIDB_INDEX_USAGE

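Starting from v8.0.0, TiDB provides the `TIDB_INDEX_USAGE` table, which you can use to view access statistics for all indexes on the current TiDB node. By default, TiDB maintains these index access statistics while executing SQL statements. To disable this, modify the configuration item [`instance.tidb_enable_collect_execution_info`](/tidb-configuration-file.md#tidb_enable_collect_execution_info) or the system variable [`tidb_enable_collect_execution_info`](/system-variables.md#tidb_enable_collect_execution_info).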
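
As a minimal sketch of the system-variable route, the statements below check the current setting and then turn collection off and back on. The use of `SET GLOBAL` here is an assumption about the variable's scope on your version; check the system variable reference before relying on it.

```sql
-- Check whether execution info collection (and with it, index usage tracking) is on.
SHOW VARIABLES LIKE 'tidb_enable_collect_execution_info';

-- Turn collection off.
-- Assumption: the variable can be set with SET GLOBAL on your TiDB version.
SET GLOBAL tidb_enable_collect_execution_info = OFF;

-- Turn it back on (ON is the documented default).
SET GLOBAL tidb_enable_collect_execution_info = ON;
```

Note that the same switch also controls whether operator execution information is written to the slow query log, so turning it off affects both features.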

```sql
USE INFORMATION_SCHEMA;
DESC TIDB_INDEX_USAGE;
```

```sql
+--------------------------+-------------+------+------+---------+-------+
| Field                    | Type        | Null | Key  | Default | Extra |
+--------------------------+-------------+------+------+---------+-------+
| TABLE_SCHEMA             | varchar(64) | YES  |      | NULL    |       |
| TABLE_NAME               | varchar(64) | YES  |      | NULL    |       |
| INDEX_NAME               | varchar(64) | YES  |      | NULL    |       |
| QUERY_TOTAL              | bigint(21)  | YES  |      | NULL    |       |
| KV_REQ_TOTAL             | bigint(21)  | YES  |      | NULL    |       |
| ROWS_ACCESS_TOTAL        | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_0      | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_0_1    | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_1_10   | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_10_20  | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_20_50  | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_50_100 | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_100    | bigint(21)  | YES  |      | NULL    |       |
| LAST_ACCESS_TIME         | datetime    | YES  |      | NULL    |       |
+--------------------------+-------------+------+------+---------+-------+
14 rows in set (0.00 sec)
```

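The columns in the `TIDB_INDEX_USAGE` table are described as follows:

* `TABLE_SCHEMA`: The name of the database that contains the table to which the index belongs.
* `TABLE_NAME`: The name of the table to which the index belongs.
* `INDEX_NAME`: The name of the index.
* `QUERY_TOTAL`: The total number of statements that access the index.
* `KV_REQ_TOTAL`: The total number of KV requests generated when accessing the index.
* `ROWS_ACCESS_TOTAL`: The total number of rows scanned when accessing the index.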
Review comment (Contributor):

For large tables, is there a risk of overflow as this field keeps accumulating? Does this need a special note?

Reply (Member Author):

Assuming the cluster scans 0xFFFFFFFF rows per second in total, a uint64 would only overflow after about 0xFFFFFFFF s = 136.19252 years, so there seems to be no real risk for now 🤔.

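* `PERCENTAGE_ACCESS_0`: The number of times the row access ratio (the percentage of accessed rows out of the total number of rows in the table) is 0.
* `PERCENTAGE_ACCESS_0_1`: The number of times the row access ratio is between 0 and 1%.
* `PERCENTAGE_ACCESS_1_10`: The number of times the row access ratio is between 1% and 10%.
* `PERCENTAGE_ACCESS_10_20`: The number of times the row access ratio is between 10% and 20%.
* `PERCENTAGE_ACCESS_20_50`: The number of times the row access ratio is between 20% and 50%.
* `PERCENTAGE_ACCESS_50_100`: The number of times the row access ratio is between 50% and 100%.
* `PERCENTAGE_ACCESS_100`: The number of times the row access ratio is 100%.
* `LAST_ACCESS_TIME`: The time of the most recent access to the index.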
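
To illustrate how these columns might be read together, the following sketch lists the usage counters for every index on one table of the current TiDB node. The schema name `test` and table name `t` are hypothetical placeholders, not names from this document.

```sql
-- Per-index usage counters on the current TiDB node.
-- `test` and `t` are hypothetical names; substitute your own schema and table.
SELECT
    INDEX_NAME,
    QUERY_TOTAL,
    KV_REQ_TOTAL,
    ROWS_ACCESS_TOTAL,
    PERCENTAGE_ACCESS_100,  -- how often a query touched every row of the table
    LAST_ACCESS_TIME
FROM INFORMATION_SCHEMA.TIDB_INDEX_USAGE
WHERE TABLE_SCHEMA = 'test'
  AND TABLE_NAME = 't'
ORDER BY QUERY_TOTAL DESC;
```

An index whose accesses mostly fall into the high-percentage buckets is probably not very selective for the queries that use it, which may be worth a closer look.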

## CLUSTER_TIDB_INDEX_USAGE

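The `TIDB_INDEX_USAGE` table only provides index access statistics for a single TiDB node. To view index access statistics for all TiDB nodes in the cluster, query the `CLUSTER_TIDB_INDEX_USAGE` table.

Compared with the query result of the `TIDB_INDEX_USAGE` table, the query result of the `CLUSTER_TIDB_INDEX_USAGE` table contains an additional `INSTANCE` field. The `INSTANCE` field shows the IP address and port of each node in the cluster, which distinguishes the statistics of different nodes.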

```sql
USE INFORMATION_SCHEMA;
DESC CLUSTER_TIDB_INDEX_USAGE;
```

The output is as follows:

```sql
+--------------------------+-------------+------+------+---------+-------+
| Field                    | Type        | Null | Key  | Default | Extra |
+--------------------------+-------------+------+------+---------+-------+
| INSTANCE                 | varchar(64) | YES  |      | NULL    |       |
| TABLE_SCHEMA             | varchar(64) | YES  |      | NULL    |       |
| TABLE_NAME               | varchar(64) | YES  |      | NULL    |       |
| INDEX_NAME               | varchar(64) | YES  |      | NULL    |       |
| QUERY_TOTAL              | bigint(21)  | YES  |      | NULL    |       |
| KV_REQ_TOTAL             | bigint(21)  | YES  |      | NULL    |       |
| ROWS_ACCESS_TOTAL        | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_0      | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_0_1    | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_1_10   | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_10_20  | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_20_50  | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_50_100 | bigint(21)  | YES  |      | NULL    |       |
| PERCENTAGE_ACCESS_100    | bigint(21)  | YES  |      | NULL    |       |
| LAST_ACCESS_TIME         | datetime    | YES  |      | NULL    |       |
+--------------------------+-------------+------+------+---------+-------+
15 rows in set (0.00 sec)
```
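
Because each TiDB node reports its own row per index, a cluster-wide picture needs an aggregation over `INSTANCE`. The query below is one possible sketch of that roll-up; the output aliases are illustrative choices, not names defined by TiDB.

```sql
-- Roll up per-node statistics into one row per index across the cluster.
SELECT
    TABLE_SCHEMA,
    TABLE_NAME,
    INDEX_NAME,
    COUNT(*)               AS reporting_nodes,    -- nodes that returned a row for this index
    SUM(QUERY_TOTAL)       AS query_total,
    SUM(ROWS_ACCESS_TOTAL) AS rows_access_total,
    MAX(LAST_ACCESS_TIME)  AS last_access_time    -- most recent access on any node
FROM INFORMATION_SCHEMA.CLUSTER_TIDB_INDEX_USAGE
GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
ORDER BY query_total DESC;
```

`MAX(LAST_ACCESS_TIME)` is `NULL` only when no node has recorded an access, which is the same condition an unused-index check relies on.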

## Limitations

- The data in the `TIDB_INDEX_USAGE` table might be delayed by up to 5 minutes.
- After TiDB restarts, the data in the `TIDB_INDEX_USAGE` table is cleared.
2 changes: 2 additions & 0 deletions information-schema/information-schema.md
@@ -70,6 +70,7 @@ Information Schema provides an ANSI-standard way of viewing system metadata.
| `CLUSTER_SLOW_QUERY` | Provides a cluster-level view of the `SLOW_QUERY` table. |
| `CLUSTER_STATEMENTS_SUMMARY` | Provides a cluster-level view of the `STATEMENTS_SUMMARY` table. |
| `CLUSTER_STATEMENTS_SUMMARY_HISTORY` | Provides a cluster-level view of the `STATEMENTS_SUMMARY_HISTORY` table. |
| `CLUSTER_TIDB_INDEX_USAGE` | Provides a cluster-level view of the `TIDB_INDEX_USAGE` table. |
| `CLUSTER_TIDB_TRX` | Provides a cluster-level view of the `TIDB_TRX` table. |
| [`CLUSTER_SYSTEMINFO`](/information-schema/information-schema-cluster-systeminfo.md) | Provides details about the kernel parameter configuration of servers in the cluster. |
| [`DATA_LOCK_WAITS`](/information-schema/information-schema-data-lock-waits.md) | Provides lock-waiting information on TiKV servers. |
@@ -92,6 +93,7 @@ Information Schema provides an ANSI-standard way of viewing system metadata.
| [`TIDB_HOT_REGIONS`](/information-schema/information-schema-tidb-hot-regions.md) | Provides statistics about which Regions are accessed most often. |
| [`TIDB_HOT_REGIONS_HISTORY`](/information-schema/information-schema-tidb-hot-regions-history.md)| Provides historical statistics about which Regions are accessed most often. |
| [`TIDB_INDEXES`](/information-schema/information-schema-tidb-indexes.md) | Provides index information about TiDB tables. |
| [`TIDB_INDEX_USAGE`](/information-schema/information-schema-tidb-index-usage.md) | Provides index access statistics on a TiDB node. |
| [`TIDB_SERVERS_INFO`](/information-schema/information-schema-tidb-servers-info.md) | Provides a list of TiDB servers. |
| [`TIDB_TRX`](/information-schema/information-schema-tidb-trx.md) | Provides information about the transactions being executed on a TiDB node. |
| [`TIFLASH_REPLICA`](/information-schema/information-schema-tiflash-replica.md) | Provides details about TiFlash replicas. |
54 changes: 54 additions & 0 deletions sys-schema.md
@@ -0,0 +1,54 @@
---
title: sys Schema
summary: Learn about the TiDB `sys` system database.
---

# `sys` Schema

Starting from v8.0.0, TiDB provides the `sys` schema. You can use the tables and views in the `sys` system database to understand the data in TiDB system tables, [`INFORMATION_SCHEMA`](/information-schema/information-schema.md) tables, and [`PERFORMANCE_SCHEMA`](/performance-schema/performance-schema.md) tables.

## 手动创建 `sys` Schema 和视图

For clusters upgraded from a version earlier than v8.0.0, the `sys` schema and the views in it are not created automatically. You can create them manually with the following SQL statements:

```sql
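-- Create the sys schema if the upgrade did not create it.
CREATE DATABASE IF NOT EXISTS sys;
-- List indexes with no recorded access on any TiDB node.
CREATE OR REPLACE VIEW sys.schema_unused_indexes AS
  SELECT
    table_schema AS object_schema,
    table_name AS object_name,
    index_name
  FROM information_schema.cluster_tidb_index_usage
  WHERE
    -- Exclude system schemas and primary keys.
    table_schema NOT IN ('sys', 'mysql', 'INFORMATION_SCHEMA', 'PERFORMANCE_SCHEMA') AND
    index_name != 'PRIMARY'
  GROUP BY table_schema, table_name, index_name
  HAVING
    -- NULL only if last_access_time is NULL on every node.
    SUM(last_access_time) IS NULL;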
```
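
After running the statements above, one way to confirm that the view now exists is to look it up in the standard `INFORMATION_SCHEMA.VIEWS` table. This is only a sanity check sketched under the assumption that the view was created in the `sys` database as shown.

```sql
-- Confirm that the view was created in the sys database.
SELECT TABLE_SCHEMA, TABLE_NAME
FROM INFORMATION_SCHEMA.VIEWS
WHERE TABLE_SCHEMA = 'sys';
```

If the view was created, the result should include a row for it in the `sys` schema.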

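## `schema_unused_indexes`

`schema_unused_indexes` records the indexes that have not been used since TiDB was last started. It contains the following columns:

- `OBJECT_SCHEMA`: The name of the database that contains the table to which the index belongs.
- `OBJECT_NAME`: The name of the table to which the index belongs.
- `INDEX_NAME`: The name of the index.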

```sql
USE SYS;
DESC SCHEMA_UNUSED_INDEXES;
```

The output is as follows:

```sql
+---------------+-------------+------+------+---------+-------+
| Field         | Type        | Null | Key  | Default | Extra |
+---------------+-------------+------+------+---------+-------+
| object_schema | varchar(64) | YES  |      | NULL    |       |
| object_name   | varchar(64) | YES  |      | NULL    |       |
| index_name    | varchar(64) | YES  |      | NULL    |       |
+---------------+-------------+------+------+---------+-------+
3 rows in set (0.00 sec)
```
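
As a usage sketch, the view can be queried like any other table. The schema name `test` below is a hypothetical placeholder.

```sql
-- List indexes with no recorded access in one schema.
-- `test` is a hypothetical schema name; substitute your own.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = 'test'
ORDER BY object_name, index_name;
```

Since the underlying statistics are cleared when TiDB restarts and might lag by up to 5 minutes, treat a row here as "no recorded access since the last restart" rather than proof that the index is never needed.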
2 changes: 1 addition & 1 deletion system-variables.md
@@ -1719,7 +1719,7 @@ mysql> SELECT job_info FROM mysql.analyze_jobs ORDER BY end_time DESC LIMIT 1;
- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
- Type: Boolean
- Default value: `ON`
- This variable controls whether to record the execution information of each operator in the slow query log.
- This variable controls whether to record the execution information of each operator in the slow query log, and whether to maintain [index access statistics](/information-schema/information-schema-tidb-index-usage.md).

### `tidb_enable_column_tracking` <span class="version-mark">New in v5.4.0</span>

2 changes: 1 addition & 1 deletion tidb-configuration-file.md
@@ -829,7 +829,7 @@ Configuration items related to TiDB service status.

### `tidb_enable_collect_execution_info`

+ Controls whether to record the execution information of each operator in the slow query log.
+ Controls whether to record the execution information of each operator in the slow query log, and whether to maintain [index access statistics](/information-schema/information-schema-tidb-index-usage.md).
+ Default value: true
+ Before v6.1.0, this feature is set by the configuration item `enable-collect-execution-info`.
