From 4fd9607d18d1918d12b00f705ad814cf1447f64c Mon Sep 17 00:00:00 2001 From: "mergify[bot]" <37929162+mergify[bot]@users.noreply.github.com> Date: Fri, 6 Sep 2024 07:31:24 +0000 Subject: [PATCH] [Doc] Add Dict Obj Docs (backport #50788) (#50791) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: 絵空事スピリット --- .../dict-functions/dictionary_get.md | 111 ++++++++++++ .../dictionary/CANCEL_REFRESH_DICTIONARY.md | 28 +++ .../dictionary/CREATE_DICTIONARY.md | 160 ++++++++++++++++++ .../dictionary/DROP_DICTIONARY.md | 63 +++++++ .../dictionary/REFRESH_DICTIONARY.md | 28 +++ .../dictionary/SHOW_DICTIONARY.md | 76 +++++++++ .../dict-functions/dictionary_get.md | 111 ++++++++++++ .../dictionary/CANCEL_REFRESH_DICTIONARY.md | 28 +++ .../dictionary/CREATE_DICTIONARY.md | 159 +++++++++++++++++ .../dictionary/DROP_DICTIONARY.md | 63 +++++++ .../dictionary/REFRESH_DICTIONARY.md | 28 +++ .../dictionary/SHOW_DICTIONARY.md | 76 +++++++++ 12 files changed, 931 insertions(+) create mode 100644 docs/en/sql-reference/sql-functions/dict-functions/dictionary_get.md create mode 100644 docs/en/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md create mode 100644 docs/en/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md create mode 100644 docs/en/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md create mode 100644 docs/en/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md create mode 100644 docs/en/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md create mode 100644 docs/zh/sql-reference/sql-functions/dict-functions/dictionary_get.md create mode 100644 docs/zh/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md create mode 100644 docs/zh/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md create mode 100644 docs/zh/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md create mode 100644 
docs/zh/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md create mode 100644 docs/zh/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md diff --git a/docs/en/sql-reference/sql-functions/dict-functions/dictionary_get.md b/docs/en/sql-reference/sql-functions/dict-functions/dictionary_get.md new file mode 100644 index 0000000000000..51fc00a03f6f6 --- /dev/null +++ b/docs/en/sql-reference/sql-functions/dict-functions/dictionary_get.md @@ -0,0 +1,111 @@ +--- +displayed_sidebar: docs +--- + +# dictionary_get + + + +Queries the value mapped to the key in a dictionary object. + +## Syntax + +```SQL +dictionary_get('dictionary_object_name', key_expression_list, [NULL_IF_NOT_EXIST]) + +key_expression_list ::= + key_expression [, ...] + +key_expression ::= + column_name | const_value +``` + +## Parameters + +- `dictionary_object_name`: The name of the dictionary object. +- `key_expression_list`: A list of expressions for all key columns. It can be a list of column names or a list of values. +- `NULL_IF_NOT_EXIST` (Optional): Whether to return NULL if the key does not exist in the dictionary cache. Valid values: + - `true`: NULL is returned if the key does not exist. + - `false` (Default): An exception is thrown if the key does not exist. + +## Returns + +Returns the values of value columns as a STRUCT type. Therefore, you can use `[N]` or `.<column_name>` to specify a particular column's value. `N` represents the column's position, starting from 1. + +## Examples + +The following examples use the dataset from the examples of [dict_mapping](dict_mapping.md). + +- Example 1: Query the values of the value column mapped to the key column `order_uuid` in the dictionary object `dict_obj`.
+ + ```Plain + MySQL > SELECT dictionary_get('dict_obj', order_uuid) FROM dict; + +--------------------+ + | DICTIONARY_GET | + +--------------------+ + | {"order_id_int":1} | + | {"order_id_int":3} | + | {"order_id_int":2} | + +--------------------+ + 3 rows in set (0.02 sec) + ``` + +- Example 2: Query the value of the value column mapped to key `a1` in the dictionary object `dict_obj`. + + ```Plain + MySQL > SELECT dictionary_get("dict_obj", "a1"); + +--------------------+ + | DICTIONARY_GET | + +--------------------+ + | {"order_id_int":1} | + +--------------------+ + 1 row in set (0.01 sec) + ``` + +- Example 3: Query the values of the value columns mapped to key `1` in the dictionary object `dimension_obj`. + + ```Plain + MySQL > SELECT dictionary_get("dimension_obj", 1); + +-----------------------------------------------------------------------------------------------------------------+ + | DICTIONARY_GET | + +-----------------------------------------------------------------------------------------------------------------+ + | {"ProductName":"T-Shirt","Category":"Apparel","SubCategory":"Shirts","Brand":"BrandA","Color":"Red","Size":"M"} | + +-----------------------------------------------------------------------------------------------------------------+ + 1 row in set (0.01 sec) + ``` + +- Example 4: Query the value of the first value column mapped to key `1` in the dictionary object `dimension_obj`. + + ```Plain + MySQL > SELECT dictionary_get("dimension_obj", 1)[1]; + +-------------------+ + | DICTIONARY_GET[1] | + +-------------------+ + | T-Shirt | + +-------------------+ + 1 row in set (0.01 sec) + ``` + +- Example 5: Query the value of the second value column mapped to key `1` in the dictionary object `dimension_obj`. 
+ + ```Plain + MySQL > SELECT dictionary_get("dimension_obj", 1)[2]; + +-------------------+ + | DICTIONARY_GET[2] | + +-------------------+ + | Apparel | + +-------------------+ + 1 row in set (0.01 sec) + ``` + +- Example 6: Query the value of the `ProductName` value column mapped to key `1` in the dictionary object `dimension_obj`. + + ```Plain + MySQL > SELECT dictionary_get("dimension_obj", 1).ProductName; + +----------------------------+ + | DICTIONARY_GET.ProductName | + +----------------------------+ + | T-Shirt | + +----------------------------+ + 1 row in set (0.01 sec) + ``` \ No newline at end of file diff --git a/docs/en/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md b/docs/en/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md new file mode 100644 index 0000000000000..70d7a42072fee --- /dev/null +++ b/docs/en/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md @@ -0,0 +1,28 @@ +--- +displayed_sidebar: docs +--- + +# CANCEL REFRESH DICTIONARY + + + +Cancels the refresh of a dictionary object. + +## Syntax + +```SQL +CANCEL REFRESH DICTIONARY <dictionary_object_name> +``` + +## Parameters + +- `dictionary_object_name`: The name of the dictionary object that is in the `REFRESHING` state. + +## Examples + +Cancel the refresh of the dictionary object `dict_obj`. + +```Plain +MySQL > CANCEL REFRESH DICTIONARY dict_obj; +Query OK, 0 rows affected (0.01 sec) +``` \ No newline at end of file diff --git a/docs/en/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md b/docs/en/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md new file mode 100644 index 0000000000000..99c474f949ebb --- /dev/null +++ b/docs/en/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md @@ -0,0 +1,160 @@ +--- +displayed_sidebar: docs +--- + +# CREATE DICTIONARY + + + +Creates a dictionary object based on an original object.
The dictionary object organizes the key-value mappings from the original object in the form of a hash table and is cached in the memory of all BE nodes. It can be viewed as a cached table. + +**Advantages** + +- **Richer original objects for dictionary objects**: When using `dictionary_get()` to query dictionary objects, the original object can be a table of any type, an asynchronous materialized view, or a logical view. However, when using `dict_mapping()` to query dictionary tables, the dictionary tables can only be primary key tables. +- **Fast query speed**: Since the dictionary object is a hash table that is fully cached in the memory of all BE nodes, querying the dictionary object for a mapping is an in-memory hash table lookup. Therefore, the query speed is very fast. +- **Supports multiple value columns**: Internally, the dictionary object encodes multiple value columns into a single STRUCT type column. For queries based on a key, multiple values are returned together. Therefore, the dictionary object can serve as a dimension table where each key (usually a unique identifier) corresponds to multiple values (descriptive attributes). +- **Ensures consistent snapshot reads**: The dictionary snapshot obtained within the same transaction is consistent, ensuring that the query results from the dictionary object do not change during the same query or load process. + +## Syntax + +```SQL +CREATE DICTIONARY <dictionary_object_name> USING <dictionary_source> +( + column_name KEY, [..., column_name KEY,] + column_name VALUE[, ..., column_name VALUE] +) +[PROPERTIES ("key"="value", ...)]; +``` + +## Parameters + +- `dictionary_object_name`: The name of the dictionary object. The dictionary object is effective globally and does not belong to a specific database. +- `dictionary_source`: The name of the original object on which the dictionary object is based. The original object can be a table of any type, an asynchronous materialized view, or a logical view.
+- Definition of columns in the dictionary object: To preserve the key-value mapping maintained in the dictionary table, you need to use the `KEY` and `VALUE` keywords in the dictionary object's columns to specify the keys and their mapped values. + - The column names `column_name` in the dictionary object must be consistent with those in the dictionary table. + - The data types for key and value columns in the dictionary object are limited to boolean, integer, string, and date types. + - The key column in the original object must ensure uniqueness. +- Related properties of dictionary objects (`PROPERTIES`): + - `dictionary_warm_up`: The method to cache data into the dictionary object on each BE node. Valid values: `TRUE` (default) or `FALSE`. If the parameter is set to `TRUE`, data is automatically cached into the dictionary object after its creation; if the parameter is set to `FALSE`, you need to manually refresh the dictionary object to cache the data. + - `dictionary_memory_limit`: The maximum memory the dictionary object can occupy on each BE node. Unit: bytes. Default value: 2,000,000,000 bytes (2 GB). + - `dictionary_refresh_interval`: The interval for periodically refreshing the dictionary object. Unit: seconds. Default value: `0`. A value `<=0` means no automatic refresh. + - `dictionary_read_latest`: Whether to only query the latest dictionary object, mainly affecting the dictionary object queried during refresh. Valid values: `TRUE` or `FALSE` (default). If the parameter is set to `TRUE`, the dictionary object cannot be queried during refresh because the latest dictionary object is still being refreshed. If the parameter is set to `FALSE`, the previously successfully cached dictionary object can be queried during refresh. + - `dictionary_ignore_failed_refresh`: Whether to automatically roll back to the last successfully cached dictionary object if the refresh fails. Valid values: `TRUE` or `FALSE` (default). 
If the parameter is set to `TRUE`, it automatically rolls back to the last successfully cached dictionary object when the refresh fails. If the parameter is set to `FALSE`, the dictionary object status is set to `CANCELLED` when the refresh fails. + +## Usage notes + +- The dictionary object is fully cached in the memory of each BE node, so it consumes relatively more memory. +- Even if the original object is deleted, the dictionary object created based on it still exists. You need to manually DROP the dictionary object. + +## Examples + +**Example 1: Create a simple dictionary object to replace the original dictionary table.** + +Take the following dictionary table as an example and insert test data. + +```Plain +MySQL > CREATE TABLE dict ( + order_uuid STRING, + order_id_int BIGINT AUTO_INCREMENT +) +PRIMARY KEY (order_uuid) +DISTRIBUTED BY HASH (order_uuid); +Query OK, 0 rows affected (0.02 sec) +MySQL > INSERT INTO dict (order_uuid) VALUES ('a1'), ('a2'), ('a3'); +Query OK, 3 rows affected (0.12 sec) +{'label':'insert_9e60b0e4-89fa-11ee-a41f-b22a2c00f66b', 'status':'VISIBLE', 'txnId':'15029'} +MySQL > SELECT * FROM dict; ++------------+--------------+ +| order_uuid | order_id_int | ++------------+--------------+ +| a1 | 1 | +| a2 | 2 | +| a3 | 3 | ++------------+--------------+ +3 rows in set (0.01 sec) +``` + +Create a dictionary object based on the mappings in this dictionary table. + +```Plain +MySQL > CREATE DICTIONARY dict_obj USING dict + (order_uuid KEY, + order_id_int VALUE); +Query OK, 0 rows affected (0.00 sec) +``` + +For future queries of the mappings in the dictionary table, you can directly query the dictionary object instead of the dictionary table. For example, query the value mapped by key `a1`. 
+ +```Plain +MySQL > SELECT dictionary_get("dict_obj", "a1"); ++--------------------+ +| DICTIONARY_GET | ++--------------------+ +| {"order_id_int":1} | ++--------------------+ +1 row in set (0.01 sec) +``` + +**Example 2: Create a dictionary object to replace the original dimension table** + +Take the following dimension table as an example and insert test data. + +```Plain +MySQL > CREATE TABLE ProductDimension ( + ProductKey BIGINT AUTO_INCREMENT, + ProductName VARCHAR(100) NOT NULL, + Category VARCHAR(50), + SubCategory VARCHAR(50), + Brand VARCHAR(50), + Color VARCHAR(20), + Size VARCHAR(20) +) +PRIMARY KEY (ProductKey) +DISTRIBUTED BY HASH (ProductKey); +MySQL > INSERT INTO ProductDimension (ProductName, Category, SubCategory, Brand, Color, Size) +VALUES + ('T-Shirt', 'Apparel', 'Shirts', 'BrandA', 'Red', 'M'), + ('Jeans', 'Apparel', 'Pants', 'BrandB', 'Blue', 'L'), + ('Running Shoes', 'Footwear', 'Athletic', 'BrandC', 'Black', '10'), + ('Jacket', 'Apparel', 'Outerwear', 'BrandA', 'Green', 'XL'), + ('Baseball Cap', 'Accessories', 'Hats', 'BrandD', 'White', 'OneSize'); +Query OK, 5 rows affected (0.48 sec) +{'label':'insert_e938481f-181e-11ef-a6a9-00163e19e14e', 'status':'VISIBLE', 'txnId':'50'} +MySQL > SELECT * FROM ProductDimension; ++------------+---------------+-------------+-------------+--------+-------+---------+ +| ProductKey | ProductName | Category | SubCategory | Brand | Color | Size | ++------------+---------------+-------------+-------------+--------+-------+---------+ +| 1 | T-Shirt | Apparel | Shirts | BrandA | Red | M | +| 2 | Jeans | Apparel | Pants | BrandB | Blue | L | +| 3 | Running Shoes | Footwear | Athletic | BrandC | Black | 10 | +| 4 | Jacket | Apparel | Outerwear | BrandA | Green | XL | +| 5 | Baseball Cap | Accessories | Hats | BrandD | White | OneSize | ++------------+---------------+-------------+-------------+--------+-------+---------+ +5 rows in set (0.02 sec) +``` + +Create a dictionary object to replace the original 
dimension table. + +```Plain +MySQL > CREATE DICTIONARY dimension_obj USING ProductDimension + (ProductKey KEY, + ProductName VALUE, + Category VALUE, + SubCategory VALUE, + Brand VALUE, + Color VALUE, + Size VALUE); +Query OK, 0 rows affected (0.00 sec) +``` + +For future queries of dimension values, you can directly query the dictionary object instead of the dimension table to obtain dimension values. For example, query the values mapped by key `1`. + +```Plain +MySQL > SELECT dictionary_get("dimension_obj", 1); ++-----------------------------------------------------------------------------------------------------------------+ +| DICTIONARY_GET | ++-----------------------------------------------------------------------------------------------------------------+ +| {"ProductName":"T-Shirt","Category":"Apparel","SubCategory":"Shirts","Brand":"BrandA","Color":"Red","Size":"M"} | ++-----------------------------------------------------------------------------------------------------------------+ +1 row in set (0.01 sec) +``` \ No newline at end of file diff --git a/docs/en/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md b/docs/en/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md new file mode 100644 index 0000000000000..d8c06f0027cf2 --- /dev/null +++ b/docs/en/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md @@ -0,0 +1,63 @@ +--- +displayed_sidebar: docs +--- + +# DROP DICTIONARY + + + +Deletes a dictionary object or clears the cached data within a dictionary object. + +## Syntax + +```SQL +DROP DICTIONARY <dictionary_object_name> [ CACHE ] +``` + +## Parameters + +- `dictionary_object_name`: The name of the dictionary object. +- `CACHE`: If the keyword `CACHE` is specified, only the cached data within the dictionary object will be cleared. To restore the cached data later, you can manually refresh it. If the keyword `CACHE` is not specified, the dictionary object will be deleted. + +## Examples + +- Example 1: Clear only the cached data within the dictionary object. + +```Plain +DROP DICTIONARY dict_obj CACHE; +``` + + The dictionary object still exists. + +```Plain +MySQL > SHOW DICTIONARY dict_obj\G +*************************** 1.
row *************************** + DictionaryId: 5 + DictionaryName: dict_obj + DbName: example_db + dictionaryObject: dict + dictionaryKeys: [order_uuid] + dictionaryValues: [order_id_int] + status: UNINITIALIZED + lastSuccessRefreshTime: 2024-05-24 12:59:10 + lastSuccessFinishedTime: 2024-05-24 12:59:20 + nextSchedulableTime: disable auto schedule for refreshing + ErrorMessage: +approximated dictionaryMemoryUsage (Bytes): 172.26.80.55:8060 : 0 + 172.26.80.56:8060 : 0 + 172.26.80.57:8060 : 0 +1 row in set (0.00 sec) +``` + +- Example 2: Delete the dictionary object `dict_obj`. + +```Plain +DROP DICTIONARY dict_obj; +``` + + The dictionary object is completely deleted and no longer exists. + +```Plain +MySQL > SHOW DICTIONARY dict_obj; +Empty set (0.00 sec) +``` \ No newline at end of file diff --git a/docs/en/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md b/docs/en/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md new file mode 100644 index 0000000000000..a04a147e6485a --- /dev/null +++ b/docs/en/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md @@ -0,0 +1,28 @@ +--- +displayed_sidebar: docs +--- + +# REFRESH DICTIONARY + + + +Manually refreshes a dictionary object. Internally, the system queries the latest data from the original object and writes it into the dictionary object. + +## Syntax + +```SQL +REFRESH DICTIONARY <dictionary_object_name> +``` + +## Parameters + +- `dictionary_object_name`: The name of the dictionary object. + +## Examples + +Manually refresh the dictionary object `dict_obj`.
+ +```Plain +MySQL > REFRESH DICTIONARY dict_obj; +Query OK, 0 rows affected (0.01 sec) +``` \ No newline at end of file diff --git a/docs/en/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md b/docs/en/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md new file mode 100644 index 0000000000000..652b14e9883ea --- /dev/null +++ b/docs/en/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md @@ -0,0 +1,76 @@ +--- +displayed_sidebar: docs +--- + +# SHOW DICTIONARY + + + +Shows information about dictionary objects. + +## Syntax + +```SQL +SHOW DICTIONARY [ <dictionary_object_name> ] +``` + +## Parameters + +- `dictionary_object_name`: The name of the dictionary object. + +## Returns + +- `DictionaryId`: The unique ID of each dictionary object. +- `DictionaryName`: The name of the dictionary object. +- `DbName`: The database to which the original object belongs. +- `dictionarySource`: The name of the original object. +- `dictionaryKeys`: The key columns of the dictionary object. +- `dictionaryValues`: The value columns of the dictionary object. +- `state`: The refresh state of the dictionary object. + - `UNINITIALIZED`: The dictionary object has just been created and has not yet been refreshed. + - `REFRESHING`: The dictionary object is currently being refreshed. + - `COMMITTING`: Data has been cached in each BE node's dictionary object, and some final tasks are being performed, such as updating metadata. + - `FINISHED`: The refresh was successful and has ended. + - `CANCELLED`: An error occurred during the refresh process, and the refresh was canceled. You need to fix the issue based on the error information in `ErrorMessage`, and then refresh the dictionary object again. +- `lastSuccessRefreshTime`: The start time of the last successful refresh of the dictionary object. +- `lastSuccessFinishedTime`: The end time of the last successful refresh of the dictionary object. +- `nextSchedulableTime`: The next time the dictionary object data will be automatically refreshed.
+- `ErrorMessage`: Error message when refreshing the dictionary object fails. +- `approximated dictionaryMemoryUsage (Bytes)`: The estimated memory usage of the dictionary object cached on each BE node. + +## Examples + +```Plain +MySQL > SHOW DICTIONARY\G +*************************** 1. row *************************** + DictionaryId: 3 + DictionaryName: dict_obj + DbName: example_db + dictionaryObject: dict + dictionaryKeys: [order_uuid] + dictionaryValues: [order_id_int] + status: FINISHED + lastSuccessRefreshTime: 2024-05-23 16:12:03 + lastSuccessFinishedTime: 2024-05-23 16:12:12 + nextSchedulableTime: disable auto schedule for refreshing + ErrorMessage: +approximated dictionaryMemoryUsage (Bytes): 172.26.82.208:8060 : 30 + 172.26.82.210:8060 : 30 + 172.26.82.209:8060 : 30 +*************************** 2. row *************************** + DictionaryId: 4 + DictionaryName: dimension_obj + DbName: example_db + dictionaryObject: ProductDimension + dictionaryKeys: [ProductKey] + dictionaryValues: [ProductName, Category, SubCategory, Brand, Color, Size] + status: FINISHED + lastSuccessRefreshTime: 2024-05-23 16:12:41 + lastSuccessFinishedTime: 2024-05-23 16:12:42 + nextSchedulableTime: disable auto schedule for refreshing + ErrorMessage: +approximated dictionaryMemoryUsage (Bytes): 172.26.82.208:8060 : 270 + 172.26.82.210:8060 : 270 + 172.26.82.209:8060 : 270 +2 rows in set (0.00 sec) +``` \ No newline at end of file diff --git a/docs/zh/sql-reference/sql-functions/dict-functions/dictionary_get.md b/docs/zh/sql-reference/sql-functions/dict-functions/dictionary_get.md new file mode 100644 index 0000000000000..5921e2b339697 --- /dev/null +++ b/docs/zh/sql-reference/sql-functions/dict-functions/dictionary_get.md @@ -0,0 +1,111 @@ +--- +displayed_sidebar: docs +--- + +# dictionary_get + + + +查询字典对象中 key 映射的 value。 + +## 语法 + +```SQL +dictionary_get('dictionary_object_name', key_expression_list, [NULL_IF_NOT_EXIST]) + +key_expression_list ::= + key_expression [, ...] 
+ +key_expression ::= + column_name | const_value +``` + +## 参数说明 + +- `dictionary_object_name`:字典对象名称。 +- `key_expression_list`:所有 key 列的表达式列表,可以为列名列表或者值列表。 +- `NULL_IF_NOT_EXIST`(选填):当字典缓存中不存在该 key 时,是否返回 NULL。 + - `true`:Key 不存在时返回 NULL。 + - `false` (默认):Key 不存在时返回错误。 + +## 返回值说明 + +以 STRUCT 类型返回 value 列值。因此可以使用 `[N]` 或者 `.<column_name>` 来指定返回特定列的值。`N` 表示列的位置,起始位置从 1 开始。 + +## 示例 + +以下示例使用 [dict_mapping](dict_mapping.md) 示例中的数据集。 + +**示例一**:查询字典对象 `dict_obj` 中 key 列 `order_uuid` 所映射的 value 列值。 + +```Plain +MySQL > SELECT dictionary_get('dict_obj', order_uuid) FROM dict; ++--------------------+ +| DICTIONARY_GET | ++--------------------+ +| {"order_id_int":1} | +| {"order_id_int":3} | +| {"order_id_int":2} | ++--------------------+ +3 rows in set (0.02 sec) +``` + +**示例二**:查询字典对象 `dict_obj` 中 key `a1` 所映射的 value 列值。 + +```Plain +MySQL > SELECT dictionary_get("dict_obj", "a1"); ++--------------------+ +| DICTIONARY_GET | ++--------------------+ +| {"order_id_int":1} | ++--------------------+ +1 row in set (0.01 sec) +``` + +**示例三**:查询字典对象 `dimension_obj` 中 key `1` 所映射的 value 列值。 + +```Plain +MySQL > SELECT dictionary_get("dimension_obj", 1); ++-----------------------------------------------------------------------------------------------------------------+ +| DICTIONARY_GET | ++-----------------------------------------------------------------------------------------------------------------+ +| {"ProductName":"T-Shirt","Category":"Apparel","SubCategory":"Shirts","Brand":"BrandA","Color":"Red","Size":"M"} | ++-----------------------------------------------------------------------------------------------------------------+ +1 row in set (0.01 sec) +``` + +**示例四**:查询字典对象 `dimension_obj` 中 key `1` 所映射的第一个 value 列值。 + +```Plain +MySQL > SELECT dictionary_get("dimension_obj", 1)[1]; ++-------------------+ +| DICTIONARY_GET[1] | ++-------------------+ +| T-Shirt | ++-------------------+ +1 row in set (0.01 sec) +``` + +**示例五**:查询字典对象 `dimension_obj` 中 key `1` 所映射的第二个 value 列值。 + +```Plain
+MySQL > SELECT dictionary_get("dimension_obj", 1)[2]; ++-------------------+ +| DICTIONARY_GET[2] | ++-------------------+ +| Apparel | ++-------------------+ +1 row in set (0.01 sec) +``` + +**示例六**:查询字典对象 `dimension_obj` 中 key `1` 所映射的 value 列 `ProductName` 的值。 + +```Plain +MySQL > SELECT dictionary_get("dimension_obj", 1).ProductName; ++----------------------------+ +| DICTIONARY_GET.ProductName | ++----------------------------+ +| T-Shirt | ++----------------------------+ +1 row in set (0.01 sec) +``` \ No newline at end of file diff --git a/docs/zh/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md b/docs/zh/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md new file mode 100644 index 0000000000000..68e7d0674fb2e --- /dev/null +++ b/docs/zh/sql-reference/sql-statements/dictionary/CANCEL_REFRESH_DICTIONARY.md @@ -0,0 +1,28 @@ +--- +displayed_sidebar: docs +--- + +# CANCEL REFRESH DICTIONARY + + + +取消刷新字典对象。 + +## 语法 + +```SQL +CANCEL REFRESH DICTIONARY <dictionary_object_name> +``` + +## 参数说明 + +`dictionary_object_name`:处于 `REFRESHING` 状态的字典对象名称。 + +## 示例 + +取消刷新字典对象 `dict_obj`。 + +```Plain +MySQL > CANCEL REFRESH DICTIONARY dict_obj; +Query OK, 0 rows affected (0.01 sec) +``` \ No newline at end of file diff --git a/docs/zh/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md b/docs/zh/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md new file mode 100644 index 0000000000000..309cb515f8d4d --- /dev/null +++ b/docs/zh/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md @@ -0,0 +1,159 @@ +--- +displayed_sidebar: docs +--- + +# CREATE DICTIONARY + + + +基于原始对象创建字典对象。字典对象会以哈希表的形式组织**原始对象中的键值对映射关系**,并缓存在所有 BE 节点的内存中,可以视为一张缓存表。 + +**优势** + +- **字典对象所基于的原始对象更加丰富**:使用 `dictionary_get()` 查询字典对象,字典对象所基于的原始对象可以为所有表类型、异步物化视图和逻辑视图。然而使用 `dict_mapping()` 查询字典表时仅支持字典表为主键表。 +- **查询字典对象的速度快**:由于字典对象是哈希表,并且全量缓存在所有 BE 的内存中,查询字典对象获得映射关系的操作是通过查找内存中哈希表完成的,因此查询字典对象速度非常快。 +- **支持多个 value 列**:字典对象内部将多个 value 列编码为一个 STRUCT 类型的列,后续基于 key 查询时多个 value
可以一起返回。因此字典对象可以作为维度表,维度表中的每个 key(通常是一个唯一标识符)对应多个 value(描述性属性)。 +- **确保一致的快照读取**:同一个事务中获取的字典快照是一致的。因此可以保证在同一个查询或者导入过程中,查询字典对象的结果是一致的,不会有变化。 + +## 语法 + +```SQL +CREATE DICTIONARY <dictionary_object_name> USING <dictionary_source> +( + column_name KEY, [..., column_name KEY,] + column_name VALUE[, ..., column_name VALUE] +) +[PROPERTIES ("key"="value", ...)]; +``` + +## 参数说明 + +- `dictionary_object_name`:字典对象的名称。字典对象全局级别生效,不从属于某个数据库。 +- `dictionary_source`:字典对象所基于原始对象的名称。原始对象可以为所有表类型、异步物化视图和逻辑视图。 +- 字典对象中列的定义:为了保存字典表中维护好的键值对映射关系,您需要在字典对象的列中使用 `KEY` 和 `VALUE` 关键字指定键和其映射的值。并且需要注意: + - 字典对象中的列名 `column_name` 需要和字典表中的列名保持一致。 + - 字典对象中 key 和 value 列的数据类型仅限于布尔类型、整数类型、字符串类型和日期类型。 + - 原始对象中必须确保 key 列的唯一性。 +- 字典对象相关属性 `PROPERTIES`: + - `dictionary_warm_up`:缓存数据至各个 BE 节点中字典对象的触发方式,取值:`TRUE` (默认)或 `FALSE` 。如果为 `TRUE`,则创建字典对象后,自动缓存数据至字典对象;如果为 `FALSE`,则需要您手动刷新字典对象,才能缓存数据至字典对象。 + - `dictionary_memory_limit`:各个 BE 节点上字典对象可占用的最大内存,单位:Byte,默认为 2,000,000,000 Bytes(2 GB)。 + - `dictionary_refresh_interval`:周期性刷新字典对象的时间间隔,单位:秒,默认为 `0`,取值为 `<=0` 时表示不会自动刷新。 + - `dictionary_read_latest`:是否只查询最新的字典对象,主要影响刷新字典对象时所查询的字典对象,取值:`TRUE` 或 `FALSE`(默认)。如果设置为 `TRUE`,在刷新时无法查询字典对象,因为最新的字典对象还在刷新中。如果设置为 `FALSE`,则在刷新时可以查询上一次成功缓存的字典对象。 + - `dictionary_ignore_failed_refresh`:刷新失败是否自动回滚为前一次成功缓存的字典对象,取值:`TRUE` 或 `FALSE`(默认)。如果设置为 `TRUE`,在刷新失败时,则自动回滚为前一次成功缓存的字典对象。如果为 `FALSE`,在刷新失败时,则将字典对象状态设置为 `CANCELLED`。 + +## 使用说明 + +- 字典对象全量缓存在各个 BE 的内存中,比较消耗内存。 +- 即使原始对象删除后,如果已经基于原始对象创建字典对象,则该字典对象依旧存在。需要您手动 DROP 字典对象。 + +## 示例 + +**示例一:创建一张简单的字典对象,替代原先的字典表。** + +以如下字典表为例,并插入测试数据。 + +```Plain +MySQL > CREATE TABLE dict ( + order_uuid STRING, + order_id_int BIGINT AUTO_INCREMENT +) +PRIMARY KEY (order_uuid) +DISTRIBUTED BY HASH (order_uuid); +Query OK, 0 rows affected (0.02 sec) +MySQL > INSERT INTO dict (order_uuid) VALUES ('a1'), ('a2'), ('a3'); +Query OK, 3 rows affected (0.12 sec) +{'label':'insert_9e60b0e4-89fa-11ee-a41f-b22a2c00f66b', 'status':'VISIBLE', 'txnId':'15029'} +MySQL > SELECT * FROM dict; ++------------+--------------+ +| order_uuid | order_id_int |
++------------+--------------+ +| a1 | 1 | +| a3 | 3 | +| a2 | 2 | ++------------+--------------+ +3 rows in set (0.01 sec) +``` + +基于该字典表中的映射关系创建字典对象。 + +```Plain +MySQL > CREATE DICTIONARY dict_obj USING dict + (order_uuid KEY, + order_id_int VALUE); +Query OK, 0 rows affected (0.00 sec) +``` + +后续查询字典表中的映射关系时,无需查询字典表,直接查询字典对象即可获得映射值。例如查询 key `a1` 映射的 value。 + +```Plain +MySQL > SELECT dictionary_get("dict_obj", "a1"); ++--------------------+ +| DICTIONARY_GET | ++--------------------+ +| {"order_id_int":1} | ++--------------------+ +1 row in set (0.01 sec) +``` + +**示例二:创建字典对象,替代原先的维度表。** + +以如下维度表为例,并插入测试数据。 + +```Plain +MySQL > CREATE TABLE ProductDimension ( + ProductKey BIGINT AUTO_INCREMENT, + ProductName VARCHAR(100) NOT NULL, + Category VARCHAR(50), + SubCategory VARCHAR(50), + Brand VARCHAR(50), + Color VARCHAR(20), + Size VARCHAR(20) +) +PRIMARY KEY (ProductKey) +DISTRIBUTED BY HASH (ProductKey); +MySQL > INSERT INTO ProductDimension (ProductName, Category, SubCategory, Brand, Color, Size) +VALUES + ('T-Shirt', 'Apparel', 'Shirts', 'BrandA', 'Red', 'M'), + ('Jeans', 'Apparel', 'Pants', 'BrandB', 'Blue', 'L'), + ('Running Shoes', 'Footwear', 'Athletic', 'BrandC', 'Black', '10'), + ('Jacket', 'Apparel', 'Outerwear', 'BrandA', 'Green', 'XL'), + ('Baseball Cap', 'Accessories', 'Hats', 'BrandD', 'White', 'OneSize'); +Query OK, 5 rows affected (0.48 sec) +{'label':'insert_e938481f-181e-11ef-a6a9-00163e19e14e', 'status':'VISIBLE', 'txnId':'50'} +MySQL > SELECT * FROM ProductDimension; ++------------+---------------+-------------+-------------+--------+-------+---------+ +| ProductKey | ProductName | Category | SubCategory | Brand | Color | Size | ++------------+---------------+-------------+-------------+--------+-------+---------+ +| 1 | T-Shirt | Apparel | Shirts | BrandA | Red | M | +| 4 | Jacket | Apparel | Outerwear | BrandA | Green | XL | +| 5 | Baseball Cap | Accessories | Hats | BrandD | White | OneSize | +| 2 | Jeans | Apparel | Pants | BrandB | Blue 
| L | +| 3 | Running Shoes | Footwear | Athletic | BrandC | Black | 10 | ++------------+---------------+-------------+-------------+--------+-------+---------+ +5 rows in set (0.02 sec) +``` + +创建字典对象替代原先的维度表。 + +```Plain +MySQL > CREATE DICTIONARY dimension_obj USING ProductDimension + (ProductKey KEY, + ProductName VALUE, + Category VALUE, + SubCategory VALUE, + Brand VALUE, + Color VALUE, + Size VALUE); +``` + +后续查询维度值时,无需查询维度表,直接查询字典对象即可获得维度值。例如查询 key `1` 映射的 value。 + +```Plain +MySQL > SELECT dictionary_get("dimension_obj", 1); ++-----------------------------------------------------------------------------------------------------------------+ +| DICTIONARY_GET | ++-----------------------------------------------------------------------------------------------------------------+ +| {"ProductName":"T-Shirt","Category":"Apparel","SubCategory":"Shirts","Brand":"BrandA","Color":"Red","Size":"M"} | ++-----------------------------------------------------------------------------------------------------------------+ +1 row in set (0.01 sec) +``` \ No newline at end of file diff --git a/docs/zh/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md b/docs/zh/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md new file mode 100644 index 0000000000000..236fbbdaea32e --- /dev/null +++ b/docs/zh/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md @@ -0,0 +1,63 @@ +--- +displayed_sidebar: docs +--- + +# DROP DICTIONARY + + + +删除字典对象或者清空字典对象中缓存的数据。 + +## 语法 + +```SQL +DROP DICTIONARY <dictionary_object_name> [ CACHE ] +``` + +## 参数说明 + +- `dictionary_object_name`:字典对象的名称。 +- `CACHE`:如果写关键词 `CACHE`,则表示仅清空字典对象中缓存的数据,后续需要恢复字典对象中的缓存数据,则可以手动刷新。如果不写关键词 `CACHE`,则表示删除字典对象。 + +## 示例 + +- **示例一**:只清空字典对象中缓存的数据。 + + ```Plain + DROP DICTIONARY dict_obj CACHE; + ``` + + 该字典对象依旧存在。 + + ```Plain + MySQL > SHOW DICTIONARY dict_obj\G + *************************** 1.
row *************************** + DictionaryId: 5 + DictionaryName: dict_obj + DbName: example_db + dictionaryObject: dict + dictionaryKeys: [order_uuid] + dictionaryValues: [order_id_int] + status: UNINITIALIZED + lastSuccessRefreshTime: 2024-05-24 12:59:10 + lastSuccessFinishedTime: 2024-05-24 12:59:20 + nextSchedulableTime: disable auto schedule for refreshing + ErrorMessage: + approximated dictionaryMemoryUsage (Bytes): 172.26.80.55:8060 : 0 + 172.26.80.56:8060 : 0 + 172.26.80.57:8060 : 0 + 1 row in set (0.00 sec) + ``` + +- **示例二**:删除字典对象 `dict_obj`。 + + ```Plain + DROP DICTIONARY dict_obj; + ``` + + 该字典对象完全删除,不再存在。 + + ```Plain + MySQL > SHOW DICTIONARY dict_obj; + Empty set (0.00 sec) + ``` \ No newline at end of file diff --git a/docs/zh/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md b/docs/zh/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md new file mode 100644 index 0000000000000..1cd29684c636c --- /dev/null +++ b/docs/zh/sql-reference/sql-statements/dictionary/REFRESH_DICTIONARY.md @@ -0,0 +1,28 @@ +--- +displayed_sidebar: docs +--- + +# REFRESH DICTIONARY + + + +手动刷新字典对象,内部实现时系统会查询原始对象的最新数据并写入字典对象中。 + +## 语法 + +```SQL +REFRESH DICTIONARY <dictionary_object_name> +``` + +## 参数说明 + +`dictionary_object_name`:字典对象的名称。 + +## 示例 + +手动刷新字典对象 `dict_obj`。 + +```Plain +MySQL > REFRESH DICTIONARY dict_obj; +Query OK, 0 rows affected (0.01 sec) +``` \ No newline at end of file diff --git a/docs/zh/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md b/docs/zh/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md new file mode 100644 index 0000000000000..6579cde9e7786 --- /dev/null +++ b/docs/zh/sql-reference/sql-statements/dictionary/SHOW_DICTIONARY.md @@ -0,0 +1,76 @@ +--- +displayed_sidebar: docs +--- + +# SHOW DICTIONARY + + + +查询字典对象的信息。 + +## 语法 + +```SQL +SHOW DICTIONARY [ <dictionary_object_name> ] +``` + +## 参数说明 + +`dictionary_object_name`:字典对象的名称。 + +## 返回 + +- `DictionaryId`:每个字典对象的唯一 ID。 +- `DictionaryName`:字典对象名称。 +- `DbName`:原始对象所属的数据库。 +-
`dictionarySource`:原始对象的名称。 +- `dictionaryKeys`:字典对象的 key 列。 +- `dictionaryValues`:字典对象的 value 列。 +- `state`:字典对象的刷新状态。 + - `UNINITIALIZED`:刚创建字典对象,还未被刷新; + - `REFRESHING`:正在刷新字典对象。 + - `COMMITTING`: 数据已经缓存在各个 BE 的字典对象中,正在做一些收尾工作,例如更新元数据信息等。 + - `FINISHED`:刷新成功,结束刷新。 + - `CANCELLED`:刷新过程中出错,已取消刷新。您需要根据 `ErrorMessage` 中的报错信息进行修复,然后再次刷新字典对象。 +- `lastSuccessRefreshTime`:字典对象上次成功刷新开始的时间。 +- `lastSuccessFinishedTime`:字典对象上次成功刷新结束的时间。 +- `nextSchedulableTime`:下一次自动刷新字典对象的时间。 +- `ErrorMessage`:刷新字典对象失败时的报错信息。 +- `approximated dictionaryMemoryUsage (Bytes)`:每个 BE 节点上缓存字典对象占用内存的估计值。 + +## 示例 + +```Plain +MySQL > SHOW DICTIONARY\G +*************************** 1. row *************************** + DictionaryId: 3 + DictionaryName: dict_obj + DbName: example_db + dictionaryObject: dict + dictionaryKeys: [order_uuid] + dictionaryValues: [order_id_int] + status: FINISHED + lastSuccessRefreshTime: 2024-05-23 16:12:03 + lastSuccessFinishedTime: 2024-05-23 16:12:12 + nextSchedulableTime: disable auto schedule for refreshing + ErrorMessage: +approximated dictionaryMemoryUsage (Bytes): 172.26.82.208:8060 : 30 +172.26.82.210:8060 : 30 +172.26.82.209:8060 : 30 +*************************** 2. row *************************** + DictionaryId: 4 + DictionaryName: dimension_obj + DbName: example_db + dictionaryObject: ProductDimension + dictionaryKeys: [ProductKey] + dictionaryValues: [ProductName, Category, SubCategory, Brand, Color, Size] + status: FINISHED + lastSuccessRefreshTime: 2024-05-23 16:12:41 + lastSuccessFinishedTime: 2024-05-23 16:12:42 + nextSchedulableTime: disable auto schedule for refreshing + ErrorMessage: +approximated dictionaryMemoryUsage (Bytes): 172.26.82.208:8060 : 270 +172.26.82.210:8060 : 270 +172.26.82.209:8060 : 270 +2 rows in set (0.00 sec) +```