[Doc] Add Dict Obj Docs #50788

Merged
merged 1 commit into from
Sep 6, 2024
111 changes: 111 additions & 0 deletions docs/en/sql-reference/sql-functions/dict-functions/dictionary_get.md
@@ -0,0 +1,111 @@
---
displayed_sidebar: docs
---

# dictionary_get



Queries the value mapped to a key in a dictionary object.

## Syntax

```SQL
dictionary_get('dictionary_object_name', key_expression_list [, NULL_IF_NOT_EXIST])

key_expression_list ::=
    key_expression [, ...]

key_expression ::=
    column_name | const_value
```

## Parameters

- `dictionary_object_name`: The name of the dictionary object.
- `key_expression_list`: A list of expressions for all key columns. It can be a list of column names or a list of values.
- `NULL_IF_NOT_EXIST` (Optional): Whether to return NULL if the key does not exist in the dictionary cache. Valid values:
  - `true`: NULL is returned if the key does not exist.
  - `false` (Default): An exception is thrown if the key does not exist.
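
As a minimal sketch of the optional third argument, the queries below assume the `dict_obj` dictionary object used in the examples on this page and a key `'a9'` that is not present in it (a value chosen only for illustration). The last query assumes a hypothetical dictionary object `region_obj` with two key columns, and that keys are passed in the order in which the key columns were declared.

```SQL
-- Assumption: 'a9' is not a key in dict_obj.
-- With NULL_IF_NOT_EXIST set to true, the lookup should return NULL.
SELECT dictionary_get('dict_obj', 'a9', true);

-- With the default (false), the same lookup is expected to throw an exception.
SELECT dictionary_get('dict_obj', 'a9');

-- Hypothetical dictionary object with two key columns:
-- one key expression is supplied per key column.
SELECT dictionary_get('region_obj', 'US', 'Seattle', true);
```

Passing `true` is likely the safer choice when the keys come from a column that may contain values missing from the dictionary object, because a single missing key otherwise fails the whole query.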

## Returns

Returns the values of value columns as a STRUCT type. Therefore, you can use `[N]` or `.<column_name>` to specify a particular column's value. `N` represents the column's position, starting from 1.
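
Because the result is a STRUCT, a field can be extracted per row and given an alias. The sketch below assumes the `dict` table and the `dict_obj` dictionary object from the examples; the aliases `by_name` and `by_position` are illustrative only.

```SQL
-- Look up each row's key and pull one field out of the returned STRUCT.
SELECT
    order_uuid,
    dictionary_get('dict_obj', order_uuid).order_id_int AS by_name,
    dictionary_get('dict_obj', order_uuid)[1]           AS by_position
FROM dict;
```

Accessing a field by name is usually less fragile than accessing it by position, since the position depends on the order of the value columns in the dictionary object's definition.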

## Examples

The following examples use the dataset from the examples of [dict_mapping](dict_mapping.md).

- Example 1: Query the values of the value column mapped to the key column `order_uuid` in the dictionary object `dict_obj`.

```Plain
MySQL > SELECT dictionary_get('dict_obj', order_uuid) FROM dict;
+--------------------+
| DICTIONARY_GET |
+--------------------+
| {"order_id_int":1} |
| {"order_id_int":3} |
| {"order_id_int":2} |
+--------------------+
3 rows in set (0.02 sec)
```

- Example 2: Query the value of the value column mapped to key `a1` in the dictionary object `dict_obj`.

```Plain
MySQL > SELECT dictionary_get("dict_obj", "a1");
+--------------------+
| DICTIONARY_GET |
+--------------------+
| {"order_id_int":1} |
+--------------------+
1 row in set (0.01 sec)
```

- Example 3: Query the values of the value columns mapped to key `1` in the dictionary object `dimension_obj`.

```Plain
MySQL > SELECT dictionary_get("dimension_obj", 1);
+-----------------------------------------------------------------------------------------------------------------+
| DICTIONARY_GET |
+-----------------------------------------------------------------------------------------------------------------+
| {"ProductName":"T-Shirt","Category":"Apparel","SubCategory":"Shirts","Brand":"BrandA","Color":"Red","Size":"M"} |
+-----------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
```

- Example 4: Query the value of the first value column mapped to key `1` in the dictionary object `dimension_obj`.

```Plain
MySQL > SELECT dictionary_get("dimension_obj", 1)[1];
+-------------------+
| DICTIONARY_GET[1] |
+-------------------+
| T-Shirt |
+-------------------+
1 row in set (0.01 sec)
```

- Example 5: Query the value of the second value column mapped to key `1` in the dictionary object `dimension_obj`.

```Plain
MySQL > SELECT dictionary_get("dimension_obj", 1)[2];
+-------------------+
| DICTIONARY_GET[2] |
+-------------------+
| Apparel |
+-------------------+
1 row in set (0.01 sec)
```

- Example 6: Query the value of the `ProductName` value column mapped to key `1` in the dictionary object `dimension_obj`.

```Plain
MySQL > SELECT dictionary_get("dimension_obj", 1).ProductName;
+----------------------------+
| DICTIONARY_GET.ProductName |
+----------------------------+
| T-Shirt |
+----------------------------+
1 row in set (0.01 sec)
```
@@ -0,0 +1,28 @@
---
displayed_sidebar: docs
---

# CANCEL REFRESH DICTIONARY



Cancels the refresh of a dictionary object.

## Syntax

```SQL
CANCEL REFRESH DICTIONARY <dictionary_object_name>
```

## Parameters

- **dictionary_object_name**: The name of the dictionary object that is in the REFRESHING state.

## Examples

Cancel the refresh of the dictionary object `dict_obj`.

```Plain
MySQL > CANCEL REFRESH DICTIONARY dict_obj;
Query OK, 0 rows affected (0.01 sec)
```
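
A possible sequence for cancelling a long-running refresh is sketched below. It assumes that `REFRESH DICTIONARY` returns before the refresh completes and that the `status` field of `SHOW DICTIONARY` reports `REFRESHING` while the refresh is in progress; neither assumption is confirmed on this page.

```SQL
-- Start a manual refresh of the dictionary object.
REFRESH DICTIONARY dict_obj;

-- Check the status field; cancelling only applies while it is REFRESHING.
SHOW DICTIONARY dict_obj;

-- Cancel the in-progress refresh.
CANCEL REFRESH DICTIONARY dict_obj;
```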
160 changes: 160 additions & 0 deletions docs/en/sql-reference/sql-statements/dictionary/CREATE_DICTIONARY.md
@@ -0,0 +1,160 @@
---
displayed_sidebar: docs
---

# CREATE DICTIONARY



Creates a dictionary object based on an original object. The dictionary object organizes the key-value mappings from the original object in the form of a hash table and is cached in the memory of all BE nodes. It can be viewed as a cached table.

**Advantages**

- **Richer original objects for dictionary objects**: When using `dictionary_get()` to query dictionary objects, the original object can be a table of any type, an asynchronous materialized view, or a logical view. In contrast, `dict_mapping()` can only query dictionary tables that are primary key tables.
- **Fast query speed**: Because the dictionary object is a hash table that is fully cached in the memory of all BE nodes, looking up a mapping is an in-memory hash table lookup, which keeps query latency low.
- **Supports multiple value columns**: Internally, the dictionary object encodes multiple value columns into a single STRUCT type column. For queries based on a key, multiple values are returned together. Therefore, the dictionary object can serve as a dimension table where each key (usually a unique identifier) corresponds to multiple values (descriptive attributes).
- **Ensures consistent snapshot reads**: The dictionary snapshot obtained within the same transaction is consistent, ensuring that the query results from the dictionary object do not change during the same query or load process.

## Syntax

```SQL
CREATE DICTIONARY <dictionary_object_name> USING <dictionary_source>
(
    column_name KEY, [..., column_name KEY,]
    column_name VALUE[, ..., column_name VALUE]
)
[PROPERTIES ("key"="value", ...)];
```

## Parameters

- `dictionary_object_name`: The name of the dictionary object. The dictionary object is effective globally and does not belong to a specific database.
- `dictionary_source`: The name of the original object on which the dictionary object is based. The original object can be a table of any type, asynchronous materialized view, or logical view.
- Definition of columns in the dictionary object: To preserve the key-value mapping maintained in the original object, use the `KEY` and `VALUE` keywords in the dictionary object's column definitions to specify the keys and their mapped values.
  - The column names `column_name` in the dictionary object must be consistent with those in the original object.
  - The data types for key and value columns in the dictionary object are limited to boolean, integer, string, and date types.
  - The values of the key columns in the original object must be unique.
- Related properties of dictionary objects (`PROPERTIES`):
  - `dictionary_warm_up`: Whether to cache data into the dictionary object on each BE node automatically. Valid values: `TRUE` (default) or `FALSE`. If the parameter is set to `TRUE`, data is automatically cached into the dictionary object after its creation. If the parameter is set to `FALSE`, you need to manually refresh the dictionary object to cache the data.
  - `dictionary_memory_limit`: The maximum memory the dictionary object can occupy on each BE node. Unit: bytes. Default value: 2,000,000,000 bytes (2 GB).
  - `dictionary_refresh_interval`: The interval for periodically refreshing the dictionary object. Unit: seconds. Default value: `0`. A value `<=0` means no automatic refresh.
  - `dictionary_read_latest`: Whether to query only the latest dictionary object. This mainly affects which dictionary object is queried during a refresh. Valid values: `TRUE` or `FALSE` (default). If the parameter is set to `TRUE`, the dictionary object cannot be queried during a refresh because the latest dictionary object is still being refreshed. If the parameter is set to `FALSE`, the previously successfully cached dictionary object can be queried during a refresh.
  - `dictionary_ignore_failed_refresh`: Whether to automatically roll back to the last successfully cached dictionary object if the refresh fails. Valid values: `TRUE` or `FALSE` (default). If the parameter is set to `TRUE`, the system automatically rolls back to the last successfully cached dictionary object when the refresh fails. If the parameter is set to `FALSE`, the dictionary object status is set to `CANCELLED` when the refresh fails.
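
The properties above can be combined in a single statement. The following is a sketch only: the dictionary object name `dict_obj_manual` and the specific values are illustrative, and writing every property value as a quoted string is an assumption based on the `"key"="value"` form in the syntax.

```SQL
-- Hypothetical dictionary object built on the `dict` table from the examples.
CREATE DICTIONARY dict_obj_manual USING dict
(
    order_uuid KEY,
    order_id_int VALUE
)
PROPERTIES (
    -- Do not cache data at creation; a manual REFRESH DICTIONARY is needed.
    "dictionary_warm_up" = "FALSE",
    -- Cap memory use at roughly 1 GB per BE node (unit: bytes).
    "dictionary_memory_limit" = "1000000000",
    -- Refresh automatically every hour (unit: seconds).
    "dictionary_refresh_interval" = "3600",
    -- Keep serving the last good snapshot if a refresh fails.
    "dictionary_ignore_failed_refresh" = "TRUE"
);
```

With `dictionary_warm_up` set to `FALSE`, lookups are not expected to return data until the first refresh has completed.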

## Usage notes

- The dictionary object is fully cached in the memory of each BE node, so it can consume a significant amount of memory.
- Even if the original object is deleted, the dictionary object created based on it still exists. You need to manually DROP the dictionary object.
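
Both notes can be checked from SQL. The sketch below assumes the `dict` table and `dict_obj` dictionary object from the examples that follow, and relies on the `approximated dictionaryMemoryUsage (Bytes)` field that appears in the `SHOW DICTIONARY` output on the DROP DICTIONARY page.

```SQL
-- Inspect per-BE memory usage reported for the dictionary object.
SHOW DICTIONARY dict_obj;

-- Dropping the original table does not remove the dictionary object...
DROP TABLE dict;

-- ...so the dictionary object has to be dropped explicitly.
DROP DICTIONARY dict_obj;
```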

## Examples

**Example 1: Create a simple dictionary object to replace the original dictionary table.**

Take the following dictionary table as an example and insert test data.

```Plain
MySQL > CREATE TABLE dict (
    order_uuid STRING,
    order_id_int BIGINT AUTO_INCREMENT
)
PRIMARY KEY (order_uuid)
DISTRIBUTED BY HASH (order_uuid);
Query OK, 0 rows affected (0.02 sec)
MySQL > INSERT INTO dict (order_uuid) VALUES ('a1'), ('a2'), ('a3');
Query OK, 3 rows affected (0.12 sec)
{'label':'insert_9e60b0e4-89fa-11ee-a41f-b22a2c00f66b', 'status':'VISIBLE', 'txnId':'15029'}
MySQL > SELECT * FROM dict;
+------------+--------------+
| order_uuid | order_id_int |
+------------+--------------+
| a1 | 1 |
| a2 | 2 |
| a3 | 3 |
+------------+--------------+
3 rows in set (0.01 sec)
```

Create a dictionary object based on the mappings in this dictionary table.

```Plain
MySQL > CREATE DICTIONARY dict_obj USING dict
(order_uuid KEY,
order_id_int VALUE);
Query OK, 0 rows affected (0.00 sec)
```

For future queries of the mappings in the dictionary table, you can directly query the dictionary object instead of the dictionary table. For example, query the value mapped by key `a1`.

```Plain
MySQL > SELECT dictionary_get("dict_obj", "a1");
+--------------------+
| DICTIONARY_GET |
+--------------------+
| {"order_id_int":1} |
+--------------------+
1 row in set (0.01 sec)
```

**Example 2: Create a dictionary object to replace the original dimension table**

Take the following dimension table as an example and insert test data.

```Plain
MySQL > CREATE TABLE ProductDimension (
    ProductKey BIGINT AUTO_INCREMENT,
    ProductName VARCHAR(100) NOT NULL,
    Category VARCHAR(50),
    SubCategory VARCHAR(50),
    Brand VARCHAR(50),
    Color VARCHAR(20),
    Size VARCHAR(20)
)
PRIMARY KEY (ProductKey)
DISTRIBUTED BY HASH (ProductKey);
MySQL > INSERT INTO ProductDimension (ProductName, Category, SubCategory, Brand, Color, Size)
VALUES
    ('T-Shirt', 'Apparel', 'Shirts', 'BrandA', 'Red', 'M'),
    ('Jeans', 'Apparel', 'Pants', 'BrandB', 'Blue', 'L'),
    ('Running Shoes', 'Footwear', 'Athletic', 'BrandC', 'Black', '10'),
    ('Jacket', 'Apparel', 'Outerwear', 'BrandA', 'Green', 'XL'),
    ('Baseball Cap', 'Accessories', 'Hats', 'BrandD', 'White', 'OneSize');
Query OK, 5 rows affected (0.48 sec)
{'label':'insert_e938481f-181e-11ef-a6a9-00163e19e14e', 'status':'VISIBLE', 'txnId':'50'}
MySQL > SELECT * FROM ProductDimension;
+------------+---------------+-------------+-------------+--------+-------+---------+
| ProductKey | ProductName | Category | SubCategory | Brand | Color | Size |
+------------+---------------+-------------+-------------+--------+-------+---------+
| 1 | T-Shirt | Apparel | Shirts | BrandA | Red | M |
| 2 | Jeans | Apparel | Pants | BrandB | Blue | L |
| 3 | Running Shoes | Footwear | Athletic | BrandC | Black | 10 |
| 4 | Jacket | Apparel | Outerwear | BrandA | Green | XL |
| 5 | Baseball Cap | Accessories | Hats | BrandD | White | OneSize |
+------------+---------------+-------------+-------------+--------+-------+---------+
5 rows in set (0.02 sec)
```

Create a dictionary object to replace the original dimension table.

```Plain
MySQL > CREATE DICTIONARY dimension_obj USING ProductDimension
(ProductKey KEY,
ProductName VALUE,
Category VALUE,
SubCategory VALUE,
Brand VALUE,
Color VALUE,
Size VALUE);
Query OK, 0 rows affected (0.00 sec)
```

For future queries of dimension values, you can directly query the dictionary object instead of the dimension table. For example, query the values mapped by key `1`.

```Plain
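MySQL > SELECT dictionary_get("dimension_obj", 1);
+-----------------------------------------------------------------------------------------------------------------+
| DICTIONARY_GET |
+-----------------------------------------------------------------------------------------------------------------+
| {"ProductName":"T-Shirt","Category":"Apparel","SubCategory":"Shirts","Brand":"BrandA","Color":"Red","Size":"M"} |
+-----------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)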
```
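
To illustrate how the dictionary object might stand in for a join against the dimension table, the sketch below assumes a hypothetical fact table `sales_fact` with columns `order_id`, `ProductKey` (BIGINT), and `amount`. The table is not part of this document's dataset.

```SQL
-- Enrich each fact row with dimension attributes via dictionary lookups
-- instead of joining to ProductDimension.
SELECT
    s.order_id,
    s.amount,
    dictionary_get('dimension_obj', s.ProductKey).ProductName AS product_name,
    dictionary_get('dimension_obj', s.ProductKey).Category    AS category
FROM sales_fact AS s;
```

If `sales_fact` can contain product keys that are missing from the dictionary object, passing `true` as the last argument of `dictionary_get` returns NULL for those rows rather than failing the query.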
63 changes: 63 additions & 0 deletions docs/en/sql-reference/sql-statements/dictionary/DROP_DICTIONARY.md
@@ -0,0 +1,63 @@
---
displayed_sidebar: docs
---

# DROP DICTIONARY



Deletes a dictionary object or clears the cached data within a dictionary object.

## Syntax

```SQL
DROP DICTIONARY <dictionary_object_name> [ CACHE ]
```

## Parameters

- `dictionary_object_name`: The name of the dictionary object.
- `CACHE`: If the keyword `CACHE` is specified, only the cached data within the dictionary object will be cleared. To restore the cached data later, you can manually refresh it. If the keyword `CACHE` is not specified, the dictionary object will be deleted.
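
The clear-then-restore flow described for `CACHE` can be sketched as follows, assuming the `dict_obj` dictionary object from the examples:

```SQL
-- Release the cached data but keep the dictionary object's definition.
DROP DICTIONARY dict_obj CACHE;

-- Later, repopulate the cache from the original object.
REFRESH DICTIONARY dict_obj;
```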

## Examples

- Example 1: Clear only the cached data within the dictionary object.

```Plain
DROP DICTIONARY dict_obj CACHE;
```

The dictionary object still exists.

```Plain
MySQL > SHOW DICTIONARY dict_obj\G
*************************** 1. row ***************************
DictionaryId: 5
DictionaryName: dict_obj
DbName: example_db
dictionaryObject: dict
dictionaryKeys: [order_uuid]
dictionaryValues: [order_id_int]
status: UNINITIALIZED
lastSuccessRefreshTime: 2024-05-24 12:59:10
lastSuccessFinishedTime: 2024-05-24 12:59:20
nextSchedulableTime: disable auto schedule for refreshing
ErrorMessage:
approximated dictionaryMemoryUsage (Bytes): 172.26.80.55:8060 : 0
172.26.80.56:8060 : 0
172.26.80.57:8060 : 0
1 row in set (0.00 sec)
```

- Example 2: Delete the dictionary object `dict_obj`.

```Plain
DROP DICTIONARY dict_obj;
```

The dictionary object is completely deleted and no longer exists.

```Plain
MySQL > SHOW DICTIONARY dict_obj;
Empty set (0.00 sec)
```
@@ -0,0 +1,28 @@
---
displayed_sidebar: docs
---

# REFRESH DICTIONARY



Manually refreshes a dictionary object. Internally, the system will query the latest data from the original object and write it into the dictionary object.

## Syntax

```SQL
REFRESH DICTIONARY <dictionary_object_name>
```

## Parameters

- **dictionary_object_name**: The name of the dictionary object.

## Examples

Manually refresh the dictionary object `dict_obj`.

```Plain
MySQL > REFRESH DICTIONARY dict_obj;
Query OK, 0 rows affected (0.01 sec)
```
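
A refresh is typically needed after the original object changes. The sketch below assumes the `dict` table and `dict_obj` dictionary object from the CREATE DICTIONARY examples; the new key `'a4'` is illustrative.

```SQL
-- Add a new row to the original table.
INSERT INTO dict (order_uuid) VALUES ('a4');

-- Pull the latest data from the original object into the dictionary object.
REFRESH DICTIONARY dict_obj;

-- Once the refresh has finished, the new key should be resolvable.
SELECT dictionary_get('dict_obj', 'a4');
```

With the default `dictionary_read_latest` value of `FALSE`, queries issued while the refresh is still running read the previously cached snapshot, so the new key may not be visible immediately.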