From 0d0eb2d7e15ecf62a6281105e201a117ab09be51 Mon Sep 17 00:00:00 2001 From: mliu <8832717+realmorrisliu@users.noreply.github.com> Date: Tue, 24 Dec 2024 16:53:38 +0800 Subject: [PATCH 1/4] Update quickstart.mdx (#149) fix default folder to store data Signed-off-by: mliu <8832717+realmorrisliu@users.noreply.github.com> --- get-started/quickstart.mdx | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/get-started/quickstart.mdx b/get-started/quickstart.mdx index f4bab36e..e415b5f6 100644 --- a/get-started/quickstart.mdx +++ b/get-started/quickstart.mdx @@ -137,7 +137,7 @@ Unlike other deployment modes, for instance [Docker Compose](/deploy/risingwave- For state store, we will use the embedded `LocalFs` Object Store, eliminating the need for an external service like `minio` or `s3`; for meta store, we will use the embedded `SQLite` database, eliminating the need for an external service like `etcd`. -By default, the RisingWave standalone mode will store its data in `~/risingwave`, which includes both `Metadata` and `State Data`. +By default, the RisingWave standalone mode will store its data in `~/.risingwave`, which includes both `Metadata` and `State Data`. For a batteries-included setup, with `monitoring` tools and external services like `kafka` fully included, you can use [Docker Compose](/deploy/risingwave-docker-compose) instead. If you would like to set up these external services manually, you may check out RisingWave's [Docker Compose](https://github.com/risingwavelabs/risingwave/blob/main/docker/docker-compose.yml), and run these services using the same configurations. @@ -147,7 +147,7 @@ The instance of RisingWave standalone mode can run without any configuration. Ho The main options which new users may require would be the state store directory (`--state-store-directory`) and in-memory mode (`--in-memory`). -`--state-store-directory` specifies the new directory where the cluster's `Metadata` and `State Data` will reside. 
The default is to store it in the `~/risingwave` folder. +`--state-store-directory` specifies the new directory where the cluster's `Metadata` and `State Data` will reside. The default is to store it in the `~/.risingwave` folder. ```bash # Reconfigure RisingWave to be stored under 'projects' folder instead. From d0cc9f1ee9492eacf95d027fe2e94b9c17aa9cc0 Mon Sep 17 00:00:00 2001 From: IrisWan <150207222+WanYixian@users.noreply.github.com> Date: Wed, 25 Dec 2024 11:28:16 +0800 Subject: [PATCH 2/4] update (#158) --- changelog/product-lifecycle.mdx | 1 + ingestion/overview.mdx | 6 +- integrations/sources/mysql-table.mdx | 102 ++++++++++++++++++++++ integrations/sources/postgresql-table.mdx | 6 +- 4 files changed, 111 insertions(+), 4 deletions(-) create mode 100644 integrations/sources/mysql-table.mdx diff --git a/changelog/product-lifecycle.mdx b/changelog/product-lifecycle.mdx index 0799174b..8f0e03da 100644 --- a/changelog/product-lifecycle.mdx +++ b/changelog/product-lifecycle.mdx @@ -22,6 +22,7 @@ Below is a list of all features in the public preview phase: | Feature name | Start version | | :-- | :-- | +| [Ingest data from MySQL table](/integrations/sources/mysql-table)| 2.2 | | [EXPLAIN FORMAT JSON](/sql/commands/sql-explain#explain-options)| 2.2 | | [Ingest data from Postgres table](/integrations/sources/postgresql-table) | 2.1 | | [Ingest data from webhook](/integrations/sources/webhook) | 2.1 | diff --git a/ingestion/overview.mdx b/ingestion/overview.mdx index 64f95d1b..e100df6a 100644 --- a/ingestion/overview.mdx +++ b/ingestion/overview.mdx @@ -74,11 +74,11 @@ The statement will create a streaming job that continuously ingests data from th 4. **Stronger consistency guarantee**: When using a table with connectors, all downstream jobs will be guaranteed to have a consistent view of the data persisted in the table; while for source, different jobs may see inconsistent results due to different ingestion speed or data retention in the external system. 5. 
**Greater flexibility**: Like regular tables, you can use DML statements like [INSERT](/sql/commands/sql-insert), [UPDATE](/sql/commands/sql-update) and [DELETE](/sql/commands/sql-delete) to insert or modify data in tables with connectors, and use [CREATE SINK INTO TABLE](/sql/commands/sql-create-sink-into) to merge other data streams into the table. -### PostgreSQL table +### Table-valued function -RisingWave supports using the table-valued function `postgres_query` to directly query PostgreSQL databases. This function connects to a specified PostgreSQL instance, executes the provided SQL query, and returns the results as a table in RisingWave. +RisingWave supports the table-valued functions (TVFs) `postgres_query` and `mysql_query` for directly querying PostgreSQL and MySQL databases. Each function connects to a specified instance, executes the provided SQL query, and returns the results as a table in RisingWave. -To use it, specify connection details (such as hostname, port, username, password, database name) and the desired SQL query. This makes it easier to integrate PostgreSQL data directly into RisingWave workflows without needing additional data transfer steps. For more information, see [Ingest data from Postgres tables](/integrations/sources/postgresql-table). +To use either function, specify connection details (such as hostname, port, username, password, database name) and the desired SQL query. This makes it easier to integrate data from these databases directly into RisingWave workflows without needing additional data transfer steps. For more information, see [Ingest data from Postgres tables](/integrations/sources/postgresql-table) and [Ingest data from MySQL tables](/integrations/sources/mysql-table). 
## DML on tables diff --git a/integrations/sources/mysql-table.mdx b/integrations/sources/mysql-table.mdx new file mode 100644 index 00000000..5d4b0728 --- /dev/null +++ b/integrations/sources/mysql-table.mdx @@ -0,0 +1,102 @@ +--- +title: "Ingest data from MySQL table" +description: "Describes how to ingest data from MySQL table to RisingWave using table-valued function." +sidebarTitle: MySQL table +--- + +RisingWave allows you to query MySQL tables directly with the `mysql_query` table-valued function (TVF). It offers a simpler alternative to Change Data Capture (CDC) when working with MySQL data in RisingWave. + +Unlike CDC, which continuously syncs data changes, this function lets you fetch data directly from MySQL when needed. Therefore, this approach is ideal for static or infrequently updated data, as it's more resource-efficient than maintaining a constant CDC connection. + + +**PUBLIC PREVIEW** + +This feature is currently in public preview, meaning it is nearing the final product but may not yet be fully stable. If you encounter any issues or have feedback, please reach out to us via our [Slack channel](https://www.risingwave.com/slack). Your input is valuable in helping us improve this feature. For more details, see our [Public Preview Feature List](/changelog/product-lifecycle#features-in-the-public-preview-stage). + + + +Added in version 2.2. 
+ + +## Syntax + +Define `mysql_query` as follows: + +```sql +mysql_query( + hostname varchar, -- Database hostname + port varchar, -- Database port + username varchar, -- Authentication username + password varchar, -- Authentication password + database_name varchar, -- Target database name + query varchar -- SQL query to execute +) +``` + +## Data type mapping + +The following table shows how MySQL data types are mapped to RisingWave data types: + +| MySQL Type | RisingWave Type | +|:-----------|:----------------| +| `bit(1)` | `boolean` | +| `bit(>1)` | `bytea` | +| `bool`/`boolean` | `smallint` | +| `tinyint` | `smallint` | +| `smallint` | `smallint` | +| `mediumint` | `int` | +| `int` | `int` | +| `bigint` | `bigint` | +| `float` | `float32` | +| `double` | `float64` | +| `decimal` | `decimal` | +| `numeric` | `decimal` | +| `year` | `int` | +| `date` | `date` | +| `time` | `time` | +| `datetime` | `timestamp` | +| `timestamp` | `timestamptz` | +| `varchar` | `varchar` | +| `char` | `varchar` | +| `json` | `jsonb` | +| `blob` | `bytea` | +| `tinyblob` | `bytea` | +| `mediumblob` | `bytea` | +| `longblob` | `bytea` | +| `array` | *unsupported* | +| `enum` | *unsupported* | +| `set` | *unsupported* | +| `geometry` | *unsupported* | +| `null` | *unsupported* | + +## Example + +1. In your MySQL database, create a table and populate it with sample data of various data types. 
+ +```sql +CREATE TABLE test ( + id bigint primary key, v0 bit, v1 bool, v2 tinyint(1), + v3 tinyint(2), v4 smallint, v5 mediumint, v6 integer, + v7 bigint, v8 float, v9 double, v10 numeric(4, 2), + v11 decimal(4, 2), v12 char(255), v13 varchar(255), + v14 bit(10), v15 tinyblob, v16 blob, v17 mediumblob, + v18 longblob, v19 date, v20 time, v21 timestamp, + v22 json +); + +INSERT INTO test SELECT + 1 as id, true as v0, true as v1, 2 as v2, 3 as v3, 4 as v4, 5 as v5, + 6 as v6, 7 as v7, 1.08 as v8, 1.09 as v9, 1.10 as v10, 1.11 as v11, + 'char' as v12, 'varchar' as v13, b'1010' as v14, x'16' as v15, x'17' as v16, + x'18' as v17, x'19' as v18, '2021-01-01' as v19, '12:34:56' as v20, + '2021-01-01 12:34:56' as v21, JSON_OBJECT('key1', 1, 'key2', 'abc'); +``` + +2. In RisingWave, use the `mysql_query` function to perform the query. + +```sql +SELECT * +FROM mysql_query('$MYSQL_HOST', '$MYSQL_TCP_PORT', '$RISEDEV_MYSQL_USER', '$MYSQL_PWD', 'tvf', 'select * from test;'); +----RESULT +1 t 1 2 3 4 5 6 7 1.08 1.09 1.10 1.11 char varchar \x000a \x16 \x17 \x18 \x19 2021-01-01 12:34:56 2021-01-01 12:34:56+00:00 {"key1": 1, "key2": "abc"} +``` diff --git a/integrations/sources/postgresql-table.mdx b/integrations/sources/postgresql-table.mdx index 43c08767..ff83fd01 100644 --- a/integrations/sources/postgresql-table.mdx +++ b/integrations/sources/postgresql-table.mdx @@ -6,7 +6,7 @@ sidebarTitle: PostgreSQL table RisingWave allows you to query PostgreSQL tables directly with the `postgres_query` table-valued function (TVF). It offers a simpler alternative to Change Data Capture (CDC) when working with PostgreSQL data in RisingWave. -Unlike CDC, which continuously syncs data changes, this function lets you fetch data directly from PostgreSQL when needed. Therefore, this approach is ideal static or infrequently updated data, as it's more resource-efficient than maintaining a constant CDC connection. 
+Unlike CDC, which continuously syncs data changes, this function lets you fetch data directly from PostgreSQL when needed. Therefore, this approach is ideal for static or infrequently updated data, as it's more resource-efficient than maintaining a constant CDC connection. **PUBLIC PREVIEW** @@ -14,6 +14,10 @@ Unlike CDC, which continuously syncs data changes, this function lets you fetch This feature is currently in public preview, meaning it is nearing the final product but may not yet be fully stable. If you encounter any issues or have feedback, please reach out to us via our [Slack channel](https://www.risingwave.com/slack). Your input is valuable in helping us improve this feature. For more details, see our [Public Preview Feature List](/changelog/product-lifecycle#features-in-the-public-preview-stage). + +Added in version 2.1. + + ## Syntax Define `postgres_query` as follows: From c525cb49f1eba5752aa3055153c14cb001cf5ef1 Mon Sep 17 00:00:00 2001 From: hengm3467 <100685635+hengm3467@users.noreply.github.com> Date: Thu, 26 Dec 2024 07:34:20 +0800 Subject: [PATCH 3/4] Add a new section to describe how to serve results and query data from RW (#159) * add serve results section * Update serve/overview.mdx Co-authored-by: Tao Wu Signed-off-by: hengm3467 <100685635+hengm3467@users.noreply.github.com> * Update product-lifecycle.mdx * Update overview.mdx * change serve to query * Update product-lifecycle.mdx --------- Signed-off-by: hengm3467 <100685635+hengm3467@users.noreply.github.com> Co-authored-by: Tao Wu --- changelog/product-lifecycle.mdx | 4 +- mint.json | 19 ++++- query/overview.mdx | 77 +++++++++++++++++++ query/query-from-visualization-tools.mdx | 4 + query/query-with-select.mdx | 4 + .../risingwave-as-postgres-fdw.mdx | 0 {delivery => query}/subscription.mdx | 0 7 files changed, 102 insertions(+), 6 deletions(-) create mode 100644 query/overview.mdx create mode 100644 query/query-from-visualization-tools.mdx create mode 100644 query/query-with-select.mdx 
rename {delivery => query}/risingwave-as-postgres-fdw.mdx (100%) rename {delivery => query}/subscription.mdx (100%) diff --git a/changelog/product-lifecycle.mdx b/changelog/product-lifecycle.mdx index 8f0e03da..29180d7c 100644 --- a/changelog/product-lifecycle.mdx +++ b/changelog/product-lifecycle.mdx @@ -41,8 +41,8 @@ Below is a list of all features in the public preview phase: | Auto-map upstream table schema in [MySQL](/integrations/sources/mysql-cdc#automatically-map-upstream-table-schema) and [PostgreSQL](/integrations/sources/postgresql-cdc#automatically-map-upstream-table-schema) | 1.10 | | [Version column](/sql/commands/sql-create-table#pk-conflict-behavior) | 1.9 | | [Snowflake sink](/integrations/destinations/snowflake) | 1.9 | -| [Subscription](/delivery/subscription) | 1.9 | -| [RisingWave as PostgreSQL FDW](/delivery/risingwave-as-postgres-fdw) | 1.9 | +| [Subscription](/query/subscription) | 1.9 | +| [RisingWave as PostgreSQL FDW](/query/risingwave-as-postgres-fdw) | 1.9 | | [Iceberg source](/integrations/sources/apache-iceberg) | 1.8 | | [Google BigQuery sink](/integrations/destinations/bigquery) | 1.4 | | [SET BACKGROUND\_DDL command](/sql/commands/sql-set-background-ddl) | 1.3 | diff --git a/mint.json b/mint.json index fcca0176..bec1e1ad 100644 --- a/mint.json +++ b/mint.json @@ -374,7 +374,9 @@ {"source": "/docs/current/risingwave-flink-comparison", "destination": "/faq/risingwave-flink-comparison"}, {"source": "/docs/current/faq-using-risingwave", "destination": "/faq/faq-using-risingwave"}, {"source": "/release-notes", "destination": "/changelog/release-notes"}, - {"source": "/product-lifecycle", "destination": "/changelog/product-lifecycle"} + {"source": "/product-lifecycle", "destination": "/changelog/product-lifecycle"}, + {"source": "/delivery/risingwave-as-postgres-fdw", "destination": "/query/risingwave-as-postgres-fdw"}, + {"source": "/delivery/subscription", "destination": "/query/subscription"} ], "navigation": [ { @@ -443,13 +445,22 @@ 
"processing/time-travel-queries" ] }, + { + "group": "Query data", + "pages": [ + "query/overview", + "query/query-with-select", + "query/query-from-visualization-tools", + "query/risingwave-as-postgres-fdw", + "query/subscription" + + ] + }, { "group": "Deliver data", "pages": [ "delivery/overview", - "delivery/supported-sink-connectors", - "delivery/risingwave-as-postgres-fdw", - "delivery/subscription" + "delivery/supported-sink-connectors" ] } ] diff --git a/query/overview.mdx b/query/overview.mdx new file mode 100644 index 00000000..8cd42393 --- /dev/null +++ b/query/overview.mdx @@ -0,0 +1,77 @@ +--- +title: "Query data in RisingWave" +description: "RisingWave allows you to access and use insights from your streaming data immediately. It also functions like any other database, allowing you to query batch or raw data that you've inserted." +sidebarTitle: Overview +--- + +This section explains how to query and interact with data in RisingWave. + +RisingWave offers several methods for serving results, catering to various use cases, from ad-hoc analysis to application integration. + +## Query with `SELECT` statements + +Retrieve data directly from RisingWave using standard SQL SELECT queries against tables or materialized views. Use this method for ad-hoc analysis, exploring the latest results, and extracting specific data subsets. + +For syntax details, see [`SELECT`](/sql/commands/sql-select). To learn how RisingWave processes data (ad-hoc or streaming), see [Ad-hoc vs. Streaming queries](/processing/overview#ad-hoc-on-read-vs-streaming-on-write). + +Connect to RisingWave with psql or any other PostgreSQL-compatible client to execute these queries. RisingWave is compatible with [many data visualization tools](/integrations/visualization/overview). 
Here are a few that we have tested: + +- [Beekeeper Studio](/integrations/visualization/beekeeper-studio) +- [DBeaver](/integrations/visualization/dbeaver) +- [Grafana](/integrations/visualization/grafana) +- [Looker](/integrations/visualization/looker) +- [Metabase](/integrations/visualization/metabase) +- [Superset](/integrations/visualization/superset) + +**Key features**: + +- Uses familiar SQL syntax. +- Provides immediate access to the most up-to-date results. +- Offers flexibility to filter, aggregate, and join data. + +**Example**: Retrieve the latest aggregated results from a materialized view. + +## Integrate with PostgreSQL via foreign data wrapper (FDW) + +RisingWave seamlessly integrates with existing PostgreSQL ecosystems through its Foreign Data Wrapper (FDW) functionality. Query data in RisingWave's tables and materialized views as if it were part of your PostgreSQL database. This allows you to leverage existing PostgreSQL tools and workflows. + +For details, see [RisingWave as Postgres FDW](/query/risingwave-as-postgres-fdw). + +**Key features**: + +- Enables unified querying across RisingWave and PostgreSQL data. +- Allows you to use existing PostgreSQL tools and applications. +- Simplifies integration into existing data infrastructure. +- Performance: While FDW offers convenience, it may introduce some performance overhead compared to directly querying RisingWave. RisingWave pushes down filters in `WHERE` clauses to optimize queries. However, complex queries with joins, aggregations, or `LIMIT` clauses are processed in PostgreSQL after fetching the data from RisingWave. + +**Example**: Join data in a PostgreSQL table with a continuously updated materialized view in RisingWave. + +## Subscribe to real-time updates + +RisingWave's subscription feature allows you to receive a continuous stream of updates from a materialized view directly, without needing an external message queue. 
This includes both existing data in the materialized view when the subscription is created and subsequent changes. You can choose to retrieve the full dataset or only incremental changes from a specific point using a subscription cursor. + +For details, see [Subscription](/query/subscription). + +**Key features**: + +- Provides real-time data updates directly from RisingWave. +- Allows retrieving full or incremental data using a cursor. +- Requires fewer components and less maintenance than external event stores. + +**Example**: Subscribe to a materialized view that tracks website user activity to power a live dashboard, receiving updates directly from RisingWave. + +## Access programmatically via SDK and client libraries + +RisingWave provides a [Python SDK](/python-sdk/intro) [`risingwave-py`](https://pypi.org/project/risingwave-py/) (currently in public preview) to help you develop event-driven applications. The SDK offers a simple way to perform ad-hoc queries, subscribe to changes, and define event handlers for tables and materialized views. + +Additionally, since RisingWave is compatible with Postgres, you can use standard PostgreSQL drivers to interact with RisingWave from your applications. + +Client libraries in various languages allow developers to interact with RisingWave programmatically and execute `SELECT` queries within their applications. + +For the list of available client libraries, see [Client Libraries](/client-libraries/overview). + +**Example**: Use the Python client library to fetch the latest results from a materialized view and display them in a financial data analysis application. + +## Choose the right method + +From the methods described above, select the one that best fits your needs, considering factors like query complexity, integration requirements, and team expertise. RisingWave ensures consistency across all methods. 
\ No newline at end of file diff --git a/query/query-from-visualization-tools.mdx b/query/query-from-visualization-tools.mdx new file mode 100644 index 00000000..291d2bc1 --- /dev/null +++ b/query/query-from-visualization-tools.mdx @@ -0,0 +1,4 @@ +--- +title: "Query from visualization tools" +url: "/integrations/visualization/overview" +--- diff --git a/query/query-with-select.mdx b/query/query-with-select.mdx new file mode 100644 index 00000000..bf0c4519 --- /dev/null +++ b/query/query-with-select.mdx @@ -0,0 +1,4 @@ +--- +title: "Query with SELECT statements" +url: "/sql/commands/sql-select" +--- diff --git a/delivery/risingwave-as-postgres-fdw.mdx b/query/risingwave-as-postgres-fdw.mdx similarity index 100% rename from delivery/risingwave-as-postgres-fdw.mdx rename to query/risingwave-as-postgres-fdw.mdx diff --git a/delivery/subscription.mdx b/query/subscription.mdx similarity index 100% rename from delivery/subscription.mdx rename to query/subscription.mdx From 6c0412d6c7ea51be38de9bcfc024c62af8dd4853 Mon Sep 17 00:00:00 2001 From: hengm3467 <100685635+hengm3467@users.noreply.github.com> Date: Fri, 27 Dec 2024 20:08:49 +0800 Subject: [PATCH 4/4] Update rw-premium-edition-intro.mdx (#163) --- get-started/rw-premium-edition-intro.mdx | 29 ++++++++++++++++-------- 1 file changed, 19 insertions(+), 10 deletions(-) diff --git a/get-started/rw-premium-edition-intro.mdx b/get-started/rw-premium-edition-intro.mdx index cb12b2e8..2ff06d6c 100644 --- a/get-started/rw-premium-edition-intro.mdx +++ b/get-started/rw-premium-edition-intro.mdx @@ -16,18 +16,12 @@ For RisingWave Cloud users, all Premium Edition features are available out of th ## Premium features -The premium features are carefully selected based on the following criteria: - -- Seamless integration with proprietary or licensed open-source systems. -- Advanced features that enhance development velocity and lower production deployment overhead. 
-- Performance improvements for non-standard deployment environments. -- Tailored features specifically requested by our paying customers. - The following are Premium Edition features, which include a "Premium Edition Feature" note in the documentation. ### SQL and security - +* [Time travel queries](/processing/time-travel-queries) +* [Secret management](/operate/manage-secrets) ### Schema management @@ -37,10 +31,25 @@ The following are Premium Edition features, which include a "Premium Edition Fea ### Connectors - +* [Sink to Snowflake](/integrations/destinations/snowflake) +* [Sink to DynamoDB](/integrations/destinations/amazon-dynamodb) +* [Sink to OpenSearch](/integrations/destinations/opensearch) +* [Sink to BigQuery](/integrations/destinations/bigquery) +* [Sink to SharedMergeTree table engine on ClickHouse Cloud](/integrations/destinations/clickhouse#supported-table-engines) +* [Sink to SQL Server](/integrations/destinations/sql-server) +* [Direct SQL Server CDC source connector](/integrations/sources/sql-server-cdc) +* [Sink to Iceberg with glue catalog](/integrations/destinations/apache-iceberg#glue-catelogs) +* [Ingest data from webhook](/integrations/sources/webhook) For users who are already using these features in 1.9.x or earlier versions, rest assured that the functionality of these features will be intact if you stay on the version. If you choose to upgrade to v2.0 or later versions, an error will show up to indicate you need a license to use the features. +The premium features are carefully selected based on the following criteria: + +- Seamless integration with proprietary or licensed open-source systems. +- Advanced features that enhance development velocity and lower production deployment overhead. +- Performance improvements for non-standard deployment environments. +- Tailored features specifically requested by our paying customers. 
+ ## How to access Premium Edition features For RisingWave Cloud users, all Premium Edition features are available out of the box without additional cost. @@ -104,4 +113,4 @@ RisingWave provides three levels of support packages: ## Pricing -Pricing for RisingWave Premium will be based on the cluster size, measured in RisingWave Units (RWUs). The number of RWUs will be determined based on the scale of data ingestion, number of streaming jobs, and the complexity of use case. There could be additional factors as well. Please contact our sales at [sales@risingwave-labs.com](mailto:sales@risingwave-labs.com) for more details. +For pricing details, please contact our sales at [sales@risingwave-labs.com](mailto:sales@risingwave-labs.com). \ No newline at end of file