Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature][Connector-V2][Jdbc] Add oceanbase dialect factory #4989

Merged
merged 32 commits into from
Jul 11, 2023
Merged
Show file tree
Hide file tree
Changes from 31 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
aa129ef
[Feature][Connector-V2][OceanBase]OceanBase source and sink connector
silenceland Jan 15, 2023
49cd2b2
Merge branch 'dev' into feature/connector-V2-oceanbase
silenceland Jan 15, 2023
1dba48a
Merge branch 'dev' into feature/connector-V2-oceanbase
silenceland Mar 19, 2023
51b0901
[Feature][Connector-V2][OceanBase]mvn spotless:apply to fix violations
silenceland Mar 19, 2023
0830d39
Merge branch 'dev' into feature-connector-v2-oceanbase
silenceland Mar 25, 2023
273a9da
[Feature][Connector-V2][OceanBase]add oceanbase conf
silenceland Apr 16, 2023
a0c5c19
Merge branch 'dev' into feature-connector-v2-oceanbase
silenceland Apr 16, 2023
a18844f
[Feature][Connector-V2][OceanBase]fix column float or double
silenceland Apr 17, 2023
7e881a3
[Feature][Jdbc-Connector] Add OceanBase Connector
changhuyan Apr 20, 2023
6942728
[Feature][Jdbc-Connector]Add Oceanbase Connector
changhuyan Apr 20, 2023
fc2b53a
Merge branch 'dev' into OceanBase-Connector
changhuyan Apr 20, 2023
4feacd7
[Feature][Jdbc-Connector]
changhuyan Apr 21, 2023
c2b59a7
[Feature][Jdbc-Connector] Add java.util.*
changhuyan Apr 21, 2023
f007a2b
changhuyan [Feature][Jdbc-Connector] Modify ci and code style error
changhuyan Apr 23, 2023
e9c4289
[Feature][Jdbc-Connector] Add e2e testcase
changhuyan Apr 24, 2023
cd2171e
[Feature][Jdbc-Connector] Add e2etest License
changhuyan Apr 25, 2023
18a5772
Merge branch 'dev' into feature-connector-v2-oceanbase
silenceland Jun 7, 2023
9c11006
Merge branch 'dev' into feature-connector-v2-oceanbase
silenceland Jun 18, 2023
adcd203
[Feature][Connector-V2][OceanBase]spotless apply
silenceland Jun 18, 2023
d4b5952
fix conflicts
whhe Jun 28, 2023
6c930b6
Merge branch 'dev' into ob-dialect
whhe Jun 28, 2023
b23df0f
merge dialect factory class and revert unnecessary changes
whhe Jun 28, 2023
3ee6054
rename IT files
whhe Jun 28, 2023
3d8dd7f
update IT cases based on existing mysql/oracle cases
whhe Jun 28, 2023
7c70f04
add ob in docs
whhe Jun 28, 2023
168b1b3
disable obmysql IT case
whhe Jun 29, 2023
558e41b
use compatible_mode instead of driver_type and add ob doc
whhe Jul 3, 2023
50b1c3b
Merge branch 'dev' into ob-dialect
whhe Jul 7, 2023
a19e2a2
comments addressed
whhe Jul 7, 2023
6a4ae96
add compare result in ut
whhe Jul 7, 2023
465d9e5
set pulsar container timeout to 3 min
whhe Jul 7, 2023
854e9a9
fix comments
whhe Jul 10, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions docs/en/connector-v2/sink/Jdbc.md
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ support `Xa transactions`. You can set `is_exactly_once=true` to enable it.
| user | String | No | - |
| password | String | No | - |
| query | String | No | - |
| compatible_mode | String | No | - |
| database | String | No | - |
| table | String | No | - |
| primary_keys | Array | No | - |
Expand Down Expand Up @@ -69,6 +70,10 @@ The URL of the JDBC connection. Refer to a case: jdbc:postgresql://localhost/tes

Use this sql write upstream input datas to database. e.g `INSERT ...`

### compatible_mode [string]

The compatible mode of database, required when the database supports multiple compatible modes. For example, when using OceanBase database, you need to set it to 'mysql' or 'oracle'.

### database [string]

Use this `database` and `table-name` auto-generate sql and receive upstream input datas write to database.
Expand Down Expand Up @@ -168,6 +173,7 @@ there are some reference value for params above.
| Redshift | com.amazon.redshift.jdbc42.Driver | jdbc:redshift://localhost:5439/testdb | com.amazon.redshift.xa.RedshiftXADataSource | https://mvnrepository.com/artifact/com.amazon.redshift/redshift-jdbc42 |
| Snowflake | net.snowflake.client.jdbc.SnowflakeDriver | jdbc:snowflake://<account_name>.snowflakecomputing.com | / | https://mvnrepository.com/artifact/net.snowflake/snowflake-jdbc |
| Vertica | com.vertica.jdbc.Driver | jdbc:vertica://localhost:5433 | / | https://repo1.maven.org/maven2/com/vertica/jdbc/vertica-jdbc/12.0.3-0/vertica-jdbc-12.0.3-0.jar |
| OceanBase | com.oceanbase.jdbc.Driver | jdbc:oceanbase://localhost:2881 | / | https://repo1.maven.org/maven2/com/oceanbase/oceanbase-client/2.4.3/oceanbase-client-2.4.3.jar |

## Example

Expand Down
186 changes: 186 additions & 0 deletions docs/en/connector-v2/sink/OceanBase.md

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions docs/en/connector-v2/source/Jdbc.md
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ supports query SQL and can achieve projection effect.
| user | String | No | - |
| password | String | No | - |
| query | String | Yes | - |
| compatible_mode | String | No | - |
| connection_check_timeout_sec | Int | No | 30 |
| partition_column | String | No | - |
| partition_upper_bound | Long | No | - |
Expand Down Expand Up @@ -63,6 +64,10 @@ The URL of the JDBC connection. Refer to a case: jdbc:postgresql://localhost/tes

Query statement

### compatible_mode [string]

The compatible mode of database, required when the database supports multiple compatible modes. For example, when using OceanBase database, you need to set it to 'mysql' or 'oracle'.

### connection_check_timeout_sec [int]

The time in seconds to wait for the database operation used to validate the connection to complete.
Expand Down Expand Up @@ -120,6 +125,7 @@ there are some reference value for params above.
| Snowflake | net.snowflake.client.jdbc.SnowflakeDriver | jdbc:snowflake://<account_name>.snowflakecomputing.com | https://mvnrepository.com/artifact/net.snowflake/snowflake-jdbc |
| Redshift | com.amazon.redshift.jdbc42.Driver | jdbc:redshift://localhost:5439/testdb?defaultRowFetchSize=1000 | https://mvnrepository.com/artifact/com.amazon.redshift/redshift-jdbc42 |
| Vertica | com.vertica.jdbc.Driver | jdbc:vertica://localhost:5433 | https://repo1.maven.org/maven2/com/vertica/jdbc/vertica-jdbc/12.0.3-0/vertica-jdbc-12.0.3-0.jar |
| OceanBase | com.oceanbase.jdbc.Driver | jdbc:oceanbase://localhost:2881 | https://repo1.maven.org/maven2/com/oceanbase/oceanbase-client/2.4.3/oceanbase-client-2.4.3.jar |

## Example

Expand Down
168 changes: 168 additions & 0 deletions docs/en/connector-v2/source/OceanBase.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,168 @@
# OceanBase

> JDBC OceanBase Source Connector

## Support Those Engines

> Spark<br/>
> Flink<br/>
> Seatunnel Zeta<br/>

## Key Features

- [x] [batch](../../concept/connector-v2-features.md)
- [ ] [stream](../../concept/connector-v2-features.md)
- [x] [exactly-once](../../concept/connector-v2-features.md)
- [x] [column projection](../../concept/connector-v2-features.md)
- [x] [parallelism](../../concept/connector-v2-features.md)
- [x] [support user-defined split](../../concept/connector-v2-features.md)

## Description

Read external data source data through JDBC.

## Supported DataSource Info

| Datasource | Supported versions | Driver | Url | Maven |
|------------|--------------------------------|---------------------------|--------------------------------------|-------------------------------------------------------------------------------|
| OceanBase | All OceanBase server versions. | com.oceanbase.jdbc.Driver | jdbc:oceanbase://localhost:2883/test | [Download](https://mvnrepository.com/artifact/com.oceanbase/oceanbase-client) |

## Database Dependency

> Please download the support list corresponding to 'Maven' and copy it to the '$SEATNUNNEL_HOME/plugins/jdbc/lib/' working directory<br/>
> For example: cp oceanbase-client-xxx.jar $SEATNUNNEL_HOME/plugins/jdbc/lib/

## Data Type Mapping

### Mysql Mode

| Mysql Data type | Seatunnel Data type |
|-----------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------|
| BIT(1)<br/>INT UNSIGNED | BOOLEAN |
| TINYINT<br/>TINYINT UNSIGNED<br/>SMALLINT<br/>SMALLINT UNSIGNED<br/>MEDIUMINT<br/>MEDIUMINT UNSIGNED<br/>INT<br/>INTEGER<br/>YEAR | INT |
| INT UNSIGNED<br/>INTEGER UNSIGNED<br/>BIGINT | BIGINT |
| BIGINT UNSIGNED | DECIMAL(20,0) |
| DECIMAL(x,y)(Get the designated column's specified column size.<38) | DECIMAL(x,y) |
| DECIMAL(x,y)(Get the designated column's specified column size.>38) | DECIMAL(38,18) |
| DECIMAL UNSIGNED | DECIMAL((Get the designated column's specified column size)+1,<br/>(Gets the designated column's number of digits to right of the decimal point.))) |
| FLOAT<br/>FLOAT UNSIGNED | FLOAT |
| DOUBLE<br/>DOUBLE UNSIGNED | DOUBLE |
| CHAR<br/>VARCHAR<br/>TINYTEXT<br/>MEDIUMTEXT<br/>TEXT<br/>LONGTEXT<br/>JSON | STRING |
| DATE | DATE |
| TIME | TIME |
| DATETIME<br/>TIMESTAMP | TIMESTAMP |
| TINYBLOB<br/>MEDIUMBLOB<br/>BLOB<br/>LONGBLOB<br/>BINARY<br/>VARBINAR<br/>BIT(n) | BYTES |
| GEOMETRY<br/>UNKNOWN | Not supported yet |

### Oracle Mode

| Oracle Data type | Seatunnel Data type |
|-----------------------------------------------------------|---------------------|
| Number(p), p <= 9 | INT |
| Number(p), p <= 18 | BIGINT |
| Number(p), p > 18 | DECIMAL(38,18) |
| REAL<br/> BINARY_FLOAT | FLOAT |
| BINARY_DOUBLE | DOUBLE |
| CHAR<br/>NCHAR<br/>NVARCHAR2<br/>NCLOB<br/>CLOB<br/>ROWID | STRING |
| DATE | DATE |
| TIMESTAMP<br/>TIMESTAMP WITH LOCAL TIME ZONE | TIMESTAMP |
| BLOB<br/>RAW<br/>LONG RAW<br/>BFILE | BYTES |
| UNKNOWN | Not supported yet |

## Source Options

| Name | Type | Required | Default | Description |
|------------------------------|--------|----------|-----------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| url | String | Yes | - | The URL of the JDBC connection. Refer to a case: jdbc:oceanbase://localhost:2883/test |
| driver | String | Yes | - | The jdbc class name used to connect to the remote data source, should be `com.oceanbase.jdbc.Driver`. |
| user | String | No | - | Connection instance user name |
| password | String | No | - | Connection instance password |
| compatible_mode | String | Yes | - | The compatible mode of OceanBase, can be 'mysql' or 'oracle'. |
| query | String | Yes | - | Query statement |
| connection_check_timeout_sec | Int | No | 30 | The time in seconds to wait for the database operation used to validate the connection to complete |
| partition_column | String | No | - | The column name for parallelism's partition, only support numeric type column and string type column. |
| partition_lower_bound | Long | No | - | The partition_column min value for scan, if not set SeaTunnel will query database get min value. |
| partition_upper_bound | Long | No | - | The partition_column max value for scan, if not set SeaTunnel will query database get max value. |
| partition_num | Int | No | job parallelism | The number of partition count, only support positive integer. Default value is job parallelism. |
| fetch_size | Int | No | 0 | For queries that return a large number of objects, you can configure <br/> the row fetch size used in the query to improve performance by <br/> reducing the number database hits required to satisfy the selection criteria.<br/> Zero means use jdbc default value. |
| common-options | | No | - | Source plugin common parameters, please refer to [Source Common Options](common-options.md) for details |

### Tips

> If partition_column is not set, it will run in single concurrency, and if partition_column is set, it will be executed in parallel according to the concurrency of tasks.

## Task Example

### Simple:

```
env {
execution.parallelism = 2
job.mode = "BATCH"
}

source {
Jdbc {
driver = "com.oceanbase.jdbc.Driver"
url = "jdbc:oceanbase://localhost:2883/test?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true"
user = "root"
password = ""
compatible_mode = "mysql"
query = "select * from source"
}
}

transform {
# If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
# please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
Console {}
}
```

### Parallel:

> Read your query table in parallel with the shard field you configured and the shard data. You can do this if you want to read the whole table

```
source {
Jdbc {
driver = "com.oceanbase.jdbc.Driver"
url = "jdbc:oceanbase://localhost:2883/test?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true"
user = "root"
password = ""
compatible_mode = "mysql"
query = "select * from source"
# Parallel sharding reads fields
partition_column = "id"
# Number of fragments
partition_num = 10
}
}
```

### Parallel Boundary:

> It is more efficient to read your data source according to the upper and lower boundaries you configured

```
source {
Jdbc {
driver = "com.oceanbase.jdbc.Driver"
url = "jdbc:oceanbase://localhost:2883/test?useUnicode=true&characterEncoding=UTF-8&rewriteBatchedStatements=true"
user = "root"
password = ""
compatible_mode = "mysql"
query = "select * from source"
partition_column = "id"
partition_num = 10
# Read start boundary
partition_lower_bound = 1
# Read end boundary
partition_upper_bound = 500
}
}
```

Loading
Loading