Reformating and enhancing the RosettaDB documentation, initial version
Showing 15 changed files with 772 additions and 728 deletions.
@@ -0,0 +1,85 @@
#### apply
Compares the current model with the state of the database, generates DDL for the changes, and applies it to the database. If `git_auto_commit` is set to `true` in `main.conf`, the new model is automatically pushed to the Git repository of the rosetta project.

    rosetta [-c, --config CONFIG_FILE] apply [-h, --help] [-s, --source CONNECTION_NAME] [-m, --model MODEL_FILE]

Parameter | Description
--- | ---
-h, --help | Show the help message and exit.
-c, --config CONFIG_FILE | YAML config file. If none is supplied, main.conf in the current directory is used if it exists.
-s, --source CONNECTION_NAME | The source connection, which determines the models and the connection to use.
-m, --model MODEL_FILE (Optional) | The model file to use for apply. Default is `model.yaml`.

Example:

(Actual database)
```yaml
---
safeMode: false
databaseType: "mysql"
operationLevel: database
tables:
  - name: "actor"
    type: "TABLE"
    columns:
      - name: "actor_id"
        typeName: "SMALLINT UNSIGNED"
        ordinalPosition: 0
        primaryKeySequenceId: 1
        columnDisplaySize: 5
        scale: 0
        precision: 5
        nullable: false
        primaryKey: true
        autoincrement: false
        tests:
          assertion:
            - operator: '='
              value: 16
              expected: 1
```
(Expected database)
```yaml
---
safeMode: false
databaseType: "mysql"
operationLevel: database
tables:
  - name: "actor"
    type: "TABLE"
    columns:
      - name: "actor_id"
        typeName: "SMALLINT UNSIGNED"
        ordinalPosition: 0
        primaryKeySequenceId: 1
        columnDisplaySize: 5
        scale: 0
        precision: 5
        nullable: false
        primaryKey: true
        autoincrement: false
        tests:
          assertion:
            - operator: '='
              value: 16
              expected: 1
      - name: "first_name"
        typeName: "VARCHAR"
        ordinalPosition: 0
        primaryKeySequenceId: 0
        columnDisplaySize: 45
        scale: 0
        precision: 45
        nullable: false
        primaryKey: false
        autoincrement: false
        tests:
          assertion:
            - operator: '!='
              value: 'Michael'
              expected: 1
```
Description: The actual database does not contain `first_name`, so `apply` is expected to alter the table and add the column. Afterwards, the source directory contains the executed DDL and a snapshot of the current database.
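
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not rosetta's implementation: the helper names `added_columns` and `add_column_ddl` are hypothetical, the models are reduced to the fields the sketch needs, and the `ALTER TABLE` syntax is an assumed MySQL-style rendering rather than the DDL rosetta actually emits.

```python
def added_columns(actual_table, expected_table):
    """Return columns present in the expected model but missing from the actual one."""
    existing = {col["name"] for col in actual_table["columns"]}
    return [col for col in expected_table["columns"] if col["name"] not in existing]


def add_column_ddl(table_name, column):
    """Render one ALTER TABLE statement (assumed MySQL-style syntax)."""
    type_name = column["typeName"]
    if type_name == "VARCHAR":
        # VARCHAR needs a length; the model stores it in `precision`.
        type_name = f"VARCHAR({column['precision']})"
    null_clause = "NULL" if column["nullable"] else "NOT NULL"
    return f"ALTER TABLE {table_name} ADD COLUMN {column['name']} {type_name} {null_clause};"


# Reduced versions of the two models from the example above.
actual = {"name": "actor", "columns": [
    {"name": "actor_id", "typeName": "SMALLINT UNSIGNED", "precision": 5, "nullable": False},
]}
expected = {"name": "actor", "columns": [
    {"name": "actor_id", "typeName": "SMALLINT UNSIGNED", "precision": 5, "nullable": False},
    {"name": "first_name", "typeName": "VARCHAR", "precision": 45, "nullable": False},
]}

statements = [add_column_ddl(expected["name"], col) for col in added_columns(actual, expected)]
for statement in statements:
    print(statement)
```

With the two models above, the sketch yields a single statement that adds `first_name` to `actor`.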
@@ -0,0 +1,18 @@
#### compile
Generates DDL for a target database from the source DBML that was produced by the `extract` command.

    rosetta [-c, --config CONFIG_FILE] compile [-h, --help] [-t, --target CONNECTION_NAME] [-s, --source CONNECTION_NAME] [-d, --with-drop]

Parameter | Description
--- | ---
-h, --help | Show the help message and exit.
-c, --config CONFIG_FILE | YAML config file. If none is supplied, main.conf in the current directory is used if it exists.
-s, --source CONNECTION_NAME (Optional) | The source connection name where models are generated.
-t, --target CONNECTION_NAME | The target connection name to which the source DBML is converted.
-d, --with-drop | Add statements to drop tables when generating DDL.

Example:
```sql
CREATE SCHEMA breathe;
CREATE TABLE breathe.profiles(id INTEGER not null AUTO_INCREMENT, name STRING not null);
```
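
To make the effect of `--with-drop` concrete, here is a minimal Python sketch that renders a table definition like the one above, optionally preceded by a drop statement. The functions `column_sql` and `render_ddl` are hypothetical, the column dictionaries are a reduced form of the model, and the exact `DROP TABLE IF EXISTS` wording is an assumption; rosetta's real output may differ.

```python
def column_sql(column):
    """Render one column definition from a reduced model entry."""
    parts = [column["name"], column["typeName"]]
    if not column["nullable"]:
        parts.append("not null")
    if column["autoincrement"]:
        parts.append("AUTO_INCREMENT")
    return " ".join(parts)


def render_ddl(schema, table, columns, with_drop=False):
    """Return the statements for one table, optionally preceded by a drop."""
    statements = []
    if with_drop:
        # Assumed form of the drop statement; shown only to illustrate the flag.
        statements.append(f"DROP TABLE IF EXISTS {schema}.{table};")
    body = ", ".join(column_sql(col) for col in columns)
    statements.append(f"CREATE TABLE {schema}.{table}({body});")
    return statements


columns = [
    {"name": "id", "typeName": "INTEGER", "nullable": False, "autoincrement": True},
    {"name": "name", "typeName": "STRING", "nullable": False, "autoincrement": False},
]

for statement in render_ddl("breathe", "profiles", columns, with_drop=True):
    print(statement)
```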
@@ -0,0 +1,10 @@
#### dbt
Generates dbt models for the source DBML that was produced by the `extract` command.

    rosetta [-c, --config CONFIG_FILE] dbt [-h, --help] [-s, --source CONNECTION_NAME]

Parameter | Description
--- | ---
-h, --help | Show the help message and exit.
-c, --config CONFIG_FILE | YAML config file. If none is supplied, main.conf in the current directory is used if it exists.
-s, --source CONNECTION_NAME | The source connection name where models are generated.
@@ -0,0 +1,23 @@
#### diff
Shows the differences between the local model and the database: tables that were added or removed, and columns that have changed.

    rosetta [-c, --config CONFIG_FILE] diff [-h, --help] [-s, --source CONNECTION_NAME] [-m, --model MODEL_FILE]

Parameter | Description
--- | ---
-h, --help | Show the help message and exit.
-c, --config CONFIG_FILE | YAML config file. If none is supplied, main.conf in the current directory is used if it exists.
-s, --source CONNECTION_NAME | The source connection, which determines the models and the connection to use.
-m, --model MODEL_FILE (Optional) | The model file to use for the diff. Default is `model.yaml`.

Example:
```
There are changes between local model and targeted source
Table Changed: Table 'actor' columns changed
Column Changed: Column 'actor_id' in table 'actor' changed 'Precision'. New value: '1', old value: '5'
Column Changed: Column 'actor_id' in table 'actor' changed 'Autoincrement'. New value: 'true', old value: 'false'
Column Changed: Column 'actor_id' in table 'actor' changed 'Primary key'. New value: 'false', old value: 'true'
Column Changed: Column 'actor_id' in table 'actor' changed 'Nullable'. New value: 'true', old value: 'false'
Table Added: Table 'address'
```
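
The column-level lines in this output can be reproduced with a small attribute-by-attribute comparison. The Python sketch below is an assumption about how such a report could be built, not rosetta's code: `LABELS`, `fmt`, and `column_changes` are hypothetical names, and the sketch deliberately leaves open which side (model or database) counts as "old" and which as "new".

```python
# Model keys mapped to the labels used in the sample output above.
LABELS = {
    "precision": "Precision",
    "autoincrement": "Autoincrement",
    "primaryKey": "Primary key",
    "nullable": "Nullable",
}


def fmt(value):
    """Format values the way the sample output shows them (lowercase booleans)."""
    if isinstance(value, bool):
        return str(value).lower()
    return str(value)


def column_changes(table, old, new):
    """Return one report line per attribute that differs between two column entries."""
    lines = []
    for key, label in LABELS.items():
        if old[key] != new[key]:
            lines.append(
                f"Column Changed: Column '{new['name']}' in table '{table}' "
                f"changed '{label}'. New value: '{fmt(new[key])}', old value: '{fmt(old[key])}'"
            )
    return lines


old_column = {"name": "actor_id", "precision": 5, "autoincrement": False,
              "primaryKey": True, "nullable": False}
new_column = {"name": "actor_id", "precision": 1, "autoincrement": True,
              "primaryKey": False, "nullable": True}

changes = column_changes("actor", old_column, new_column)
for line in changes:
    print(line)
```

Comparing only a fixed set of keys keeps the report stable when the model gains fields that are irrelevant to the schema, such as `tests`.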
@@ -0,0 +1,75 @@
## Downloading Drivers
You need JDBC drivers to connect to the sources and targets that you will use with the rosetta tool.
The JDBC drivers for the databases that rosetta supports can be downloaded from the following URLs:

- [BigQuery JDBC 4.2](https://storage.googleapis.com/simba-bq-release/jdbc/SimbaJDBCDriverforGoogleBigQuery42_1.3.0.1001.zip)
- [Snowflake JDBC 3.13.19](https://repo1.maven.org/maven2/net/snowflake/snowflake-jdbc/3.13.19/snowflake-jdbc-3.13.19.jar)
- [PostgreSQL JDBC 42.3.7](https://jdbc.postgresql.org/download/postgresql-42.3.7.jar)
- [MySQL JDBC 8.0.30](https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.30.zip)
- [Kinetica JDBC 7.1.7.7](https://github.com/kineticadb/kinetica-client-jdbc/archive/refs/tags/v7.1.7.7.zip)
- [Google Cloud Spanner JDBC 2.6.2](https://search.maven.org/remotecontent?filepath=com/google/cloud/google-cloud-spanner-jdbc/2.6.2/google-cloud-spanner-jdbc-2.6.2-single-jar-with-dependencies.jar)
- [SQL Server JDBC 12.2.0](https://go.microsoft.com/fwlink/?linkid=2223050)
- [DB2 JDBC jcc4](https://repo1.maven.org/maven2/com/ibm/db2/jcc/db2jcc/db2jcc4/db2jcc-db2jcc4.jar)
- [Oracle JDBC 23.2.0.0](https://download.oracle.com/otn-pub/otn_software/jdbc/232-DeveloperRel/ojdbc11.jar)

### Example connection string configurations for databases

### BigQuery (service-based authentication OAuth 0)
```
url: jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<PROJECT-ID>;AdditionalProjects=bigquery-public-data;OAuthType=0;OAuthServiceAcctEmail=<EMAIL>;OAuthPvtKeyPath=<SERVICE-ACCOUNT-PATH>
```

### BigQuery (pre-generated token authentication OAuth 2)
```
url: jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;OAuthType=2;ProjectId=<PROJECT-ID>;OAuthAccessToken=<ACCESS-TOKEN>;OAuthRefreshToken=<REFRESH-TOKEN>;OAuthClientId=<CLIENT-ID>;OAuthClientSecret=<CLIENT-SECRET>;
```

### BigQuery (application default credentials authentication OAuth 3)
```
url: jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;OAuthType=3;ProjectId=<PROJECT-ID>;
```

### Snowflake
```
url: jdbc:snowflake://<HOST>:443/?db=<DATABASE>&user=<USER>&password=<PASSWORD>
```

### PostgreSQL
```
url: jdbc:postgresql://<HOST>:5432/<DATABASE>?user=<USER>&password=<PASSWORD>
```

### MySQL
```
url: jdbc:mysql://<USER>:<PASSWORD>@<HOST>:3306/<DATABASE>
```

### Kinetica
```
url: jdbc:kinetica:URL=http://<HOST>:9191;CombinePrepareAndExecute=1
```

### Google Cloud Spanner
```
url: jdbc:cloudspanner:/projects/my-project/instances/my-instance/databases/my-db;credentials=/path/to/credentials.json
```

### Google Cloud Spanner (Emulator)
```
url: jdbc:cloudspanner://localhost:9010/projects/test/instances/test/databases/test?autoConfigEmulator=true
```

### SQL Server
```
url: jdbc:sqlserver://<HOST>:1433;databaseName=<DATABASE>
```

### DB2
```
url: jdbc:db2://<HOST>:50000/<DATABASE>
```

### Oracle
```
url: jdbc:oracle:thin:@<HOST>:1521:<SID>
```
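
When the placeholders in these templates are filled in programmatically, credentials that contain characters such as `@` or `&` need to be percent-encoded before they are placed in a query string. The Python sketch below fills in the PostgreSQL and SQL Server templates from this section; the function names and the sample host, database, and credentials are hypothetical, and the assumption is that the driver decodes percent-encoded parameter values.

```python
from urllib.parse import quote


def postgres_url(host, database, user, password, port=5432):
    """Fill in the PostgreSQL template, percent-encoding the credentials."""
    return (
        f"jdbc:postgresql://{host}:{port}/{database}"
        f"?user={quote(user, safe='')}&password={quote(password, safe='')}"
    )


def sqlserver_url(host, database, port=1433):
    """Fill in the SQL Server template, which uses ';' separated properties."""
    return f"jdbc:sqlserver://{host}:{port};databaseName={database}"


# Hypothetical values, for illustration only.
print(postgres_url("db.example.com", "sakila", "bob", "p@ss&word"))
print(sqlserver_url("db.example.com", "sakila"))
```

The resulting strings can be pasted as the `url:` value of a connection in `main.conf`.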
@@ -0,0 +1,22 @@
#### drivers
Lists the drivers defined in a `drivers.yaml` file. Choosing a driver by its index downloads it to the `ROSETTA_DRIVERS` directory, where it is ready to use.

    rosetta drivers [-h, --help] [-f, --file DRIVERS_FILE] [--list] <indexToDownload> [-dl, --download]

Parameter | Description
--- | ---
-h, --help | Show the help message and exit.
-f, --file DRIVERS_FILE | YAML drivers file path. If none is supplied, drivers.yaml in the current directory is used, falling back to the default one.
--list | Used to list all available drivers.
-dl, --download | Used to download the selected driver by index.
indexToDownload | Chooses which driver to download, based on the index of the driver.

***Example*** (drivers.yaml)

```yaml
- name: MySQL 8.0.30
  link: https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.30.zip
- name: Postgresql 42.3.7
  link: https://jdbc.postgresql.org/download/postgresql-42.3.7.jar
```
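
The list-then-pick-by-index flow can be sketched in Python as follows. This is an illustration only: `list_drivers` and `pick_driver` are hypothetical helpers, the driver list is written as Python data instead of being parsed from YAML, and the 1-based indexing is an assumption, since the text above does not say where the index starts.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# The two entries from the drivers.yaml example, as plain Python data.
drivers = [
    {"name": "MySQL 8.0.30",
     "link": "https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-8.0.30.zip"},
    {"name": "Postgresql 42.3.7",
     "link": "https://jdbc.postgresql.org/download/postgresql-42.3.7.jar"},
]


def list_drivers(entries):
    """Return one numbered line per driver (1-based numbering is an assumption)."""
    return [f"{index}. {entry['name']}" for index, entry in enumerate(entries, start=1)]


def pick_driver(entries, index):
    """Return the name and the file name derived from the link of the chosen driver."""
    if not 1 <= index <= len(entries):
        raise IndexError(f"driver index {index} is out of range 1..{len(entries)}")
    entry = entries[index - 1]
    filename = PurePosixPath(urlparse(entry["link"]).path).name
    return entry["name"], filename


for line in list_drivers(drivers):
    print(line)
print(pick_driver(drivers, 2))
```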
@@ -0,0 +1,46 @@
#### extract
Extracts the schema from a database and generates declarative DBML models that can be converted to alternate database targets.

    rosetta [-c, --config CONFIG_FILE] extract [-h, --help] [-s, --source CONNECTION_NAME] [-t, --convert-to CONNECTION_NAME]

Parameter | Description
--- | ---
-h, --help | Show the help message and exit.
-c, --config CONFIG_FILE | YAML config file. If none is supplied, main.conf in the current directory is used if it exists.
-s, --source CONNECTION_NAME | The source connection name to extract the schema from.
-t, --convert-to CONNECTION_NAME (Optional) | The target connection name to which the source DBML is converted.

Example:
```yaml
---
safeMode: false
databaseType: bigquery
operationLevel: database
tables:
  - name: "profiles"
    type: "TABLE"
    schema: "breathe"
    columns:
      - name: "id"
        typeName: "INT64"
        jdbcDataType: "4"
        ordinalPosition: 0
        primaryKeySequenceId: 1
        columnDisplaySize: 10
        scale: 0
        precision: 10
        primaryKey: false
        nullable: false
        autoincrement: true
      - name: "name"
        typeName: "STRING"
        jdbcDataType: "12"
        ordinalPosition: 0
        primaryKeySequenceId: 0
        columnDisplaySize: 255
        scale: 0
        precision: 255
        primaryKey: false
        nullable: false
        autoincrement: false
```
@@ -0,0 +1,13 @@
#### generate
Generates a Spark Python file or a Spark Scala file. The command first extracts the schema from the source database and reads the connection properties from the source connection. It then creates a Python or Scala file that translates the schema and is ready to transfer data from the source to the target.

    rosetta [-c, --config CONFIG_FILE] generate [-h, --help] [-s, --source CONNECTION_NAME] [-t, --target CONNECTION_NAME] [--pyspark] [--scala]

Parameter | Description
--- | ---
-h, --help | Show the help message and exit.
-c, --config CONFIG_FILE | YAML config file. If none is supplied, main.conf in the current directory is used if it exists.
-s, --source CONNECTION_NAME | The source connection name to extract the schema from.
-t, --target CONNECTION_NAME | The target connection name where the data will be transferred.
--pyspark | Generates the Spark SQL file.
--scala | Generates the Scala SQL file.
@@ -0,0 +1,30 @@
#### init
Generates a project directory (if a project name is specified), a default configuration file located in the current directory with example connections for `bigquery` and `snowflake`, and the model directory.

    rosetta init [PROJECT_NAME]

Parameter | Description
--- | ---
(Optional) PROJECT_NAME | Project name (directory) where the configuration file and model directory will be created.

Example:
```yaml
# example with 2 connections
connections:
  - name: snowflake_weather_prod
    databaseName: SNOWFLAKE_SAMPLE_DATA
    schemaName: WEATHER
    dbType: snowflake
    url: jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?<connection_params>
    userName: bob
    password: bobPassword
  - name: bigquery_prod
    databaseName: bigquery-public-data
    schemaName: breathe
    dbType: bigquery
    url: jdbc:bigquery://[Host]:[Port];ProjectId=[Project];OAuthType=[AuthValue];[Property1]=[Value1];[Property2]=[Value2];...
    userName: user
    password: password
    tables:
      - bigquery_table
```
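
The `name` of each connection is what the `-s, --source` and `-t, --target` options of the other commands refer to. As a rough illustration of that lookup, the Python sketch below resolves a connection by name from an already-parsed configuration; `find_connection` is a hypothetical helper, the configuration is reduced to the fields the sketch needs, and rejecting duplicate names is an assumption rather than documented rosetta behavior.

```python
# A reduced, already-parsed form of the configuration above.
config = {
    "connections": [
        {"name": "snowflake_weather_prod", "dbType": "snowflake",
         "databaseName": "SNOWFLAKE_SAMPLE_DATA", "schemaName": "WEATHER"},
        {"name": "bigquery_prod", "dbType": "bigquery",
         "databaseName": "bigquery-public-data", "schemaName": "breathe",
         "tables": ["bigquery_table"]},
    ]
}


def find_connection(parsed_config, name):
    """Return the single connection with the given name, or raise LookupError."""
    matches = [conn for conn in parsed_config["connections"] if conn["name"] == name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one connection named {name!r}, found {len(matches)}"
        )
    return matches[0]


source = find_connection(config, "bigquery_prod")
print(source["dbType"], source["schemaName"])
```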