Adding upsert documentation for the Azure Database for PostgreSQL connector #126073

Open · wants to merge 2 commits into base: main
50 changes: 49 additions & 1 deletion articles/data-factory/connector-azure-database-for-postgresql.md
@@ -210,7 +210,10 @@ To copy data to Azure Database for PostgreSQL, the following properties are supp
|:--- |:--- |:--- |
| type | The type property of the copy activity sink must be set to **AzurePostgreSQLSink**. | Yes |
| preCopyScript | Specify a SQL query for the copy activity to execute before you write data into Azure Database for PostgreSQL in each run. You can use this property to clean up the preloaded data. | No |
| writeMethod | The method used to write data into Azure Database for PostgreSQL.<br>Allowed values are: **CopyCommand** (default, which is more performant), **BulkInsert**. | No |
| writeMethod | The method used to write data into Azure Database for PostgreSQL.<br>Allowed values are: **CopyCommand** (default, which is more performant), **BulkInsert**, and **Upsert**. | No |
| upsertSettings | Specify the group of settings for the write behavior.<br/>Applies when the `writeMethod` option is `Upsert`. | No |
| ***Under `upsertSettings`:*** | | |
| keys | Specify the column names for unique row identification. Either a single key or a series of keys can be used. | No |
| writeBatchSize | The number of rows loaded into Azure Database for PostgreSQL per batch.<br>Allowed value is an integer that represents the number of rows. | No (default is 1,000,000) |
| writeBatchTimeout | Wait time for the batch insert operation to complete before it times out.<br>Allowed values are Timespan strings. An example is 00:30:00 (30 minutes). | No (default is 00:30:00) |

@@ -248,6 +251,51 @@ To copy data to Azure Database for PostgreSQL, the following properties are supp
]
```

**Example 2: Upsert data**

```json
"activities":[
{
"name": "CopyToAzureDatabaseForPostgreSQL",
"type": "Copy",
"inputs": [
{
"referenceName": "<input dataset name>",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "<Azure PostgreSQL output dataset name>",
"type": "DatasetReference"
}
],
"typeProperties": {
"source": {
"type": "<source type>"
},
"sink": {
"type": "AzurePostgreSQLSink",
"writeMethod": "Upsert",
"upsertSettings": {
"keys": [
"<column name>"
]
},
}
}
}
]
```

### Append data

Appending data is the default behavior of this Azure Database for PostgreSQL sink connector. The service does a bulk insert to write to your table efficiently. You can configure the source and sink accordingly in the copy activity.
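
For reference, a minimal sink sketch for the default append behavior, using only the sink properties documented in the table above (the batch values are illustrative, not recommendations):

```json
"sink": {
    "type": "AzurePostgreSQLSink",
    "writeMethod": "CopyCommand",
    "writeBatchSize": 1000000,
    "writeBatchTimeout": "00:30:00"
}
```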

### Upsert data

Copy activity now natively supports upsert: it updates the data in the sink table when the key already exists, and otherwise inserts new data. To upsert data, you must provide a set of key columns that form a primary key or a unique constraint on the sink table.
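
Because `keys` accepts either a single key or a series of keys, a sink that upserts on a composite key could look like the following sketch (the column names are placeholders):

```json
"sink": {
    "type": "AzurePostgreSQLSink",
    "writeMethod": "Upsert",
    "upsertSettings": {
        "keys": [
            "<key column 1>",
            "<key column 2>"
        ]
    }
}
```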

## Parallel copy from Azure Database for PostgreSQL

The Azure Database for PostgreSQL connector in copy activity provides built-in data partitioning to copy data in parallel. You can find data partitioning options on the **Source** tab of the copy activity.
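
Purely as an illustrative sketch, a source configured for range-based parallel copy might look like the following. The property names `partitionOptions` and `partitionSettings` are assumptions, not confirmed by this excerpt; see the partition options documented in the rest of this section for the authoritative schema.

```json
"source": {
    "type": "<source type>",
    "partitionOptions": "<partition option, hypothetical, for example DynamicRange>",
    "partitionSettings": {
        "partitionColumnName": "<integer or date/time partition column, hypothetical>",
        "partitionLowerBound": "<lower bound of the partition column, hypothetical>",
        "partitionUpperBound": "<upper bound of the partition column, hypothetical>"
    }
}
```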