Removed unsupported Azure Blob storage via import into (#15859) (#15891)
ti-chi-bot authored Dec 27, 2023
1 parent 1791116 commit 0981356
Showing 4 changed files with 7 additions and 20 deletions.
8 changes: 1 addition & 7 deletions external-storage-uri.md
@@ -79,14 +79,8 @@ gcs://external/test.csv?credentials-file=${credentials-file-path}
- `encryption-scope`: Specifies the [encryption scope](https://learn.microsoft.com/en-us/azure/storage/blobs/encryption-scope-manage?tabs=powershell#upload-a-blob-with-an-encryption-scope) for server-side encryption.
- `encryption-key`: Specifies the [encryption key](https://learn.microsoft.com/en-us/azure/storage/blobs/encryption-customer-provided-keys) for server-side encryption, which uses the AES256 encryption algorithm.
The following is an example of an Azure Blob Storage URI for TiDB Lightning and BR. In this example, you need to specify a specific file path `testfolder`.
The following is an example of an Azure Blob Storage URI for BR. In this example, you need to specify a specific file path `testfolder`.
```shell
azure://external/testfolder?account-name=${account-name}&account-key=${account-key}
```
The following is an example of an Azure Blob Storage URI for [`IMPORT INTO`](/sql-statements/sql-statement-import-into.md). In this example, you need to specify a specific filename `test.csv`.
```shell
azure://external/test.csv?account-name=${account-name}&account-key=${account-key}
```
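
The BR form of the Azure Blob Storage URI in this hunk can be assembled from shell variables. The following is a minimal sketch, not part of the commit: the credential values are hypothetical placeholders, and it assumes the account key contains no characters that would need percent-encoding in a query string.

```shell
# Hypothetical placeholder credentials; substitute real values.
account_name="exampleaccount"
account_key="examplekey"

# Container `external` and folder `testfolder` follow the BR example in this hunk.
uri="azure://external/testfolder?account-name=${account_name}&account-key=${account_key}"

# Print the assembled URI so it can be inspected before use.
echo "${uri}"
```

The resulting string would then typically be supplied to BR as its storage location, for example through the `--storage` option.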
16 changes: 5 additions & 11 deletions sql-statements/sql-statement-import-into.md
@@ -19,7 +19,7 @@ This TiDB statement is not applicable to TiDB Cloud.
`IMPORT INTO` supports importing data from files stored in Amazon S3, GCS, and the TiDB local storage.

- For data files stored in Amazon S3, GCS, or Azure Blob Storage, `IMPORT INTO` supports running in the [TiDB backend task distributed execution framework](/tidb-distributed-execution-framework.md).
- For data files stored in Amazon S3 or GCS, `IMPORT INTO` supports running in the [TiDB Distributed eXecution Framework (DXF)](/tidb-distributed-execution-framework.md).

- When this framework is enabled ([tidb_enable_dist_task](/system-variables.md#tidb_enable_dist_task-new-in-v710) is `ON`), `IMPORT INTO` splits a data import job into multiple sub-jobs and distributes these sub-jobs to different TiDB nodes for execution to improve the import efficiency.
- When this framework is disabled, `IMPORT INTO` only supports running on the TiDB node where the current user is connected.
@@ -97,9 +97,9 @@ In the left side of the `SET` expression, you can only reference a column name t

### fileLocation

It specifies the storage location of the data file, which can be an Amazon S3, GCS, or Azure Blob Storage URI path, or a TiDB local file path.
It specifies the storage location of the data file, which can be an Amazon S3 or GCS URI path, or a TiDB local file path.

- Amazon S3, GCS, or Azure Blob Storage URI path: for URI configuration details, see [URI Formats of External Storage Services](/external-storage-uri.md).
- Amazon S3 or GCS URI path: for URI configuration details, see [URI Formats of External Storage Services](/external-storage-uri.md).
- TiDB local file path: it must be an absolute path, and the file extension must be `.csv`, `.sql`, or `.parquet`. Make sure that the files corresponding to this path are stored on the TiDB node connected by the current user, and the user has the `FILE` privilege.

> **Note:**
@@ -210,7 +210,7 @@ Assume that there are three files named `file-01.csv`, `file-02.csv`, and `file-
IMPORT INTO t FROM '/path/to/file-*.csv'
```

### Import data files from Amazon S3, GCS, or Azure Blob Storage
### Import data files from Amazon S3 or GCS

- Import data files from Amazon S3:

@@ -224,13 +224,7 @@ IMPORT INTO t FROM '/path/to/file-*.csv'
IMPORT INTO t FROM 'gs://import/test.csv?credentials-file=${credentials-file-path}';
```

- Import data files from Azure Blob Storage:

```sql
IMPORT INTO t FROM 'azure://import/test.csv?credentials-file=${credentials-file-path}';
```

For details about the URI path configuration for Amazon S3, GCS, or Azure Blob Storage, see [URI Formats of External Storage Services](/external-storage-uri.md).
For details about the URI path configuration for Amazon S3 or GCS, see [URI Formats of External Storage Services](/external-storage-uri.md).

### Calculate column values using SetClause

1 change: 0 additions & 1 deletion tidb-lightning/tidb-lightning-overview.md
@@ -18,7 +18,6 @@ TiDB Lightning can read data from the following sources:
- Local
- [Amazon S3](/external-storage-uri.md#amazon-s3-uri-format)
- [Google Cloud Storage](/external-storage-uri.md#gcs-uri-format)
- [Azure Blob Storage](/external-storage-uri.md#azure-blob-storage-uri-format)

## TiDB Lightning architecture

2 changes: 1 addition & 1 deletion tidb-lightning/tidb-lightning-physical-import-mode.md
@@ -78,7 +78,7 @@ It is recommended that you allocate CPU more than 32 cores and memory greater th
- Do not use multiple TiDB Lightning instances to import data to the same TiDB cluster by default. Use [Parallel Import](/tidb-lightning/tidb-lightning-distributed-import.md) instead.
- When you use multiple TiDB Lightning to import data to the same target cluster, do not mix the import modes. That is, do not use the physical import mode and the logical import mode at the same time.
- During the process of importing data, do not perform DDL and DML operations in the target table. Otherwise the import will fail or the data will be inconsistent. At the same time, it is not recommended to perform read operations, because the data you read might be inconsistent. You can perform read and write operations after the import operation is completed.
- A single Lightning process can import a single table of 10 TB at most. Parallel import can use 10 Lightning instances at most.
- A single Lightning process can import a single table of 10 TiB at most. Parallel import can use 10 Lightning instances at most.

### Tips for using with other components
