cloud: update export doc #18605

Closed
Changes from 38 commits (of 40 commits)
Commits
821ebde
update export doc
shiyuhang0 Aug 15, 2024
321f7d3
update export doc
shiyuhang0 Aug 16, 2024
fd166b1
add serverless external storage
shiyuhang0 Aug 20, 2024
8809133
some opt
shiyuhang0 Aug 20, 2024
e2f9a3e
update media
shiyuhang0 Aug 20, 2024
3bc123c
fix lint
shiyuhang0 Aug 22, 2024
34a0553
update title for TiDB Dedicated
hfxsd Aug 26, 2024
31b119d
Update tidb-cloud/serverless-external-storage.md
shiyuhang0 Aug 26, 2024
5758558
Apply suggestions from code review
shiyuhang0 Aug 26, 2024
17757ec
Apply suggestions from code review
shiyuhang0 Aug 27, 2024
68e7321
Update serverless-export.md
hfxsd Aug 27, 2024
c6c44c4
Apply suggestions from code review
hfxsd Aug 27, 2024
8385bbe
add role arn
shiyuhang0 Aug 27, 2024
1a8bfbb
Apply suggestions from code review
hfxsd Aug 27, 2024
d18be3f
Update serverless-external-storage.md
hfxsd Aug 27, 2024
06341eb
Update serverless-external-storage.md
hfxsd Aug 27, 2024
4abfc72
remove role arn
shiyuhang0 Aug 27, 2024
8358d74
Apply suggestions from code review
shiyuhang0 Aug 27, 2024
1e17e69
Apply suggestions from code review
hfxsd Aug 28, 2024
8973d8a
Apply suggestions from code review
hfxsd Aug 29, 2024
214e281
Apply suggestions from code review
hfxsd Aug 29, 2024
6d4930d
Update tidb-cloud/serverless-export.md
hfxsd Aug 29, 2024
42ceb61
fix lint
shiyuhang0 Aug 29, 2024
e29242b
Apply suggestions from code review
shiyuhang0 Aug 29, 2024
8bbeaad
opt
shiyuhang0 Aug 29, 2024
fe9eede
Apply suggestions from code review
hfxsd Aug 29, 2024
3b391b3
opt
shiyuhang0 Aug 29, 2024
d5d0717
Apply suggestions from code review
hfxsd Aug 29, 2024
f6aadb2
Apply suggestions from code review
hfxsd Aug 29, 2024
32fa6f2
Apply suggestions from code review
hfxsd Aug 30, 2024
f7bc7bf
opt pic
shiyuhang0 Aug 30, 2024
69b8c74
opt pic
shiyuhang0 Aug 30, 2024
f860d11
Update tidb-cloud/serverless-external-storage.md
hfxsd Aug 30, 2024
1a00c97
opt
shiyuhang0 Aug 30, 2024
f428ed8
Update config-s3-and-gcs-access.md
hfxsd Sep 2, 2024
b66e4f2
Apply suggestions from code review
hfxsd Sep 2, 2024
4c9657a
Update tidb-cloud/serverless-external-storage.md
hfxsd Sep 2, 2024
a44b530
Apply suggestions from code review
hfxsd Sep 2, 2024
eee6614
Apply suggestions from code review
hfxsd Sep 2, 2024
c2ca7ff
Apply suggestions from code review
hfxsd Sep 3, 2024
3 changes: 2 additions & 1 deletion TOC-tidb-cloud.md
@@ -232,7 +232,8 @@
- [Import Apache Parquet Files from Amazon S3 or GCS](/tidb-cloud/import-parquet-files.md)
- [Import with MySQL CLI](/tidb-cloud/import-with-mysql-cli.md)
- Reference
- [Configure Amazon S3 Access and GCS Access](/tidb-cloud/config-s3-and-gcs-access.md)
- [Configure External Storage Access for TiDB Dedicated](/tidb-cloud/config-s3-and-gcs-access.md)
- [Configure External Storage Access for TiDB Serverless](/tidb-cloud/serverless-external-storage.md)
- [Naming Conventions for Data Import](/tidb-cloud/naming-conventions-for-data-import.md)
- [CSV Configurations for Importing Data](/tidb-cloud/csv-config-for-import-data.md)
- [Troubleshoot Access Denied Errors during Data Import from Amazon S3](/tidb-cloud/troubleshoot-import-access-denied-error.md)
14 changes: 8 additions & 6 deletions tidb-cloud/config-s3-and-gcs-access.md
@@ -1,11 +1,13 @@
---
title: Configure Amazon S3 Access and GCS Access
title: Configure External Storage Access for TiDB Dedicated
summary: Learn how to configure Amazon Simple Storage Service (Amazon S3) access and Google Cloud Storage (GCS) access.
---

# Configure Amazon S3 Access and GCS Access
# Configure External Storage Access for TiDB Dedicated

If your source data is stored in Amazon S3 or Google Cloud Storage (GCS) buckets, before importing or migrating the data to TiDB Cloud, you need to configure cross-account access to the buckets. This document describes how to do this.
If your source data is stored in Amazon S3 or Google Cloud Storage (GCS) buckets, before importing or migrating the data to TiDB Cloud, you need to configure cross-account access to the buckets. This document describes how to do this for TiDB Dedicated clusters.

If you need to configure these external storages for TiDB Serverless clusters, see [Configure External Storage Access for TiDB Serverless](/tidb-cloud/serverless-external-storage.md).

## Configure Amazon S3 access

@@ -98,9 +100,9 @@ Configure the bucket access for TiDB Cloud and get the Role ARN as follows:

If the objects in your bucket have been copied from another encrypted bucket, the KMS key value needs to include the keys of both buckets. For example, `"Resource": ["arn:aws:kms:ap-northeast-1:105880447796:key/c3046e91-fdfc-4f3a-acff-00597dd3801f","arn:aws:kms:ap-northeast-1:495580073302:key/0d7926a7-6ecc-4bf7-a9c1-a38f0faec0cd"]`.

6. Click **Next: Tags**, add a tag of the policy (optional), and then click **Next:Review**.

7. Set a policy name, and then click **Create policy**.
6. Click **Next**.
7. Set a policy name, add a tag of the policy (optional), and then click **Create policy**.

3. In the AWS Management Console, create an access role for TiDB Cloud and get the role ARN.

247 changes: 198 additions & 49 deletions tidb-cloud/serverless-export.md

Large diffs are not rendered by default.

226 changes: 226 additions & 0 deletions tidb-cloud/serverless-external-storage.md
@@ -0,0 +1,226 @@
---
title: Configure External Storage Access for TiDB Serverless
summary: Learn how to configure Amazon Simple Storage Service (Amazon S3) access, Google Cloud Storage (GCS) access, and Azure Blob Storage access.
---

# Configure External Storage Access for TiDB Serverless

If you want a TiDB Serverless cluster to import data from or export data to an external storage, you need to configure cross-account access. This document describes how to configure access to external storage for TiDB Serverless clusters, including Amazon Simple Storage Service (Amazon S3), Google Cloud Storage (GCS), and Azure Blob Storage.

If you need to configure these external storages for a TiDB Dedicated cluster, see [Configure External Storage Access for TiDB Dedicated](/tidb-cloud/config-s3-and-gcs-access.md).

## Configure Amazon S3 access

To allow a TiDB Serverless cluster to access your Amazon S3 bucket, you need to configure the bucket access for the cluster. You can use either of the following methods to configure the bucket access:

- Use a Role ARN: use a Role ARN to access your Amazon S3 bucket.
- Use an AWS access key: use the access key of an IAM user to access your Amazon S3 bucket.

<SimpleTab>
<div label="Role ARN">

It is recommended that you use [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html) to create a role ARN. Take the following steps to create one:

1. Open the **Import** page for your target cluster.

1. Log in to the [TiDB Cloud console](https://tidbcloud.com/) and navigate to the [**Clusters**](https://tidbcloud.com/console/clusters) page of your project.

2. Click the name of your target cluster to go to its overview page, and then click **Import** in the left navigation pane.

2. Open the **Add New ARN** dialog.

- If you want to import data from Amazon S3, open the **Add New ARN** dialog as follows:

1. Click **Import from S3**.
2. Fill in the **File URI** field.
3. Choose **AWS Role ARN** and click **Click here to create new one with AWS CloudFormation**.

- If you want to export data to Amazon S3, open the **Add New ARN** dialog as follows:

1. Click **Click here to export data to** > **Amazon S3**.
2. Fill in the **Folder URI** field.
3. Choose **AWS Role ARN** and click **Click here to create new one with AWS CloudFormation**.

3. Create a role ARN with the AWS CloudFormation template.

1. In the **Add New ARN** dialog, click **AWS Console with CloudFormation Template**.

2. Log in to the [AWS Management Console](https://console.aws.amazon.com/) and you will be redirected to the AWS CloudFormation **Quick create stack** page.

3. Fill in the **Role Name**.

4. Select the acknowledgment that AWS CloudFormation will create a new role, and then click **Create stack** to create the role ARN.

5. After the CloudFormation stack is executed, you can click the **Outputs** tab and find the Role ARN value in the **Value** column.

![img.png](/media/tidb-cloud/serverless-external-storage/serverless-role-arn.png)
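
As a quick sanity check before pasting the value into TiDB Cloud, the following Python sketch validates the shape of a role ARN copied from the **Outputs** tab. The helper name `parse_role_arn` and the sample role name are hypothetical, and the pattern only covers the general IAM role ARN layout (`arn:<partition>:iam::<account ID>:role/<name>`). It does not check that the role exists or that TiDB Cloud can assume it.

```python
import re

# General IAM role ARN layout (an assumption of this sketch, not taken from TiDB Cloud):
# arn:<partition>:iam::<12-digit account ID>:role/<optional path/><role name>
ROLE_ARN_PATTERN = re.compile(
    r"^arn:(?P<partition>aws|aws-cn|aws-us-gov):iam::(?P<account_id>\d{12}):role/"
    r"(?P<role_path>(?:[\w+=,.@-]+/)*)(?P<role_name>[\w+=,.@-]{1,64})$",
    re.ASCII,
)


def parse_role_arn(value: str) -> dict:
    """Split a role ARN into its parts, or raise ValueError if it is malformed."""
    match = ROLE_ARN_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {value!r}")
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical value, for illustration only.
    print(parse_role_arn("arn:aws:iam::123456789012:role/tidb-cloud-export"))
```

A bucket ARN such as `arn:aws:s3:::tidb-cloud-source-data` is rejected by this check, which helps catch the common mistake of pasting the wrong ARN.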

If you have any trouble creating a role ARN with AWS CloudFormation, you can take the following steps to create one manually:

<details>
<summary>Click here to see details</summary>

1. In the **Add New ARN** dialog described in the previous instructions, click **Having trouble? Create Role ARN manually**. You will get the **TiDB Cloud Account ID** and **TiDB Cloud External ID**.

2. In the AWS Management Console, create a managed policy for your Amazon S3 bucket.

1. Sign in to the [AWS Management Console](https://console.aws.amazon.com/) and open the [Amazon S3 console](https://console.aws.amazon.com/s3/).

2. In the **Buckets** list, choose the name of your bucket with the source data, and then click **Copy ARN** to get your S3 bucket ARN (for example, `arn:aws:s3:::tidb-cloud-source-data`). Take a note of the bucket ARN for later use.

![Copy bucket ARN](/media/tidb-cloud/copy-bucket-arn.png)

3. Open the [IAM console](https://console.aws.amazon.com/iam/), click **Policies** in the left navigation pane, and then click **Create Policy**.

![Create a policy](/media/tidb-cloud/aws-create-policy.png)

4. On the **Create policy** page, click the **JSON** tab.

5. Configure the policy in the policy text field according to your needs. The following is an example that you can use to export data from and import data to a TiDB Serverless cluster.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:GetObjectVersion",
                "s3:PutObject"
            ],
            "Resource": "<Your S3 bucket ARN>/<Directory of your source data>/*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": "<Your S3 bucket ARN>"
        }
    ]
}
```

In the policy text field, replace the following configurations with your own values.

- `"Resource": "<Your S3 bucket ARN>/<Directory of your source data>/*"`. For example:

- If your source data is stored in the root directory of the `tidb-cloud-source-data` bucket, use `"Resource": "arn:aws:s3:::tidb-cloud-source-data/*"`.
- If your source data is stored in the `mydata` directory of the bucket, use `"Resource": "arn:aws:s3:::tidb-cloud-source-data/mydata/*"`.

Make sure that `/*` is added to the end of the directory so TiDB Cloud can access all files in this directory.

- `"Resource": "<Your S3 bucket ARN>"`, for example, `"Resource": "arn:aws:s3:::tidb-cloud-source-data"`.

- If you have enabled server-side encryption with AWS Key Management Service keys (SSE-KMS) using a customer-managed key, make sure the following configuration is included in the policy. `"arn:aws:kms:ap-northeast-1:105880447796:key/c3046e91-fdfc-4f3a-acff-00597dd3801f"` is a sample KMS key ARN of the bucket.

```json
{
    "Sid": "AllowKMSkey",
    "Effect": "Allow",
    "Action": [
        "kms:Decrypt"
    ],
    "Resource": "arn:aws:kms:ap-northeast-1:105880447796:key/c3046e91-fdfc-4f3a-acff-00597dd3801f"
}
```

- If the objects in your bucket have been copied from another encrypted bucket, the KMS key value needs to include the keys of both buckets. For example, `"Resource": ["arn:aws:kms:ap-northeast-1:105880447796:key/c3046e91-fdfc-4f3a-acff-00597dd3801f","arn:aws:kms:ap-northeast-1:495580073302:key/0d7926a7-6ecc-4bf7-a9c1-a38f0faec0cd"]`.

6. Click **Next**.

7. Set a policy name, add a tag of the policy (optional), and then click **Create policy**.

3. In the AWS Management Console, create an access role for TiDB Cloud and get the role ARN.

1. In the [IAM console](https://console.aws.amazon.com/iam/), click **Roles** in the left navigation pane, and then click **Create role**.

![Create a role](/media/tidb-cloud/aws-create-role.png)

2. To create a role, fill in the following information:

- In **Trusted entity type**, select **AWS account**.
- In **An AWS account**, select **Another AWS account**, and then paste the TiDB Cloud account ID to the **Account ID** field.
- In **Options**, click **Require external ID (Best practice when a third party will assume this role)**, and then paste the TiDB Cloud External ID to the **External ID** field. If the role is created without requiring an external ID, then once the configuration is done for one TiDB cluster in a project, all TiDB clusters in that project can use the same Role ARN to access your Amazon S3 bucket. If the role is created with both the account ID and the external ID, only the corresponding TiDB cluster can access the bucket.

3. Click **Next** to open the policy list, choose the policy you just created, and then click **Next**.

4. In **Role details**, set a name for the role, and then click **Create role** in the lower-right corner. After the role is created, the list of roles is displayed.

5. In the list of roles, click the name of the role that you just created to go to its summary page, and then you can get the role ARN.

![Copy AWS role ARN](/media/tidb-cloud/aws-role-arn.png)

</details>
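
The two JSON documents involved in the manual steps can also be generated rather than edited by hand. The following Python sketch builds the managed policy shown above from a bucket ARN and a directory, and, as an assumption not taken from this document, the trust policy that the IAM console typically produces when you select **Another AWS account** with **Require external ID**. The function names and sample values are hypothetical.

```python
import json


def build_s3_access_policy(bucket_arn, directory="", kms_key_arns=None):
    """Build the managed policy from the manual steps for one bucket directory."""
    prefix = directory.strip("/")
    # The trailing /* lets TiDB Cloud reach every file under the directory.
    object_resource = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"
    statements = [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject"],
            "Resource": object_resource,
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": bucket_arn,
        },
    ]
    if kms_key_arns:
        keys = list(kms_key_arns)
        statements.append(
            {
                "Sid": "AllowKMSkey",
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                # One key stays a string; several keys become a list.
                "Resource": keys[0] if len(keys) == 1 else keys,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


def build_trust_policy(tidb_cloud_account_id, external_id):
    """Assumed trust policy: only the given account, presenting the external ID, may assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{tidb_cloud_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(json.dumps(build_s3_access_policy("arn:aws:s3:::tidb-cloud-source-data", "mydata"), indent=4))
    print(json.dumps(build_trust_policy("123456789012", "example-external-id"), indent=4))
```

Generating the policy keeps the object-level `Resource` and the bucket-level `Resource` in sync, which is where hand-edited policies most often go wrong.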

</div>

<div label="Access Key">

It is recommended that you use an IAM user (instead of the AWS account root user) to create an access key.

Take the following steps to configure an access key:

1. Create an IAM user. For more information, see [creating an IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_console).

2. Use your AWS account ID or account alias, and your IAM user name and password to sign in to [the IAM console](https://console.aws.amazon.com/iam).

3. Create an access key. For more information, see [creating an access key for an IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey).

> **Note:**
>
> TiDB Cloud does not store your access keys. It is recommended that you [delete the access key](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey) after the import or export is complete.

</div>
</SimpleTab>

## Configure GCS access

To allow a TiDB Serverless cluster to access your GCS bucket, you need to configure the GCS access for the bucket. You can use a service account key to configure the bucket access.

Take the following steps to configure a service account key:

1. Click **CREATE SERVICE ACCOUNT** to create a service account on the Google Cloud [service account page](https://console.cloud.google.com/iam-admin/serviceaccounts). For more information, see [Creating a service account](https://cloud.google.com/iam/docs/creating-managing-service-accounts).

1. Enter a service account name.
2. Enter a description of the service account (Optional).
3. Click **CREATE AND CONTINUE** to create the service account.
4. In the `Grant this service account access to project` section, choose the [IAM roles](https://cloud.google.com/iam/docs/understanding-roles) with the needed permissions. For example, exporting data from a TiDB Serverless cluster to GCS needs a role with the `storage.objects.create` permission.
5. Click **Continue** to go to the next step.
6. Optional: In the `Grant users access to this service account`, choose members that need to [attach the service account to other resources](https://cloud.google.com/iam/docs/attach-service-accounts).
7. Click **Done** to finish creating the service account.

![service-account](/media/tidb-cloud/serverless-external-storage/gcs-service-account.png)

2. Click the service account and then click **ADD KEY** on the `KEYS` page to create a service account key.

![service-account-key](/media/tidb-cloud/serverless-external-storage/gcs-service-account-key.png)

3. Choose the default `JSON` key type and click the **CREATE** button to download the service account key.
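
Before supplying the downloaded key to TiDB Cloud, it can help to confirm that the file is a service account key and not some other credential. The following Python sketch checks a few fields that Google Cloud service account key files commonly contain. The function name and the sample values are hypothetical, and the field list is an assumption of this sketch, not a requirement stated by TiDB Cloud.

```python
import json

# Fields commonly present in a Google Cloud service account key file (assumed subset).
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_service_account_key(key_json: str) -> str:
    """Return the service account email if the key looks usable; raise ValueError otherwise."""
    try:
        key = json.loads(key_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"key file is not valid JSON: {exc}") from exc
    if not isinstance(key, dict):
        raise ValueError("key file must contain a JSON object")
    missing = [field for field in REQUIRED_FIELDS if not key.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {key['type']!r}")
    return key["client_email"]


if __name__ == "__main__":
    # Hypothetical key content, for illustration only.
    sample = json.dumps(
        {
            "type": "service_account",
            "project_id": "example-project",
            "private_key_id": "0123456789abcdef",
            "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
            "client_email": "tidb-export@example-project.iam.gserviceaccount.com",
        }
    )
    print(check_service_account_key(sample))
```

The check only inspects the file locally. It does not verify that the service account has the roles granted in the earlier step.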

## Configure Azure Blob Storage access

To allow a TiDB Serverless cluster to access your Azure Blob container, you need to configure the Azure Blob access for the container. You can use a service SAS token to configure the container access.

Take the following steps to configure a service SAS token:

1. On the [Azure Storage account](https://portal.azure.com/#browse/Microsoft.Storage%2FStorageAccounts) page, click the storage account that the container belongs to.

2. On your **Storage account** page, click **Security+network**, and then click **Shared access signature**.

![sas-position](/media/tidb-cloud/serverless-external-storage/azure-sas-position.png)

3. On the **Shared access signature** page, create a service SAS token with the needed permissions as follows. For more information, see [Create a service SAS token](https://docs.microsoft.com/en-us/azure/storage/common/storage-sas-overview).

1. In the **Allowed services** section, choose the **Blob** service.
2. In the **Allowed Resource types** section, choose **Container** and **Object**.
3. In the **Allowed permissions** section, choose the permissions as needed. For example, exporting data from a TiDB Serverless cluster to Azure Blob Storage needs the **Read** and **Write** permissions.
4. Adjust the **Start and expiry date/time** as needed.
5. You can keep the default values for other settings.

![sas-create](/media/tidb-cloud/serverless-external-storage/azure-sas-create.png)

4. Click the **Generate SAS and connection string** button to generate the SAS token. You will specify this token when you create an external stage.
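
A generated token can be inspected locally to confirm that it matches the settings chosen above. The following Python sketch parses the token as a query string and reports anything that looks off. It assumes the token carries the `ss` (services), `srt` (resource types), `sp` (permissions), `se` (expiry), and `sig` (signature) fields and that the expiry uses the `YYYY-MM-DDTHH:MM:SSZ` form. The function name and the sample token are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def check_sas_token(token, required_permissions="rw", now=None):
    """Return a list of problems found in the SAS token; an empty list means none were found."""
    now = now or datetime.now(timezone.utc)
    # parse_qs returns a list per field; a SAS token carries each field once.
    fields = {name: values[0] for name, values in parse_qs(token.lstrip("?")).items()}
    problems = []

    if "b" not in fields.get("ss", ""):
        problems.append("Blob service is not allowed (ss lacks 'b')")
    for code, label in (("c", "Container"), ("o", "Object")):
        if code not in fields.get("srt", ""):
            problems.append(f"{label} resource type is not allowed (srt lacks '{code}')")
    for permission in required_permissions:
        if permission not in fields.get("sp", ""):
            problems.append(f"permission '{permission}' is missing from sp")

    expiry = fields.get("se")
    if expiry is None:
        problems.append("expiry time (se) is missing")
    else:
        try:
            expires_at = datetime.strptime(expiry, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        except ValueError:
            problems.append(f"expiry time {expiry!r} is not in the assumed format")
        else:
            if expires_at <= now:
                problems.append("token has expired")

    if "sig" not in fields:
        problems.append("signature (sig) is missing")
    return problems


if __name__ == "__main__":
    # Hypothetical token, for illustration only.
    sample = "?sv=2022-11-02&ss=b&srt=co&sp=rwl&se=2030-01-01T00:00:00Z&spr=https&sig=abc%3D"
    print(check_sas_token(sample))
```

The check does not validate the signature itself. It only catches mismatches between the token fields and the portal settings, such as a token without the **Write** permission being used for an export.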