docs-action committed Dec 3, 2023
1 parent b9e3fd9 commit 4cd1899
Showing 2 changed files with 60 additions and 3 deletions.
2 changes: 1 addition & 1 deletion assets/js/search-data.json
@@ -323,7 +323,7 @@
},"46": {
"doc": "AWS",
"title": "Prepare your S3 bucket",
"content": ". | From the S3 Administration console, choose Create Bucket. | Use the following as your bucket policy, filling in the placeholders: . | Standard Permissions | Minimal Permissions (Advanced) | . { \"Id\": \"lakeFSPolicy\", \"Version\": \"2012-10-17\", \"Statement\": [ { \"Sid\": \"lakeFSObjects\", \"Action\": [ \"s3:GetObject\", \"s3:PutObject\", \"s3:AbortMultipartUpload\", \"s3:ListMultipartUploadParts\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET_NAME_AND_PREFIX]/*\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } }, { \"Sid\": \"lakeFSBucket\", \"Action\": [ \"s3:ListBucket\", \"s3:GetBucketLocation\", \"s3:ListBucketMultipartUploads\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET]\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } } ] } . | Replace [BUCKET_NAME], [ACCOUNT_ID] and [IAM_ROLE] with values relevant to your environment. | [BUCKET_NAME_AND_PREFIX] can be the bucket name. If you want to minimize the bucket policy permissions, use the bucket name together with a prefix (e.g. example-bucket/a/b/c). This way, lakeFS will be able to create repositories only under this specific path (see: Storage Namespace). | lakeFS will try to assume the role [IAM_ROLE]. | . If required lakeFS can operate without accessing the data itself, this permission section is useful if you are using presigned URLs mode or the lakeFS Hadoop FileSystem Spark integration. Since this FileSystem performs many operations directly on the storage, lakeFS requires less permissive permissions, resulting in increased security. lakeFS always requires permissions to access the _lakefs prefix under your storage namespace, in which metadata is stored (learn more). By setting this policy without presign mode you’ll be able to perform only metadata operations through lakeFS, meaning that you’ll not be able to use lakeFS to upload or download objects. Specifically you won’t be able to: . 
| Upload objects using the lakeFS GUI (Works with presign mode) | Upload objects through Spark using the S3 gateway | Run lakectl fs commands (unless using presign mode with --pre-sign flag) | Use Actions and Hooks | . { \"Id\": \"[POLICY_ID]\", \"Version\": \"2012-10-17\", \"Statement\": [ { \"Sid\": \"lakeFSObjects\", \"Action\": [ \"s3:GetObject\", \"s3:PutObject\" ], \"Effect\": \"Allow\", \"Resource\": [ \"arn:aws:s3:::[STORAGE_NAMESPACE]/_lakefs/*\" ], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } }, { \"Sid\": \"lakeFSBucket\", \"Action\": [ \"s3:ListBucket\", \"s3:GetBucketLocation\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET]\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } } ] } . We can use presigned URLs mode without allowing access to the data from the lakeFS server directly. We can achieve this by using condition keys such as aws:referer, aws:SourceVpc and aws:SourceIp. For example, assume the following scenario: . | lakeFS is deployed outside the company (i.e lakeFS cloud or other VPC not vpc-123) | We don’t want lakeFS to be able to access the data, so we use presign URL, we still need lakeFS role to be able to sign the URL. | We want to allow access from the internal company VPC: vpc-123. | . { \"Sid\": \"allowLakeFSRoleFromCompanyOnly\", \"Effect\": \"Allow\", \"Principal\": { \"AWS\": \"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\" }, \"Action\": [ \"s3:GetObject\", \"s3:PutObject\", ], \"Resource\": [ \"arn:aws:s3:::[BUCKET]/*\", ], \"Condition\": { \"StringEquals\": { \"aws:SourceVpc\": \"vpc-123\" } } } . | . Alternative: use an AWS user . lakeFS can authenticate with your AWS account using an AWS user, using an access key and secret. To allow this, change the policy’s Principal accordingly: . \"Principal\": { \"AWS\": [\"arn:aws:iam::<ACCOUNT_ID>:user/<IAM_USER>\"] } . ",
"content": ". | Take note of the bucket name you want to use with lakeFS | Use the following as your bucket policy, filling in the placeholders: . | Standard Permissions | Standard Permissions (with s3express) | Minimal Permissions (Advanced) | . { \"Id\": \"lakeFSPolicy\", \"Version\": \"2012-10-17\", \"Statement\": [ { \"Sid\": \"lakeFSObjects\", \"Action\": [ \"s3:GetObject\", \"s3:PutObject\", \"s3:AbortMultipartUpload\", \"s3:ListMultipartUploadParts\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET_NAME_AND_PREFIX]/*\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } }, { \"Sid\": \"lakeFSBucket\", \"Action\": [ \"s3:ListBucket\", \"s3:GetBucketLocation\", \"s3:ListBucketMultipartUploads\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET]\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } } ] } . | Replace [BUCKET_NAME], [ACCOUNT_ID] and [IAM_ROLE] with values relevant to your environment. | [BUCKET_NAME_AND_PREFIX] can be the bucket name. If you want to minimize the bucket policy permissions, use the bucket name together with a prefix (e.g. example-bucket/a/b/c). This way, lakeFS will be able to create repositories only under this specific path (see: Storage Namespace). | lakeFS will try to assume the role [IAM_ROLE]. | . To use an S3 Express One Zone directory bucket, use the following policy. Note the lakeFSDirectoryBucket statement which is specifically required for using a directory bucket. 
{ \"Id\": \"lakeFSPolicy\", \"Version\": \"2012-10-17\", \"Statement\": [ { \"Sid\": \"lakeFSObjects\", \"Action\": [ \"s3:GetObject\", \"s3:PutObject\", \"s3:AbortMultipartUpload\", \"s3:ListMultipartUploadParts\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET_NAME_AND_PREFIX]/*\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } }, { \"Sid\": \"lakeFSBucket\", \"Action\": [ \"s3:ListBucket\", \"s3:GetBucketLocation\", \"s3:ListBucketMultipartUploads\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET]\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } }, { \"Sid\": \"lakeFSDirectoryBucket\", \"Action\": [ \"s3express:CreateSession\" ], \"Effect\": \"Allow\", \"Resource\": \"arn:aws:s3express:[REGION]:[ACCOUNT_ID]:bucket/[BUCKET_NAME]\" } ] } . | Replace [BUCKET_NAME], [ACCOUNT_ID] and [IAM_ROLE] with values relevant to your environment. | [BUCKET_NAME_AND_PREFIX] can be the bucket name. If you want to minimize the bucket policy permissions, use the bucket name together with a prefix (e.g. example-bucket/a/b/c). This way, lakeFS will be able to create repositories only under this specific path (see: Storage Namespace). | lakeFS will try to assume the role [IAM_ROLE]. | . If required lakeFS can operate without accessing the data itself, this permission section is useful if you are using presigned URLs mode or the lakeFS Hadoop FileSystem Spark integration. Since this FileSystem performs many operations directly on the storage, lakeFS requires less permissive permissions, resulting in increased security. lakeFS always requires permissions to access the _lakefs prefix under your storage namespace, in which metadata is stored (learn more). By setting this policy without presign mode you’ll be able to perform only metadata operations through lakeFS, meaning that you’ll not be able to use lakeFS to upload or download objects. Specifically you won’t be able to: . 
| Upload objects using the lakeFS GUI (Works with presign mode) | Upload objects through Spark using the S3 gateway | Run lakectl fs commands (unless using presign mode with --pre-sign flag) | Use Actions and Hooks | . { \"Id\": \"[POLICY_ID]\", \"Version\": \"2012-10-17\", \"Statement\": [ { \"Sid\": \"lakeFSObjects\", \"Action\": [ \"s3:GetObject\", \"s3:PutObject\" ], \"Effect\": \"Allow\", \"Resource\": [ \"arn:aws:s3:::[STORAGE_NAMESPACE]/_lakefs/*\" ], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } }, { \"Sid\": \"lakeFSBucket\", \"Action\": [ \"s3:ListBucket\", \"s3:GetBucketLocation\" ], \"Effect\": \"Allow\", \"Resource\": [\"arn:aws:s3:::[BUCKET]\"], \"Principal\": { \"AWS\": [\"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\"] } } ] } . We can use presigned URLs mode without allowing access to the data from the lakeFS server directly. We can achieve this by using condition keys such as aws:referer, aws:SourceVpc and aws:SourceIp. For example, assume the following scenario: . | lakeFS is deployed outside the company (i.e lakeFS cloud or other VPC not vpc-123) | We don’t want lakeFS to be able to access the data, so we use presign URL, we still need lakeFS role to be able to sign the URL. | We want to allow access from the internal company VPC: vpc-123. | . { \"Sid\": \"allowLakeFSRoleFromCompanyOnly\", \"Effect\": \"Allow\", \"Principal\": { \"AWS\": \"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]\" }, \"Action\": [ \"s3:GetObject\", \"s3:PutObject\", ], \"Resource\": [ \"arn:aws:s3:::[BUCKET]/*\", ], \"Condition\": { \"StringEquals\": { \"aws:SourceVpc\": \"vpc-123\" } } } . | . Alternative: use an AWS user . lakeFS can authenticate with your AWS account using an AWS user, using an access key and secret. To allow this, change the policy’s Principal accordingly: . \"Principal\": { \"AWS\": [\"arn:aws:iam::<ACCOUNT_ID>:user/<IAM_USER>\"] } . ",
"url": "/howto/deploy/aws.html#prepare-your-s3-bucket",

"relUrl": "/howto/deploy/aws.html#prepare-your-s3-bucket"
61 changes: 59 additions & 2 deletions howto/deploy/aws.html
@@ -744,11 +744,14 @@ <h2 id="prepare-your-s3-bucket">


<ol>
<li>From the S3 Administration console, choose <em>Create Bucket</em>.</li>
<li>Use the following as your bucket policy, filling in the placeholders:
<li>Take note of the bucket name you want to use with lakeFS.</li>
<li>
<p>Use the following as your bucket policy, filling in the placeholders:</p>

<div class="tabs">
<ul>
<li><a href="#bucket-policy-standard">Standard Permissions</a></li>
<li><a href="#bucket-policy-express">Standard Permissions (with s3express)</a></li>
<li><a href="#bucket-policy-minimal">Minimal Permissions (Advanced)</a></li>
</ul>
<div id="bucket-policy-standard">
@@ -786,6 +789,60 @@ <h2 id="prepare-your-s3-bucket">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div> </div>

<ul>
<li>Replace <code class="language-plaintext highlighter-rouge">[BUCKET_NAME]</code>, <code class="language-plaintext highlighter-rouge">[ACCOUNT_ID]</code> and <code class="language-plaintext highlighter-rouge">[IAM_ROLE]</code> with values relevant to your environment.</li>
<li><code class="language-plaintext highlighter-rouge">[BUCKET_NAME_AND_PREFIX]</code> can be the bucket name. If you want to minimize the bucket policy permissions, use the bucket name together with a prefix (e.g. <code class="language-plaintext highlighter-rouge">example-bucket/a/b/c</code>).
This way, lakeFS will be able to create repositories only under this specific path (see: <a href="/understand/model.html#repository">Storage Namespace</a>).</li>
<li>lakeFS will try to assume the role <code class="language-plaintext highlighter-rouge">[IAM_ROLE]</code>.</li>
</ul>
</div>
<div id="bucket-policy-express">

<p>To use an S3 Express One Zone <em>directory bucket</em>, use the following policy. Note the <code class="language-plaintext highlighter-rouge">lakeFSDirectoryBucket</code> statement which is specifically required for using a directory bucket.</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
</span><span class="nl">"Id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"lakeFSPolicy"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2012-10-17"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Statement"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"Sid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"lakeFSObjects"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Action"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s2">"s3:GetObject"</span><span class="p">,</span><span class="w">
</span><span class="s2">"s3:PutObject"</span><span class="p">,</span><span class="w">
</span><span class="s2">"s3:AbortMultipartUpload"</span><span class="p">,</span><span class="w">
</span><span class="s2">"s3:ListMultipartUploadParts"</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="nl">"Effect"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Allow"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Resource"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"arn:aws:s3:::[BUCKET_NAME_AND_PREFIX]/*"</span><span class="p">],</span><span class="w">
</span><span class="nl">"Principal"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"AWS"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]"</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"Sid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"lakeFSBucket"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Action"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s2">"s3:ListBucket"</span><span class="p">,</span><span class="w">
</span><span class="s2">"s3:GetBucketLocation"</span><span class="p">,</span><span class="w">
</span><span class="s2">"s3:ListBucketMultipartUploads"</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="nl">"Effect"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Allow"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Resource"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"arn:aws:s3:::[BUCKET]"</span><span class="p">],</span><span class="w">
</span><span class="nl">"Principal"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"AWS"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"arn:aws:iam::[ACCOUNT_ID]:role/[IAM_ROLE]"</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">},</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"Sid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"lakeFSDirectoryBucket"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Action"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="s2">"s3express:CreateSession"</span><span class="w">
</span><span class="p">],</span><span class="w">
</span><span class="nl">"Effect"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Allow"</span><span class="p">,</span><span class="w">
</span><span class="nl">"Resource"</span><span class="p">:</span><span class="w"> </span><span class="s2">"arn:aws:s3express:[REGION]:[ACCOUNT_ID]:bucket/[BUCKET_NAME]"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div> </div>

<ul>
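The directory-bucket policy added in this hunk can also be generated from its placeholders instead of being edited by hand. The following is a minimal sketch of that idea, not part of the commit: the function name `build_directory_bucket_policy` and all example values (bucket name, account ID, role, region) are hypothetical. The `lakeFSDirectoryBucket` statement is copied as it appears in the diff, which shows no `Principal` element for it; check that against the AWS documentation before applying the result.

```python
import json


def build_directory_bucket_policy(bucket, bucket_and_prefix, account_id, iam_role, region):
    """Fill in the placeholders of the s3express bucket policy shown in the diff.

    Hypothetical helper for illustration only; it mirrors the JSON in the
    commit and does not call AWS.
    """
    # Same principal the diff uses in the first two statements.
    principal = {"AWS": [f"arn:aws:iam::{account_id}:role/{iam_role}"]}
    return {
        "Id": "lakeFSPolicy",
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "lakeFSObjects",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:AbortMultipartUpload",
                    "s3:ListMultipartUploadParts",
                ],
                "Effect": "Allow",
                "Resource": [f"arn:aws:s3:::{bucket_and_prefix}/*"],
                "Principal": principal,
            },
            {
                "Sid": "lakeFSBucket",
                "Action": [
                    "s3:ListBucket",
                    "s3:GetBucketLocation",
                    "s3:ListBucketMultipartUploads",
                ],
                "Effect": "Allow",
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Principal": principal,
            },
            {
                # Reproduced from the diff as-is (no Principal element there).
                "Sid": "lakeFSDirectoryBucket",
                "Action": ["s3express:CreateSession"],
                "Effect": "Allow",
                "Resource": f"arn:aws:s3express:{region}:{account_id}:bucket/{bucket}",
            },
        ],
    }


if __name__ == "__main__":
    # All values below are made-up examples.
    policy = build_directory_bucket_policy(
        bucket="example-bucket",
        bucket_and_prefix="example-bucket/a/b/c",
        account_id="123456789012",
        iam_role="lakefs-role",
        region="us-east-1",
    )
    print(json.dumps(policy, indent=2))
```

Generating the document this way keeps `[BUCKET]`, `[BUCKET_NAME]` and `[BUCKET_NAME_AND_PREFIX]` as separate inputs, which makes it harder to substitute one for another by accident.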
