Add delimiter support for S3 List API #2996

Open
wants to merge 1 commit into
base: master

Conversation

Arun-LinkedIn
Contributor

This adds support for treating "/" in blob names as a delimiter. Prefixes can then be treated as "subdirectories" and grouped under "CommonPrefixes" in the List API response. It also lets us list and delete directories with the AWS S3 CLI.

More details can be found in https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html under the "CommonPrefixes" section.

For example, suppose we have the files below (names use / as the delimiter to represent directories):

Ambry account: named-blob-sandbox
Ambry container: container-a
1. myfile.txt
2. folder/file1.txt
3. folder/file2.txt

1. The cURL response for the List API is:

curl -X GET 'http://localhost:1174/s3/named-blob-sandbox/container-a?prefix=folder&delimiter=/' | xmllint --format -
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   409  100   409    0     0   3976      0 --:--:-- --:--:-- --:--:--  3970
<?xml version="1.0"?>
<ListBucketResult>
  <Name>container-a</Name>
  <Prefix>folder</Prefix>
  <MaxKeys>1000</MaxKeys>
  <KeyCount>2</KeyCount>
  <Delimiter>/</Delimiter>
  <Contents>
    <Key>folder/file1.txt</Key>
    <LastModified>2025-01-27T09:37:23Z</LastModified>
    <Size>15</Size>
  </Contents>
  <Contents>
    <Key>folder/file2.txt</Key>
    <LastModified>2025-01-27T09:37:23Z</LastModified>
    <Size>15</Size>
  </Contents>
  <IsTruncated>false</IsTruncated>
</ListBucketResult>

2. The AWS S3 CLI output for listing and removing directories is:

>aws s3 ls s3://container-a/ --recursive
2025-01-26 12:39:31         15 folder/file1.txt
2025-01-26 12:39:31         15 folder/file2.txt
2025-01-26 12:37:51         15 myfile.txt

>aws s3 ls s3://container-a/
                               PRE folder/
2025-01-26 12:37:51    15      myfile.txt

>aws s3 rm s3://container-a/folder/ --recursive
delete: s3://container-a/folder/file1.txt
delete: s3://container-a/folder/file2.txt

@codecov-commenter

codecov-commenter commented Jan 27, 2025

Codecov Report

Attention: Patch coverage is 0% with 66 lines in your changes missing coverage. Please review.

Project coverage is 39.62%. Comparing base (52ba813) to head (399cb87).
Report is 166 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...com/github/ambry/frontend/s3/S3MessagePayload.java | 0.00% | 16 Missing ⚠️ |
| .../java/com/github/ambry/named/MySqlNamedBlobDb.java | 0.00% | 16 Missing ⚠️ |
| ...va/com/github/ambry/frontend/s3/S3ListHandler.java | 0.00% | 15 Missing ⚠️ |
| .../com/github/ambry/frontend/NamedBlobListEntry.java | 0.00% | 8 Missing ⚠️ |
| ...n/java/com/github/ambry/named/NamedBlobRecord.java | 0.00% | 4 Missing ⚠️ |
| ...b/ambry/tools/perf/NamedBlobMysqlDatabasePerf.java | 0.00% | 4 Missing ⚠️ |
| ...om/github/ambry/frontend/NamedBlobListHandler.java | 0.00% | 3 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (52ba813) and HEAD (399cb87). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (52ba813) HEAD (399cb87)
3 2
Additional details and impacted files
@@              Coverage Diff              @@
##             master    #2996       +/-   ##
=============================================
- Coverage     64.24%   39.62%   -24.62%     
+ Complexity    10398     6281     -4117     
=============================================
  Files           840      884       +44     
  Lines         71755    74829     +3074     
  Branches       8611     8976      +365     
=============================================
- Hits          46099    29651    -16448     
- Misses        23004    42850    +19846     
+ Partials       2652     2328      -324     


@@ -706,10 +709,27 @@ private Page<NamedBlobRecord> run_list_v2(String accountName, String containerNa
int resultIndex = 0;
while (resultSet.next()) {
String blobName = resultSet.getString(1);
if (resultIndex++ == maxKeysValue) {
if (resultIndex == maxKeysValue) {
Collaborator

We only set nextContinuationToken to the blobName when resultIndex == maxKeysValue. The idea is to do that only for the last key, but this change alters how resultIndex is incremented, so resultIndex will no longer equal maxKeysValue at the last key. We probably need another index, or some other way, to set the next continuation token here.
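One possible way to address this, shown as a minimal sketch rather than the actual fix: count the entries emitted so far (objects plus distinct common prefixes) instead of the rows scanned, and set the continuation token to the first key that would produce an entry beyond the limit. The class `ListPageSketch`, its `Page` type, and the `listPage` method are hypothetical names for illustration and are not part of Ambry; the sketch also assumes the keys arrive sorted, as they would from an ordered query.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Ambry.
final class ListPageSketch {

  /** One page of results plus the token to resume from (null if the listing is complete). */
  static final class Page {
    final List<String> entries;
    final String nextContinuationToken;

    Page(List<String> entries, String nextContinuationToken) {
      this.entries = entries;
      this.nextContinuationToken = nextContinuationToken;
    }
  }

  static Page listPage(List<String> sortedKeys, String prefix, String delimiter, int maxKeys) {
    String p = prefix == null ? "" : prefix;
    Set<String> emitted = new LinkedHashSet<>();
    String nextToken = null;
    for (String key : sortedKeys) {
      if (!key.startsWith(p)) {
        continue;
      }
      // Keys containing the delimiter after the prefix collapse into one common prefix.
      int idx = key.indexOf(delimiter, p.length());
      String entry = idx == -1 ? key : key.substring(0, idx + delimiter.length());
      if (emitted.contains(entry)) {
        continue; // Same common prefix as an earlier row: no new entry, so no count change.
      }
      if (emitted.size() == maxKeys) {
        nextToken = key; // First key that would create entry number maxKeys + 1.
        break;
      }
      emitted.add(entry);
    }
    return new Page(new ArrayList<>(emitted), nextToken);
  }
}
```

Because the limit check uses the number of emitted entries, rows that fold into an already-seen common prefix never trigger or suppress the token.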

// Extract the portion after the prefix and before the next '/'
String remainingPath = blobName.substring(blobNamePrefix == null ? 0 : blobNamePrefix.length());
remainingPath = remainingPath.startsWith("/") ? remainingPath.substring(1) : remainingPath;
int delimiterIndex = remainingPath.indexOf("/");
Collaborator

let's use a constant for "/".

if (groupDirectories) {
// Extract the portion after the prefix and before the next '/'
String remainingPath = blobName.substring(blobNamePrefix == null ? 0 : blobNamePrefix.length());
remainingPath = remainingPath.startsWith("/") ? remainingPath.substring(1) : remainingPath;
Collaborator

This is not right. If the prefix is "abc", the blob name is "abc/efg/hij", and the delimiter is "/", then we should return "abc/" as the common prefix, not "abc/efg/".

if (groupDirectories) {
// Add the directories to the result
entries.addAll(directories.stream()
.map(directory -> new NamedBlobRecord(accountName, containerName, directory, null, Utils.Infinite_Time, 0,
Collaborator

This is not right. When we return a common prefix, it should include the prefix as well. If we have "abc/def/ghi" and the prefix is "abc/", the common prefix should be "abc/def/", not just "def/".
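The two examples from these review comments can be checked with a small sketch of the S3-style rule: search for the delimiter starting right after the prefix, without stripping a leading "/", and return the key up to and including that delimiter. The class `CommonPrefixSketch` and the method `commonPrefixOf` are hypothetical names used only for illustration; this is not the code in the PR.

```java
// Hypothetical sketch; these names do not exist in Ambry.
final class CommonPrefixSketch {
  static final String DELIMITER = "/";

  /**
   * Returns the common prefix a key rolls up into, or null when the key
   * has no delimiter after the prefix and should be listed as an object.
   */
  static String commonPrefixOf(String key, String prefix, String delimiter) {
    String p = prefix == null ? "" : prefix;
    if (!key.startsWith(p)) {
      return null; // Key is outside the requested prefix.
    }
    // Look for the first delimiter at or after the end of the prefix.
    int idx = key.indexOf(delimiter, p.length());
    // Keep the request prefix in the result and include the delimiter itself.
    return idx == -1 ? null : key.substring(0, idx + delimiter.length());
  }

  public static void main(String[] args) {
    System.out.println(commonPrefixOf("abc/efg/hij", "abc", DELIMITER));  // abc/
    System.out.println(commonPrefixOf("abc/def/ghi", "abc/", DELIMITER)); // abc/def/
    System.out.println(commonPrefixOf("myfile.txt", null, DELIMITER));    // null
  }
}
```

Searching from the end of the prefix is what makes "abc" yield "abc/" while "abc/" yields "abc/def/", and taking the substring from index 0 keeps the request prefix in the returned value.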

remainingPath = remainingPath.startsWith("/") ? remainingPath.substring(1) : remainingPath;
int delimiterIndex = remainingPath.indexOf("/");
if (delimiterIndex != -1) {
boolean validEntry = directories.add(remainingPath.substring(0, delimiterIndex) + "/");
Collaborator

Nit: it could be remainingPath.substring(0, delimiterIndex + DELIMITER_STRING.length());

3 participants