
Optimize part size for checksummed read #315

Merged
merged 2 commits into awslabs:main on Jun 29, 2023

Conversation

monthonk
Contributor

The prefetcher stores data received from each input stream as a part in the part queue structure. Usually the part size is quite big (8 MB or more), and checksum validation always has to be done against an entire part even if we only read a small portion of that part.

This makes checksummed reads much slower than non-checksummed reads. We could make them more efficient by making the parts smaller or, ideally, aligning the part size to the read size so that we don't have to compute checksums over unnecessary bytes.
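
For illustration, here is a minimal sketch of the problem (the types, checksum function, and sizes are hypothetical stand-ins, not the actual prefetcher code): serving even a tiny read from a part requires validating the checksum of the whole part, so the validation cost scales with the part size rather than the read size.

// Toy sketch only: hypothetical Part type with a whole-part checksum.
struct Part {
    data: Vec<u8>,
    checksum: u32,
}

// Placeholder checksum standing in for a real one such as CRC32C.
fn checksum(data: &[u8]) -> u32 {
    data.iter().fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u32))
}

impl Part {
    // Any read, however small, re-validates the entire part first,
    // so the cost is O(part size), not O(read size).
    fn read(&self, offset: usize, len: usize) -> Option<&[u8]> {
        if checksum(&self.data) != self.checksum {
            return None; // integrity failure
        }
        self.data.get(offset..offset + len)
    }
}

fn main() {
    let data = vec![7u8; 8 * 1024 * 1024]; // an 8 MiB part
    let part = Part { checksum: checksum(&data), data };
    let _four_kib = part.read(0, 4 * 1024); // still checksums all 8 MiB
}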


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).


Signed-off-by: Monthon Klongklaew <[email protected]>
@monthonk added the performance label (PRs to run benchmarks on) on Jun 26, 2023
Comment on lines 167 to 168
let min_part_size = 128 * 1024;
self.preferred_part_size = length.max(min_part_size);
Member

I think if you initialize preferred_part_size to 128KiB, this can just be:

Suggested change:
- let min_part_size = 128 * 1024;
- self.preferred_part_size = length.max(min_part_size);
+ self.preferred_part_size = self.preferred_part_size.max(length);

Also worth a comment on why we chose 128KiB.

Member

Also, should we set a maximum here, like 1MiB?

Contributor Author

For choosing 128k: it's the Linux readahead size and seems to be a reasonable minimum value. I will update the comment; also happy to change it if you have a better suggestion.

I'm not sure we should put a maximum value on it though. If the read size is really big, then we will have to combine data from multiple parts, and the extend operation is quite expensive.
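
To illustrate the cost being described (a toy sketch with a hypothetical helper, not the actual code): when a large read spans several small parts, the bytes have to be copied into one contiguous buffer, so every extra part boundary adds another copy.

// Toy sketch: combining a read that spans several parts into one buffer.
fn combine_parts(parts: &[Vec<u8>]) -> Vec<u8> {
    let total: usize = parts.iter().map(|p| p.len()).sum();
    let mut combined = Vec::with_capacity(total);
    for part in parts {
        // Each extend copies the whole part into the result buffer.
        combined.extend_from_slice(part);
    }
    combined
}

fn main() {
    // A 4 MiB read served from 32 parts of 128 KiB each needs 32 copies.
    let parts: Vec<Vec<u8>> = (0..32).map(|_| vec![0u8; 128 * 1024]).collect();
    assert_eq!(combine_parts(&parts).len(), 4 * 1024 * 1024);
}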

Contributor Author

But if we change the logic to use the max of the last preferred_part_size and the current length, I think there should be a maximum; otherwise it will keep growing bigger.

Member

The effective maximum is the client's part size anyway, so probably we should enforce it here just to be explicit.
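
Putting the thread together, a minimal sketch of the sizing rule being discussed (field names and constants here are assumptions, not the merged change): grow the preferred part size with the observed read length, but keep it between the 128 KiB floor and the client's part size.

// Sketch only: 128 KiB floor (Linux readahead size, per the discussion above)
// and the client's part size as the effective ceiling.
const MIN_PART_SIZE: usize = 128 * 1024;

struct Prefetcher {
    preferred_part_size: usize,
    client_part_size: usize,
}

impl Prefetcher {
    fn update_preferred_part_size(&mut self, read_length: usize) {
        self.preferred_part_size = self
            .preferred_part_size
            .max(read_length)
            .clamp(MIN_PART_SIZE, self.client_part_size);
    }
}

fn main() {
    let mut p = Prefetcher { preferred_part_size: MIN_PART_SIZE, client_part_size: 8 * 1024 * 1024 };
    p.update_preferred_part_size(1024 * 1024);      // grows to 1 MiB
    p.update_preferred_part_size(16 * 1024 * 1024); // capped at the client part size
    assert_eq!(p.preferred_part_size, 8 * 1024 * 1024);
}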

Signed-off-by: Monthon Klongklaew <[email protected]>
@jamesbornholt jamesbornholt merged commit 971b757 into awslabs:main Jun 29, 2023
16 of 17 checks passed
Labels
performance (PRs to run benchmarks on)