[WIP] bundle: Parallel download and decompression #4504

Draft
wants to merge 1 commit into
base: main

Conversation

vyasgun
Contributor

@vyasgun vyasgun commented Dec 6, 2024

Description

This pull request does the following:

  • Return a reader from the bundle Download function.
  • Use the reader to stream the bytes to Extract function.

This commit replaces the grab client with the net/http client to ensure that the bytes are streamed to the Extract func in the correct order. Currently, only zst decompression is used in the UncompressWithReader function, as it is the primary compression algorithm used in crc.

The download progress bar has been removed temporarily and will be added back as part of refactoring the code.

Fixes: #4336

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Proposed changes

  • Return a reader from the bundle Download function.
  • Use the reader to stream the bytes to Extract function.

Testing

Contribution Checklist

  • I have read the contributing guidelines
  • My code follows the style guidelines of this project
  • I Keep It Small and Simple: The smaller the PR is, the easier it is to review and have it merged
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I tested my code on specified platforms
    • Linux
    • Windows
    • MacOS


openshift-ci bot commented Dec 6, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@vyasgun
Contributor Author

vyasgun commented Dec 6, 2024

/test all


openshift-ci bot commented Dec 6, 2024

@vyasgun: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-crc | 6cf717b | link | true | /test e2e-crc |
| ci/prow/images | 6cf717b | link | true | /test images |
| ci/prow/security | 6cf717b | link | false | /test security |
| ci/prow/integration-crc | 6cf717b | link | true | /test integration-crc |
| ci/prow/e2e-microshift-crc | 6cf717b | link | true | /test e2e-microshift-crc |

@vyasgun vyasgun force-pushed the pr/parallel-decompress branch 9 times, most recently from bb0b17c to 3a62d1a Compare December 7, 2024 17:34
Contributor

@redbeam redbeam left a comment

Overall, this is great work and it's functional.

My findings:

  • cancelling through web api (socket) works

  • better logging would be nice (currently it skips the download part) so that it's clear that the download and decompression are being done simultaneously

  • progress bar could show more info about both processes

  • resuming interrupted download doesn't work - everything starts from the beginning

  • golangci-lint issues

Suggestions:

  • add (cli/config) option to disable this functionality (revert back to old behavior)

pkg/download/download.go (outdated, resolved)
pkg/crc/machine/bundle/metadata.go (outdated, resolved)
pkg/crc/image/image.go (outdated, resolved)
pkg/crc/machine/bundle/repository.go (resolved)
@@ -124,6 +125,36 @@ func (bundle *CrcBundleInfo) createSymlinkOrCopyPodmanRemote(binDir string) erro
 	return bundle.copyExecutableFromBundle(binDir, PodmanExecutable, constants.PodmanRemoteExecutableName)
 }
 
+func (repo *Repository) ExtractWithReader(ctx context.Context, reader io.Reader, path string) error {
Contributor

This function and Extract are very similar. Could they be merged in some way?

Contributor

Extract could probably use os.Open and call into ExtractWithReader

pkg/extract/extract.go (outdated, resolved)
pkg/extract/extract.go (outdated, resolved)

@@ -163,7 +163,7 @@ func downloadDataFiles(goos string, components []string, destDir string) ([]stri
 		if !shouldDownload(components, componentName) {
 			continue
 		}
-		filename, err := download.Download(context.TODO(), dl.url, destDir, dl.permissions, nil)
+		_, filename, err := download.Download(context.TODO(), dl.url, destDir, dl.permissions, nil)
Contributor

I think I'd add a download.DownloadFile(…) (string, error) to make it clear when we don't need the reader.

pkg/crc/image/image.go (outdated, resolved)
pkg/crc/machine/bundle/metadata.go (outdated, resolved)
pkg/crc/machine/bundle/repository.go (resolved)
 	logging.Infof("Extracting bundle: %s...", bundleName)
-	if _, err := bundle.Extract(ctx, bundlePath); err != nil {
+	if _, err := bundle.Extract(ctx, reader, bundlePath); err != nil {
Contributor

Having both a bundlePath and a reader feels a bit redundant. Ideally we could pass one or the other, but I'm not sure that is easy at the moment.

This commit does the following:
- Return a reader from the bundle Download function.
- Use the reader to stream the bytes to Extract function.

This commit replaces the grab client with the net/http client to ensure
that the bytes are streamed to the Extract func in the correct order.
Currently, only zst decompression is used in the
UncompressWithReader function, as it is the primary compression algorithm
used in crc.
@vyasgun vyasgun force-pushed the pr/parallel-decompress branch from 3a62d1a to bb33dee Compare January 10, 2025 05:38

openshift-ci bot commented Jan 10, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign adrianriobo for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Development

Successfully merging this pull request may close these issues.

Parallel bundle download & decompression
3 participants