Skip to content
This repository has been archived by the owner on Aug 12, 2024. It is now read-only.

update image source unpacking to use a direct image registry client #874

Merged

Conversation

everettraven
Copy link
Contributor

Description:

  • This updates rukpak's image source unpack process to be similar to catalogd's by using a direct image registry client for pulling and unpacking bundle contents. Most of the logic was copied and adapted from catalogd's implementation and as such includes:
    • bundle unpack content caching (for identifying changes)
    • unpack content cache period garbage collection (in the event any BundleDeployment deletes were missed for cleaning up the content cache)

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 2, 2024
Copy link

codecov bot commented May 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 37.27%. Comparing base (d747ca9) to head (f33ba2b).
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #874   +/-   ##
=======================================
  Coverage   37.27%   37.27%           
=======================================
  Files           9        9           
  Lines         845      845           
=======================================
  Hits          315      315           
  Misses        486      486           
  Partials       44       44           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 3, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 6, 2024
@everettraven everettraven marked this pull request as ready for review May 6, 2024 19:45
@everettraven everettraven requested a review from a team as a code owner May 6, 2024 19:45
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 6, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 7, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 7, 2024
internal/garbagecollection/garbagecollector.go Outdated Show resolved Hide resolved
@@ -73,3 +75,5 @@ spec:
volumes:
- name: bundle-cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing the benefit of having bundle-cache and unpack-cache here, that contain the same bundle contents. Initially the use of localdir store was to copy the unpacked contents from pod log (for image source at least) and store locally.

Disregarding other sources (which we are not looking to support) - the registry client can directly unpack contents at a single location that we look at for generating charts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree, but my intention behind this PR is to get the functionality in place and we can optimize in the future. catalogd is going to need the same updates in the future, so ideally we can spend some time updating this implementation to be something that can be used in both catalogd and operator-controller in the future.

I'm not against making the changes in this PR, but the scope of this PR is already quite large with the necessary e2e changes and bringing it to parity with catalogd is a good starting point IMO. We can always iterate and improve in the future.

internal/garbagecollection/garbagecollector.go Outdated Show resolved Hide resolved
@@ -0,0 +1,75 @@
#! /bin/bash
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having these separate files sharing a lot of common scripting, can we instead just have one script with arguments and have each sub-folder contain the needed yaml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally had it as a single script that did this, but since some of the testdata has specific nuances for the tests there was an increase in the single script complexity that I felt would only continue to get more complex when/if new testdata is added. Some examples:

  • Two different bundle formats, registry+v1 and plain+v0, have different directory requirements. This means that a common script must account for this difference and as such introduces more complex logic to ensure there are no flakes when building each bundle.
  • One of the plain+v0 testdata bundles requires inclusion of subdirectories. this requires more complex configmap creation and volume mounting logic to ensure the subdirectory is copied over to the container running the build + push operations.
  • Ability to push to a registry that requires authentication requires additional volume mounts and configuration of the container to successfully authenticate with the registry.

I felt that reducing the script complexity and increasing "repetition" is likely going to be more maintainable in the long run since additional test cases can get as custom as they need to and provide all the logic to ensure the image is appropriately built and pushed to the on cluster registry.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to reduce down to a single script and make the specific nuances encapsulated with some other mechanism? For example, it seems like the nuances are in the shape of the bundles. Can we tar.gz the root directories and then have the script deal in tar.gz files generically. Configmap data is tar.gz, and the build/push pod is changed to untar prior to running the build?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure we could do something to make it one or two scripts (likely 2 for the pushing to a registry requiring auth), but what do we gain from that vs the current approach?

I understand the desire to reduce the repetition of the scripting but I also see value in having the ability to have each "bundle" specify how it should be built rather than trying to optimize for a single common script

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed via Slack I created #885 to track this as a follow up to this PR

Copy link
Member

@joelanford joelanford May 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but what do we gain from that vs the current approach?

Simplicity

In rukpak, a bundle is an fs.FS. So given that we're going for consistency and simplicity, it seems reasonable for us to have a Dockerfile like this that we can re-use for any bundle.

FROM scratch
ARG destBundleRoot=/
ADD bundle.tar.gz $destBundleRoot

And then the configmap would contain this Dockerfile entry and a bundle.tar.gz entry, which we could mount into a pod to do a generic build.

go.mod Show resolved Hide resolved
Copy link
Member

@varshaprasad96 varshaprasad96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, given we are going to discuss #874 (comment) and #874 (comment) in follow ups!

Thanks for working on this, @everettraven!
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 13, 2024
@everettraven everettraven added this pull request to the merge queue May 13, 2024
Merged via the queue into operator-framework:main with commit bb7cd98 May 13, 2024
9 of 10 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants