Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Security review: recover corrupt caches #1382

Open
natalieparellano opened this issue Apr 23, 2024 · 7 comments · Fixed by buildpacks/lifecycle-private#16
Open

Security review: recover corrupt caches #1382

natalieparellano opened this issue Apr 23, 2024 · 7 comments · Fixed by buildpacks/lifecycle-private#16
Assignees
Labels
status/ready type/bug Something isn't working

Comments

@natalieparellano
Copy link
Member

natalieparellano commented Apr 23, 2024

Summary

In the security review, this is LOW-2: Denial-of-Service (DoS) provoked by removing build cache tarballs or altering the OCI image manifest. The action plan asks us to ensure that

If a tarball is missing, a solution should be found by either rebuilding the corresponding tarball or wiping out the cache in order to continue the containerization process without errors, or, a second execution should be possible without errors

Further context from the initial report:

It appears that if a tarball referenced in the io.buildpacks.lifecycle.cache.metadata file is absent on the container filesystem (mounted host volume), the application containerization process quit without wiping out the cache content.


Proposal

  • See here where the restorer will fail if the cache does not contain a layer with the expected diffID:
    rc, err := cache.RetrieveLayer(sha)
  • See here where the exporter blows up if a cached layer has no contents:
    return fmt.Errorf("layer '%s' is cache=true but has no contents", fsLayer.Identifier())
    (we should keep this one?)
  • When the first error is encountered, we should wipe the cache.

Alternatively, we considered updating the cache metadata to exclude the layer/tarball that is missing. But, we are not sure if this scenario is common enough to warrant such a surgical approach.


Related

RFC #___


Context

@joeybrown-sf
Copy link
Contributor

Here's a demo of the broken behavior.

demo-failure

@joeybrown-sf
Copy link
Contributor

And here's a demo of the expected behavior.

expected-demo

@joeybrown-sf
Copy link
Contributor

I've got things working on my branch corrupted-cache. But I still need to add at least one test for the Exporter change.

@natalieparellano I went a different direction that we originally discussed. Instead of wiping out the cache, I am instead just ignoring the missing files. What do you think? Happy to do whatever.

@joeybrown-sf
Copy link
Contributor

cc @jabrown85

@natalieparellano
Copy link
Member Author

Instead of wiping out the cache, I am instead just ignoring the missing files.

Sounds good to me :) should we land this one?

@jabrown85
Copy link
Contributor

I think @joeybrown-sf was hoping to fix up a test or something? What remains here @joeybrown-sf ?

@natalieparellano
Copy link
Member Author

natalieparellano commented Jul 17, 2024

Circling back, see discussion on buildpacks/lifecycle-private#16, our fix is working for volume caches, but for image caches we are NOT getting "layer not found" errors where we expect them (and hence are failing and bubbling up the error instead of skipping over the layer). This requires further investigation in imgutil. We added FIXMEs in the code so that it is apparent that the image cache flow requires further work. We could leave this issue open or create another issue that is specific to image caches.

@natalieparellano natalieparellano transferred this issue from buildpacks/lifecycle-private Jul 17, 2024
@natalieparellano natalieparellano added type/bug Something isn't working status/ready labels Aug 13, 2024
@natalieparellano natalieparellano added this to the lifecycle 0.21.0 milestone Aug 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/ready type/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants