Don't require immutable reference in build script at SLSA 4 #71

Open
TomHennen opened this issue Jun 22, 2021 · 20 comments
Labels
slsa 4 Applies to a SLSA 4 requirement spec-change Modification to the spec (requirements, schema, etc.)

Comments

@TomHennen
Contributor

Currently the Build Requirements say

> All transitive build steps, sources, and dependencies were fully declared up front with immutable references

and

> The user-defined build script:
>
> MUST declare all dependencies, including sources and other build steps, using immutable references in a format that the build service understands.

Is it actually necessary that the build script specify the immutable references? That would require users to manually update each build script for each dependency in order to get the latest version. Would it instead be reasonable to simply require that the builder resolve the dependency into an immutable reference and include that in the provenance?

If we actually want users to specify the exact reference they want pulled, could that be a separate ('Version Pinning'?) requirement? That might add some clarity.
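A minimal sketch of the alternative proposed above, in which the build script names a mutable reference and the builder resolves it and records the immutable result. Everything here is hypothetical: `ResolvedDependency`, `resolve`, and the in-memory `index` stand in for a real package index and are not part of any SLSA tooling.

```python
import hashlib
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class ResolvedDependency:
    declared: str  # what the build script asked for (may be mutable)
    resolved: str  # the exact version the builder picked
    sha256: str    # digest of the bytes the builder actually fetched


def resolve(declared: str, index: Mapping[str, Tuple[str, bytes]]) -> ResolvedDependency:
    """Resolve a possibly mutable reference to an immutable record.

    `index` is a hypothetical stand-in for a package index: it maps a
    declared reference to (exact version, content).
    """
    resolved, content = index[declared]
    digest = hashlib.sha256(content).hexdigest()
    return ResolvedDependency(declared=declared, resolved=resolved, sha256=digest)
```

Keeping both `declared` and `resolved` lets a verifier see what the user asked for as well as what was actually built, without requiring the user to edit the build script for every update.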

@dlorenc

dlorenc commented Jun 22, 2021

+1 to changing the requirement to require immutable references somewhere; a builder resolving them can be even better in some cases.

@MarkLodato
Member

Renaming "Hermetic" to "Version Pinned" SGTM, since that's a more descriptive name (see also #60). I can also clarify that the version pins don't need to be in the build script per se but rather somewhere in the source or one of its dependencies, transitively. (That is, the whole dep tree is pinned by immutable refs.)

Note that we have two closely related requirements: Hermetic (Version Pinned) and Dependencies Complete. Both require the build service to do the fetching (or otherwise intercept network connections). Otherwise, how could the build service guarantee that those properties are true?

I can't think of a way to require a lighter version of this at a lower level. For example, it would be nice to require pinning without requiring the "builder must fetch all deps" piece. But how could we verify that it's true?

One possibility is to defer Version Pinned to a future SLSA 5 and require Dependencies Complete at SLSA 4. This would allow mutable refs, but still require interception / sandboxing / no network access. Not sure if this really helps though.

@dlorenc

dlorenc commented Jun 22, 2021

> Renaming "Hermetic" to "Version Pinned"

I think the two are orthogonal and both good. You can version pin without being fully hermetic (go.mod does this well). Hermeticity is still a good practice on top of this though, to protect against different threats like a build tool compromise.

I think my gradient would look like:

  • nothing at all
  • dependencies declared
  • dependencies fully pinned (strategies vary based on ecosystem)
  • fully hermetic, guaranteed dependencies complete

@TomHennen
Contributor Author

Given this Veracode report [thanks @inferno-chromium], is it actually a given that version pinning is always good? It seems like in some of those cases if there weren't version pinning the products might be in a better state from a vulnerability standpoint.

Maybe version pinning is only good if you have some automated system to bump the pinned version?

@dlorenc

dlorenc commented Jun 22, 2021

> Maybe version pinning is only good if you have some automated system to bump the pinned version?

This is definitely a matter of opinion with no clear correct answer. But I still come down on the side of version pinning for reproducibility (not necessarily byte for byte, but the same deps going in each time) and usable metadata. Flexible versions are fine for libraries with no build targets, but any actual artifacts should be built from pinned versions.

@inferno-chromium
Contributor

> Maybe version pinning is only good if you have some automated system to bump the pinned version?
>
> This is definitely a matter of opinion with no clear correct answer. But I still come down on the side of version pinning for reproducibility (not necessarily byte for byte, but the same deps going in each time) and usable metadata. Flexible versions are fine for libraries with no build targets, but any actual artifacts should be built from pinned versions.

Yes, the ideal scenario is version pinning combined with automated dependency update tools like dependabot and renovatebot. There are some great examples in the Google Cloud APIs org (e.g. https://github.com/googleapis/googleapis), which uses this successfully across all of its repos.

@MarkLodato
Member

> I think my gradient would look like:
>
>   • nothing at all
>   • dependencies declared
>   • dependencies fully pinned (strategies vary based on ecosystem)
>   • fully hermetic, guaranteed dependencies complete

But how can we verify the two intermediate states? What would we put in the provenance, and where would that information come from? Without that, it's not possible to verify that a given artifact meets a given level.

@TomHennen
Contributor Author

In the two intermediate states the builder would still fetch the dependencies for the build, so it can record the hashes.

@dlorenc

dlorenc commented Jun 22, 2021

Build steps can declare where their dependencies (and build steps) are defined (in source). The build system could translate this into the provenance. If the build system does any fetching (typically for build steps) it could also include those digests.

I think this information would belong in materials.
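As a rough sketch, a builder that does the fetching could turn what it fetched into `materials` entries like this. The entry shape (`uri` plus a `digest` map keyed by algorithm) follows the in-toto/SLSA provenance format as I understand it; the function name `to_materials` and the in-memory `fetched` map are assumptions made for illustration.

```python
import hashlib
from typing import Dict, List, Mapping


def to_materials(fetched: Mapping[str, bytes]) -> List[Dict[str, object]]:
    """Build provenance `materials` entries from what the builder fetched.

    `fetched` maps the URI the builder downloaded from to the bytes it got.
    Entries are sorted by URI so the same inputs always produce the same list.
    """
    return [
        {"uri": uri, "digest": {"sha256": hashlib.sha256(content).hexdigest()}}
        for uri, content in sorted(fetched.items())
    ]
```

Note that this only covers what the build system itself fetched; anything a build step downloads on its own would be invisible to it.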

@MarkLodato
Member

Yes, that can happen at any level. But what is the requirement for those intermediate states?

@dlorenc

dlorenc commented Jun 22, 2021

I would actually separate build steps from the other dependencies; I think that's clearer and also helps.

For build steps:

At L2 or 3 (unsure yet): Provenance contains immutable references to all build steps, from the build system
At L4, leave hermetic and dependencies complete.

I'm less sure about the wording for dependencies, but here's a rough take:

At L2 or 3 (again, unsure yet): Dependencies may be fetched over the network, but the provenance contains immutable references to all the dependencies, captured by the build system.
At L4, leave hermetic and dependencies complete.

@MarkLodato
Member

How would the build system verify this and put it in the provenance?

Here's a concrete example: what if the cloudbuild.yaml (or whatever) had a single step, docker build, and that Dockerfile then fetched dependencies using curl? How would the build system detect this? Or if this is out of scope, what wording would we use to explain that?

@dlorenc

dlorenc commented Jun 23, 2021

Are you referring to the build step case? Or the dependency one?

The build step would be something like "gcr.io/cloud-builders/docker", and the arguments would be "build". The docker build step doesn't necessarily need to be pinned by digest - the build system can instead resolve that to a digest and include it in the final provenance.
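To make the "resolve that to a digest" idea concrete, here is a small sketch of rewriting a tag-based image reference into a digest-pinned one before it goes into the provenance. It assumes the build system has already looked up the digest (the registry query is not shown), and the helper names `is_pinned` and `pin` are hypothetical.

```python
import re

# A sha256 image digest: the algorithm prefix followed by 64 lowercase hex chars.
_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")


def is_pinned(ref: str) -> bool:
    """True if the image reference already ends in @sha256:<digest>."""
    _, sep, digest = ref.partition("@")
    return bool(sep) and _DIGEST.fullmatch(digest) is not None


def pin(ref: str, digest: str) -> str:
    """Rewrite `ref` to name the same repository by digest instead of by tag."""
    if _DIGEST.fullmatch(digest) is None:
        raise ValueError(f"not a sha256 digest: {digest!r}")
    name = ref.partition("@")[0]               # drop any existing digest
    head, slash, last = name.rpartition("/")   # a tag can only follow the last "/"
    repo = last.partition(":")[0]              # drop the tag, keep registry ports
    return f"{head}{slash}{repo}@{digest}"
```

With this, the user's config can keep saying "gcr.io/cloud-builders/docker" while the provenance records the digest form.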

@MarkLodato
Member

Let's use an actual, working example: https://github.com/MarkLodato/example-build. Here, the recipe is "GitHub Actions Hosted Workflow", as produced by our L1 demo (though the demo is not working right now).

Dependencies are fetched at three layers:

  1. The GitHub Actions Workflow depends on various actions.
  2. Bazelisk fetches the version of Bazel.
  3. Bazel fetches the version of rules_go.

Now, let's look at those two requirements:

> Provenance contains immutable references to all build steps, from the build system

In the example above, would you interpret this as including only (1)? That seems doable, though we'll have to figure out how to clearly explain that delineation.
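For layer (1) alone, a check might look something like the sketch below: scan a workflow file for `uses:` entries and report those not pinned to a full commit SHA. This is a simplification under stated assumptions: it uses a line-based regex rather than a YAML parser, treats only a 40-character hex commit SHA as immutable, and the function name `unpinned_actions` is made up for illustration.

```python
import re
from typing import List

# Matches "uses: owner/repo@ref" with or without a leading list dash.
_USES = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> List[str]:
    """Return the `uses:` references that are not pinned to a commit SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = _USES.match(line)
        if match is None:
            continue
        ref = match.group(1).strip("'\"")
        if ref.startswith("./"):
            continue  # local action: versioned together with the repo itself
        _, sep, version = ref.rpartition("@")
        if not sep or _FULL_SHA.fullmatch(version) is None:
            unpinned.append(ref)
    return unpinned
```

This says nothing about layers (2) and (3): whatever Bazelisk or Bazel fetch inside a `run:` step is outside what this check can see.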

> Dependencies may be fetched over the network, but the provenance contains immutable references to all the dependencies, captured by the build system.

So here this would include (2) and (3)? If so, how could that be implemented? GitHub Actions doesn't have any ability to peek into the subprocess. The only way to do that would be some sort of sandboxing or hermetic build, right?

@dlorenc

dlorenc commented Jun 24, 2021

> In the example above, would you interpret this as including only (1)? That seems doable, though we'll have to figure out how to clearly explain that delineation.

Yes, exactly this. The idea is that if the workflow step is a container pinned by digest, and we have all the parameters passed into the container, we could do a decent job of guessing what version of bazel bazelisk will fetch, and what version of rules_go bazel will fetch.

It's not perfect - but it's better than nothing, especially if you're using build steps that you have some kind of trust/control over.

We'd need to be completely hermetic to get perfect reproducibility here, but that's why the gradient is nice.

Step 1: pin the build steps.
Step 2: pin the dependencies inside the build step, but still allow fetching over the network (the bazel rules_go example should work this way using the WORKSPACE).
Step 3: let the build system do the fetching, and run the build hermetically. This would look something like having the build system itself run "bazel fetch" before the build starts.
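Step 3 could be pictured roughly as below: the build system fetches every declared input first, then runs the build with only those inputs and a fetcher that always fails. This is a toy model, not how any real build service enforces it; `run_hermetic`, `NetworkAccessDenied`, and the callback signatures are all hypothetical, and the failing fetcher merely stands in for real sandboxing.

```python
from types import MappingProxyType
from typing import Callable, Iterable, Mapping

Fetcher = Callable[[str], bytes]
Build = Callable[[Mapping[str, bytes], Fetcher], bytes]


class NetworkAccessDenied(RuntimeError):
    """Raised when the build tries to fetch after the prefetch phase."""


def run_hermetic(declared: Iterable[str], fetch: Fetcher, build: Build) -> bytes:
    # Phase 1: the build system, not the build script, does all the fetching.
    prefetched = MappingProxyType({ref: fetch(ref) for ref in declared})

    # Phase 2: the build sees a read-only view of the inputs and no network.
    def denied(ref: str) -> bytes:
        raise NetworkAccessDenied(ref)

    return build(prefetched, denied)
```

Because all fetching goes through phase 1, the build system knows the complete set of inputs and can record each one in the provenance.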

@MarkLodato
Member

I can see adding a requirement for step 1, but I don't yet understand how step 2 could be implemented.

Do you agree with the following?

We have two requirements:

  • Version Pinning (immutable reference in source, formerly called Hermetic)
  • Dependencies (immutable reference in provenance, formerly called Dependencies Complete)

And each has a level of "completeness":

  • none
  • top-level source only (n/a for pinning)
  • direct deps of the build script, e.g. github actions
  • (some intermediate thing you describe as step 2, but that I'm still not sure is practical to define)
  • all transitive dependencies

The question is how much of each to require at each level?
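One way to phrase that question in code, purely as a sketch: treat completeness as an ordered scale and compare what a build achieved against whatever a level ends up requiring. The names `Completeness` and `meets` are hypothetical, the unnamed intermediate step is left out because it is not yet defined, and no particular per-level requirement is assumed here.

```python
from enum import IntEnum


class Completeness(IntEnum):
    """How much of the dependency tree a requirement covers."""
    NONE = 0
    TOP_LEVEL_SOURCE = 1  # n/a for pinning, per the list above
    DIRECT_DEPS = 2       # e.g. the GitHub Actions used by the workflow
    ALL_TRANSITIVE = 3


def meets(
    pinning: Completeness,
    provenance: Completeness,
    required_pinning: Completeness,
    required_provenance: Completeness,
) -> bool:
    """True if both achieved completeness values reach the required ones."""
    return pinning >= required_pinning and provenance >= required_provenance
```

The two axes are compared independently, so a level could ask for full provenance coverage while requiring little or no pinning in source.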

I'm also thinking that maybe we can just remove the "runs with no network access" as a requirement, and that could perhaps be one way to implement the above.

@dlorenc

dlorenc commented Jun 24, 2021

I think trying to keep "build steps/tools" and "dependencies" in the same conceptual category is causing difficulty here. They're both "inputs" to a build, but their relationships to (and interactions with) the build system are completely different.

I agree it's harder to do for "source" dependencies than it is for "build steps/tools", but I think there's value in separating them anyway.

@joshuagl
Member

I agree that there's value in capturing locked dependencies and hermetic builds as distinct concepts. For many developers, pinned/locked dependencies are increasingly common, but capture only the ecosystem dependencies – i.e. I have my GitHub project where dependabot keeps the direct and transitive dependencies for my language ecosystem up-to-date.

The interesting aspect of the definition of hermetic, which is a gap for many developers I have spoken with, is requiring that the build be insensitive to the host system running the build and the notion of capturing the dependencies from the host system (compiler and tools used from the host, libraries linked to from the host, etc).

For GitHub Actions, I think the closest we can get to hermetic is using a container action and providing some guidance on how to create container images with a high SLSA level.

@NicoleSchwartz
Contributor

Hey - do we define user-build script anywhere?

@MarkLodato MarkLodato added this to the SLSA spec backlog milestone Jan 17, 2023
@MarkLodato MarkLodato added spec-change Modification to the spec (requirements, schema, etc.) slsa 4 Applies to a SLSA 4 requirement labels Jan 17, 2023
@MarkLodato
Member

> Hey - do we define user-build script anywhere?

No. In SLSA v1.0 (draft) we no longer use that term.

I think this issue still stands for the future SLSA Build L4 (post v1.0), regardless of whether we say "build script" or something else.

@kpk47 kpk47 moved this to 🆕 New in Issue triage May 25, 2023
@kpk47 kpk47 moved this from 🆕 New Issues to 📋 Backlog in Issue triage May 25, 2023
@kpk47 kpk47 moved this from 📋 Backlog to Untriaged in Issue triage Jun 26, 2023
Projects
Status: Untriaged
6 participants