
Dockerfiles: run /bin/detect and /bin/build for extensions #786

Closed
wants to merge 12 commits

Conversation


@natalieparellano natalieparellano commented Dec 15, 2021

This PR updates the lifecycle to:

  • Run /bin/detect for extensions as part of regular detect
  • Run /bin/build for extensions as a new invocation of build

Note that this PR does NOT tackle actually applying Dockerfiles to the build and/or run image - for the work being done there, see #709

Testing

Update: for ease of use, please use the test-extender.sh script. Example invocation: LIFECYCLE_REPO_PATH=~/workspace/lifecycle SAMPLES_REPO_PATH=~/workspace/samples/ ./test-extender.sh

The following assumes that github.com/buildpacks/samples is checked out to the dockerfiles-poc branch

cd $LIFECYCLE_REPO_PATH

docker image rm test-builder --force

make clean build-linux-amd64

cd out/linux-amd64

cat << EOF > Dockerfile
FROM cnbs/sample-builder:bionic
COPY ./lifecycle /cnb/lifecycle
EOF

docker build -t test-builder .

cd $SAMPLES_REPO_PATH

docker run \
  -v $PWD/workspace/:/workspace \
  -v $PWD/layers/:/layers \
  -v $PWD/platform/:/platform \
  -v $PWD/cnb/ext/:/cnb/ext \
  test-builder \
  /cnb/lifecycle/detector -order /layers/order.toml -log-level debug

docker run \
  -v $PWD/workspace/:/workspace \
  -v $PWD/layers/:/layers \
  -v $PWD/platform/:/platform \
  -v $PWD/cnb/ext/:/cnb/ext \
  test-builder \
  /cnb/lifecycle/builder -use-extensions -log-level debug

You should see:

  • Detect passes, ./layers/group.toml is created with samples/curl and samples/rebasable; empty ./layers/plan.toml is created
  • Build passes, ./layers/config/metadata.toml is created with the "buildpacks" (really extensions) that participated in the build and a list of Dockerfiles that should be applied by the build and/or run extender
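For reference, a sketch of what ./layers/group.toml might contain after detect passes (illustrative only; the exact schema, including how extensions are flagged, is an open RFC question):

```toml
# Illustrative only - the real schema for extensions in group.toml is not yet spec'd.
# "extension = true" is one possible way to distinguish extensions from buildpacks.
[[group]]
  id = "samples/curl"
  version = "0.0.1"
  api = "0.7"
  extension = true

[[group]]
  id = "samples/rebasable"
  version = "0.0.1"
  api = "0.7"
  extension = true
```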

Signed-off-by: Natalie Arellano <[email protected]>

natalieparellano commented Dec 15, 2021

Notes for RFC/spec

  • Questions around how extensions should be identified in order.toml and group.toml (see RFC comment)
  • What should be the format of the file that we’re using to pass the Dockerfile paths & args to the extender? And what should be its path? (Currently we use /layers/config/metadata.toml, but we probably want a separate file for extensions and buildpacks.)
  • optional=false is the default value for buildpacks, but optional=true is the default value for extensions - how do we handle this cleanly?
  • We need the build phase for extensions to output plan.toml so that we can filter the plan for buildpacks
  • What should be the default parent dir for extensions output - maybe /layers/config/ext?
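A rough sketch of how an order.toml might mix the two kinds of entries (purely speculative; the schema and the optional defaults are exactly the open questions listed above, and "samples/hello-world" is a hypothetical buildpack id used only for illustration):

```toml
# Speculative sketch only: one possible shape for an order.toml that
# references both an extension and a buildpack.
[[order]]
  [[order.group]]
    id = "samples/curl"          # extension: optional = true by default (proposal)
    version = "0.0.1"
    extension = true

  [[order.group]]
    id = "samples/hello-world"   # hypothetical buildpack: optional = false by default
    version = "0.0.1"
```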

Code things to be done

  • detector should error if an extension contains an order
  • builder should execute extensions in parallel
  • builder should process SBOM files output by extensions (they are currently ignored)
  • builder should error if a regular buildpack tries to output dockerfiles
  • ...a bunch more

Code style questions

  • It’s a bit weird, dumping all extension-related stuff in extension.go when buildpacks get their own build.go and detect.go; maybe build.go could become more “generic” and buildpacks and extensions could hook into it separately?
  • Does store belong in the buildpack package?
  • Do we like the variable renames? Should we rename more things?

It’s a pretty big diff because I renamed some variables to be more generic to a thing that could be a buildpack OR an extension. The most interesting files are:

  • buildpack/build.go
  • buildpack/detect.go
  • buildpack/dockerfile.go
  • buildpack/extension.go

@cmoulliard

In short, and if I can still read Go code, what you did within this PR is to:

  • Extend the lifecycle bin/build step to either call the /bin/build step of an extension able to generate a Dockerfile OR to parse Dockerfiles under /path/path/Dockerfile
  • For each Dockerfile discovered, create an object containing:
    • the path of the Dockerfile,
    • the ARGs to be passed to the Dockerfile,
    • a build or run bool
    and append it to a list of Dockerfiles

Questions:

  • Is my assumption correct? If yes, when will the list of Dockerfiles be parsed by buildah or kaniko? I don't see within the PR a comment in the code where such a step could take place.
  • How will it be possible to tell the lifecycle to parse DockerfileA and not DockerfileB, so that, for example, it can install maven because that was declared as a requirement during the detect step?
  • Will it be possible to map the layer(s) created for a parsed Dockerfile to support a caching mode that avoids parsing a Dockerfile during every build?
  • Will it be possible to pass ENV vars to configure the Dockerfile ARGs?
  • Will the lifecycle build fail if one of the Dockerfiles processed/parsed throws an exception?
  • Why do you populate the list of Dockerfiles to be parsed during the build step and not the detect step?


natalieparellano commented Dec 16, 2021

Thanks @cmoulliard for your comments. Some thoughts:

when will the list of the Dockerfiles be parsed by buildah or kaniko

This "glue code" is yet to be written. I think we'll want to align on what format the "extender" binary is expecting to find this information.

How will it be possible to tell the lifecycle to parse DockerfileA and not DockerfileB

I think that is the job of the extension to decide whether to detect and/or output a Dockerfile or not, based on the source code and the build plan it is given.

Will it be possible to map the layer(s) created for a parsed Dockerfile to support a caching mode that avoids parsing a Dockerfile during every build?

In principle I think so, but we haven't given it too much thought yet. This is probably one of the next things to tackle.

Will it be possible to pass ENV vars to configure the Dockerfile ARGs?

Do you mean pass env vars to the extension /bin/detect and/or /bin/build? Or to the "extender" binary that applies the Dockerfiles? I think both are possible.

Will the lifecycle build fail if one of the Dockerfiles processed/parsed throws an exception ?

I think that's up to the RFC and/or spec to decide - we should note it as an open question. (cc @jkutner)

Why do you populate the list of the Dockerfiles to be parsed during the build step and not detect step ?

Since extensions and buildpacks collaborate on the build plan to pass or fail detect, we need to run /bin/build for extensions after the entire detect phase has finished. But there's really no need to spin up an entire separate container to run /bin/build for extensions, because no additional privileges are required. We could probably create a "wrapper" binary - similar to the creator - that runs detect and then runs build (for extensions only).
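The ordering constraint above can be sketched as follows (illustrative shell only; the phase functions are stand-ins for the real detector and builder binaries, not an actual wrapper implementation):

```shell
#!/bin/sh
# Illustrative only: build for extensions runs only after the whole detect
# phase has passed; both can share one unprivileged container, so no
# separate container spin-up is needed between them.
detect()           { echo "detect: buildpacks + extensions"; }
build_extensions() { echo "build: extensions only"; }

detect && build_extensions
```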

@cmoulliard

I think that is the job of the extension to decide whether to detect and/or output a Dockerfile or not, based on the source code and the build plan it is given.

This point must be clarified, as currently I do not see/understand how the steps will take place. We need a sequence diagram!

@cmoulliard

Do you mean pass env vars to the extension /bin/detect and/or /bin/build? Or to the "extender" binary that applies the Dockerfiles? I think both are possible.

Passing ENV vars to the component parsing the Dockerfiles and generating the layer

@cmoulliard

Since extensions and buildpacks collaborate on the build plan to pass or fail detect, we need to run /bin/build for extensions after the entire detect phase has finished. But there's really no need to spin up an entire separate container to run /bin/build for extensions, because no additional privileges are required. We could probably create a "wrapper" binary - similar to the creator - that runs detect and then runs build (for extensions only).

Again, this is something which should be clarified to describe which component (Lifecycle, ...) will be responsible to perform which work, the boundaries of the component, how they communicate ...


cmoulliard commented Jan 14, 2022

1. Different paths to access the Dockerfile

Why are the paths different within the following ./layers/config/metadata.toml example?

[[dockerfiles]]
  extension_id = "samples/curl"
  path = "/layers/samples_curl/Dockerfile"
  build = true
  run = true

  [[dockerfiles.args]]
    name = "some_arg"
    value = "some-arg-build-value"

  [[dockerfiles.args]]
    name = "some_arg"
    value = "some-arg-launch-value"

[[dockerfiles]]
  extension_id = "samples/rebasable"
  path = "/cnb/ext/samples_rebasable/0.0.1/Dockerfile"
  build = true
  run = true

2. Naming convention

Why do you generate the content of the extensions under [[buildpacks]] and not [[extensions]] within ./layers/config/metadata.toml ?
Example:

[[buildpacks]]
api = "0.7"
homepage = "https://github.com/buildpacks/samples/tree/main/extensions/curl"
id = "samples/curl"
version = "0.0.1"
extension = true

3. When dockerfiles.args will be used

Will the dockerfiles.args only be used by the extension executable when the image is built, and/or also when a container is launched using the run image?

4. Clarification

How can the extension know when it must use some-arg-build-value or some-arg-launch-value below, as both have the name some_arg?

[[dockerfiles.args]]
name = "some_arg"
value = "some-arg-build-value"

[[dockerfiles.args]]
name = "some_arg"
value = "some-arg-launch-value"

See test case to engage the discussion: https://github.com/redhat-buildpacks/poc/blob/9e0a5f9f5fb5576b2c5b71314e0f8c7d8cbb2aed/buildah/code/parse/toml_test.go#L11

5. Other

Is it for debugging purposes that you read the files here?

config.Logger.Debug("Reading output files")
return e.readOutputFiles(bpOutputDir, bpPlan, config.Logger)

@natalieparellano

@cmoulliard thanks for taking another look! Let me see if I can address your comments -

  1. The Dockerfile with path = "/layers/samples_curl/Dockerfile" was output by an extension that had a bin/build executable, and hence it was created at build time. The Dockerfile with path = "/cnb/ext/samples_rebasable/0.0.1/Dockerfile" is from an extension that did NOT have a bin/build executable, and hence the extension's root directory was treated as a pre-populated output directory.

Why do you generate the content of the extensions under [[buildpacks]] and not [[extensions]] within ./layers/config/metadata.toml ?

Total laziness... the schema for /layers/config/metadata.toml has not been spec'd (or even discussed really) as part of the RFC process. I think it's fair to say that's a conversation we would like to have. But we can provide our input as to what we think the file should look like.

Will the dockerfiles.args only be used by the extension executable when the image is built and/or when a container is launched using the run image?

I think only the extender binary needs to care about Dockerfile build args.

How can the extension know when it must use some-arg-build-value or some-arg-launch-value, as both have the name some_arg?

I think that was my mistake in putting together this file. You are right, the extender binary needs some way to know which build arg is applicable when extending the build or run image. Maybe we could have something like [[dockerfiles.args.build]] and [[dockerfiles.args.run]]?
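For illustration, the suggested split might look like this (a hypothetical schema, reusing the values from the metadata.toml example earlier in the thread):

```toml
# Hypothetical: scope each arg to the image it applies to, so the extender
# knows which value to use when extending the build vs. the run image.
[[dockerfiles]]
  extension_id = "samples/curl"
  path = "/layers/samples_curl/Dockerfile"
  build = true
  run = true

  [[dockerfiles.args.build]]
    name = "some_arg"
    value = "some-arg-build-value"

  [[dockerfiles.args.run]]
    name = "some_arg"
    value = "some-arg-launch-value"
```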

Is it for debugging purpose that you read the files

e.readOutputFiles(bpOutputDir, bpPlan, config.Logger) actually doesn't (I don't think) read the Dockerfiles output by extensions. This function deals with build.toml, launch.toml, and any SBOM files that might be output by extensions. It is all about building up /layers/config/metadata.toml.

@cmoulliard

The POC now supports reading a TOML file - redhat-buildpacks/poc@1ee100a#diff-443c22fd43332d23d4bc2d3f895875af24669e7053bf1fa2e41b4465ec834adaR186. I will now pass the parameters to the Dockerfile.

Should we pass them as ARG or ENV to the process building the Dockerfile? @natalieparellano

@cmoulliard

[[dockerfiles.args.build]] and [[dockerfiles.args.run]]?

+1


cmoulliard commented Jan 14, 2022

1. FROM ARG

How will the FROM ARG be passed to the process parsing the Dockerfile, as it is needed here?

ARG base_image
FROM ${base_image}
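One common mechanism for this (a sketch only; how the extender will actually do it is undecided) is for the process applying the Dockerfile to pass the resolved base image as a build arg. Here the image name and the docker invocation are examples, not the implemented behavior:

```shell
# Hypothetical sketch: the component applying the Dockerfile resolves the base
# image and forwards it to "ARG base_image" via --build-arg. The image name
# below is an example only; we just print the command to illustrate it.
base_image="cnbs/sample-stack-build:bionic"
echo "docker build --build-arg base_image=${base_image} -t extended-build ."
```

buildah and kaniko expose equivalent build-arg options, so the same value could be forwarded regardless of which tool applies the Dockerfile.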

@natalieparellano

@cmoulliard I think it would be easiest if the process that reads the TOML file and the process that applies the Dockerfile are one and the same. Is that possible with your POC?


cmoulliard commented Jan 14, 2022

Is that possible with your POC?

Yes. The POC now processes a metadata.toml file under $WORKSPACE/layers/ - see here: https://github.com/redhat-buildpacks/poc/blob/main/buildah/code/main.go#L184-L203

@cmoulliard

Is it an issue if we have to pass such a docker parameter when using pack and the buildah extension: --security-opt seccomp=unconfined? @natalieparellano @sclevine

@cmoulliard

Do you agree that we use your PR to merge the code of the POC that is able to parse/build Dockerfile(s)? @natalieparellano

…pply to both

build and run will be repeated in the list.

Signed-off-by: Natalie Arellano <[email protected]>
@cmoulliard

Are the buildpack samples (branch dockerfiles-poc) compatible with this commit ad323ee? @natalieparellano


cmoulliard commented Jan 20, 2022

I did a test to build the lifecycle using the last commit pushed (= 151d752) but it fails. A module reference is hard-coded at this line and should be fixed!

@aemengo

make clean build-linux-amd64
> Cleaning workspace...
rm -rf /Users/cmoullia/code/cncf/lifecycle/out
> Building lifecycle/lifecycle for linux/amd64...
mkdir -p /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle
GOARCH=amd64 CGO_ENABLED=0 go build  -ldflags "-s -w -X 'github.com/buildpacks/lifecycle/cmd.SCMRepository=github.com/buildpacks/lifecycle' -X 'github.com/buildpacks/lifecycle/cmd.SCMCommit=151d7528' -X 'github.com/buildpacks/lifecycle/cmd.Version=0.13.1-12+151d7528'" -o /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/lifecycle -a ./cmd/lifecycle
> Creating phase symlinks for linux/amd64...
ln -sf lifecycle /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/detector
ln -sf lifecycle /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/analyzer
ln -sf lifecycle /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/restorer
ln -sf lifecycle /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/builder
ln -sf lifecycle /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/exporter
ln -sf lifecycle /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/rebaser
ln -sf lifecycle /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/creator
> Building lifecycle/launcher for linux/amd64...
mkdir -p /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle
GOARCH=amd64 CGO_ENABLED=0 go build  -ldflags "-s -w -X 'github.com/buildpacks/lifecycle/cmd.SCMRepository=github.com/buildpacks/lifecycle' -X 'github.com/buildpacks/lifecycle/cmd.SCMCommit=151d7528' -X 'github.com/buildpacks/lifecycle/cmd.Version=0.13.1-12+151d7528'" -o /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/launcher -a ./cmd/launcher
test $(du -m /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/launcher|cut -f 1) -le 3
> Building lifecycle/extender for linux/amd64...
mkdir -p /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle
cd ./extender && GOARCH=amd64 CGO_ENABLED=0 go build  -ldflags "-s -w -X 'github.com/buildpacks/lifecycle/cmd.SCMRepository=github.com/buildpacks/lifecycle' -X 'github.com/buildpacks/lifecycle/cmd.SCMCommit=151d7528' -X 'github.com/buildpacks/lifecycle/cmd.Version=0.13.1-12+151d7528'" -o /Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/extender -a .
go: github.com/redhat-buildpacks/poc/[email protected] (replaced by ../../poc/kaniko/code): reading ../../poc/kaniko/code/go.mod: open /Users/cmoullia/code/cncf/poc/kaniko/code/go.mod: no such file or directory
go: github.com/redhat-buildpacks/poc/[email protected] (replaced by ../../poc/kaniko/code): reading ../../poc/kaniko/code/go.mod: open /Users/cmoullia/code/cncf/poc/kaniko/code/go.mod: no such file or directory
make: *** [/Users/cmoullia/code/cncf/lifecycle/out/linux-amd64/lifecycle/extender] Error 1


aemengo commented Jan 20, 2022

@cmoulliard Thanks for giving it a try. That is a relative path on the file-system, and it is how I can reference your POC code on this branch. I'd be happier referencing the dependency more conventionally, when things are a little "cleaner".

If your POC repo is copied to /Users/cmoullia/code/cncf/poc, things should work. I'm also happy to accommodate however you see fit.

@cmoulliard

I'd be happier referencing the dependency more conventionally, when things are a little "cleaner".

Why don't we include the Kaniko POC code and continue to improve it here (in this PR)?

Signed-off-by: Anthony Emengo <[email protected]>
@natalieparellano

@cmoulliard apologies for breaking you! I'll push up an intermediate fix that should make it compilable and then we'll work on pushing the changes upstream.

Signed-off-by: Natalie Arellano <[email protected]>
@natalieparellano

Thanks to the work of @aemengo and @cmoulliard we can now see the lifecycle running detect for buildpacks and extensions, running build for extensions, extending the build image, and running build for buildpacks in the context of the extended image!

To 🎉 see this in practice 🎉 , ensure that the buildpacks/samples repo is checked out at the dockerfiles-poc branch. Then:
LIFECYCLE_REPO_PATH=~/workspace/lifecycle SAMPLES_REPO_PATH=~/workspace/samples/ ./test-extender.sh

The output here indicates that curl was added correctly by the extension :)

Running build for buildpack Buildpack using curl 0.0.1
Preparing paths
Running build command
Running 'curl google.com' ......




  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   219  100   219    0     0   1531      0 --:--:-- --:--:-- --:--:--  1531
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>

@cmoulliard

we can now see the lifecycle running detect for buildpacks and extensions, running build for extensions, extending the build image, and running build for buildpacks in the context of the extended image!

  • Are you sure that this module reference is correct, as it still points to kaniko v1.7.0 and not to the git commit containing the merged PR that bumps the version of docker?
  • The bash script or instructions should explain how to fix the replace statement https://github.com/redhat-buildpacks/poc/kaniko = ./code, as it currently assumes that you have git cloned the POC repo locally.
  • Question: when do you want to integrate the Kaniko POC code on top of this PR? @natalieparellano

@natalieparellano natalieparellano mentioned this pull request Feb 15, 2022
@natalieparellano

I'm going to close this as all the work is captured in #802

@natalieparellano natalieparellano deleted the dockerfiles-poc-2 branch March 30, 2022 21:04