-
Notifications
You must be signed in to change notification settings - Fork 941
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci(framework:skip) Add e2e test to test Docker images #4218
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Robert Steiner <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @Robert-Steiner. I like the straightforward Docker test in _docker-test.yml
, but I've a couple of questions related to:
- Matrix testing. Do we plan to extend it to a matrix strategy for different Python versions, architectures, and client authentication?
- What
flwr
version is used. - Differences between
base
andbinary
images.
"""Generates a matrix comprising Ubuntu and Alpine images. For Alpine, the matrix | ||
includes only the latest supported Python version, whereas for Ubuntu, it includes all | ||
supported Python versions. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Robert-Steiner Can you please elaborate why we don't build Alpine images for the supported Python versions? Is it because for the image testing, we want to test only the latest supported Python version on x86_64
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't build Alpine images for the supported Python versions because we want to keep the installation of dependencies simple. Alpine uses musl
instead of glibc
, and most ML frameworks don't provide binaries that are compiled for musl
, which means they must be compiled when building the image. This process can take more than 2 hours depending on the framework and hardware, making it a time-consuming and resource-intensive task.
As a result, we only use the Alpine base image for building images where we think the user doesn't need to install any dependencies, such as the SuperLink and SuperNode image. For images where the user might need to install dependencies, we provide Ubuntu images for all supported Python versions.
The build_complete_matrix
function generates the matrix when a new flwr
version is released. For testing we will use the simplified matrix (build_ubuntu_python_latest_matrix
) that only builds Ubuntu images with the latest supported Python version because building Ubuntu images is faster and cheaper that Alpine images.
The Docker workflows (release
, nightly
, main
aka. unstable
) will always build the images for amd64
and arm64
. However, for the Docker e2e test workflow, we agreed to only build and test the amd64
version.
Later, we could also test Alpine images and arm64
architecture, but for now, we just want a simple test to ensure the cron job doesn't produce a broken image. #4092 (review)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Robert-Steiner Got it. This is a very clear explanation - let's document it somewhere (e.g. in build-docker-image-matrix.py
?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the documentation in #4422
# parameters: | ||
# if: github.repository == 'adap/flower' | ||
# name: Collect docker build parameters | ||
# runs-on: ubuntu-22.04 | ||
# timeout-minutes: 10 | ||
# outputs: | ||
# pip-version: ${{ steps.versions.outputs.pip-version }} | ||
# setuptools-version: ${{ steps.versions.outputs.setuptools-version }} | ||
# flwr-version-ref: ${{ steps.versions.outputs.flwr-version-ref }} | ||
# matrix: ${{ steps.versions.outputs.matrix }} | ||
# steps: | ||
# - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1 | ||
|
||
# - uses: ./.github/actions/bootstrap | ||
# id: bootstrap | ||
|
||
# - id: versions | ||
# run: | | ||
# echo "pip-version=${{ steps.bootstrap.outputs.pip-version }}" >> "$GITHUB_OUTPUT" | ||
# echo "setuptools-version=${{ steps.bootstrap.outputs.setuptools-version }}" >> "$GITHUB_OUTPUT" | ||
# echo "flwr-version-ref=git+${{ github.server_url }}/${{ github.repository }}.git@${{ github.sha }}" >> "$GITHUB_OUTPUT" | ||
# python dev/build-docker-image-matrix.py --flwr-version unstable --simple > matrix.json | ||
# echo "matrix=$(cat matrix.json)" >> $GITHUB_OUTPUT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these lines be uncommented once the parameters are fully defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be removed. This workflow is just used to run the Docker e2e test workflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've the same response as above.
on: | ||
push: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is quite resource intensive. Maybe we can limit the workflow to push
and pull_request
to main?
on: | |
push: | |
on: | |
push: | |
branches: | |
- main | |
pull_request: | |
branches: | |
- main |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be removed. This workflow is just used to run the Docker e2e test workflow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something, can the workflow be triggered if we don't specify on:
?
with: | ||
build-args: ${{ inputs.build-args }} | ||
|
||
- id: matrix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's describe this step:
- id: matrix | |
- name: Create matrix for base image | |
- id: matrix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we actually don't use a matrix strategy here. Is this planned for another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, we're not directly using the matrix strategy, but instead we generate the matrix using the dev/build-docker-image-matrix.py
script, which then feeds into subsequent jobs and steps. This script-based approach simplifies maintenance and updates by centralizing the matrix generation, making it easier to add, remove, or modify images as needed.
timeout 30 flwr run ${PROJECT_DIR} local-deployment-tls | ||
timeout 120 docker compose logs -f superexec | grep -q 'Run finished' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that this test is very straightforward. Is it still worth for us to run it with pytest
like in #3477? Since your approach can be easily parameterized in a matrix, maybe it makes sense to follow and extend your approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. With pytest
, we can run all tests locally, but if we use a matrix, we'll likely need to run a bash loop to execute all tests. I prefer to use pytest
because it makes it easier to run tests locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, does it makes sense to do this in 2 steps?
- Resolve and merge this PR.
- Migrate the tests to
pytest
.
docker compose -f certs.yml up --build | ||
sudo chown -R 49999:49999 superexec-certificates/* superlink-certificates/* | ||
|
||
cat <<- EOT >> "${PROJECT_DIR}/pyproject.toml" | ||
[tool.flwr.federations.local-deployment-tls] | ||
address = "127.0.0.1:9093" | ||
root-certificates = "../superexec-certificates/ca.crt" | ||
EOT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we include a matrix of tests for secure/insecure connections and client authentication later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can include those tests when we in the Reinstate E2E tests for Docker
RFC.
Co-authored-by: Chong Shen Ng <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the replies @Robert-Steiner. I've several follow-up questions.
@@ -0,0 +1,104 @@ | |||
name: Reusable docker image test workflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
name: Reusable docker image test workflow | |
name: Reusable Docker image test workflow |
runs-on: "ubuntu-22.04" | ||
timeout-minutes: 10 | ||
steps: | ||
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of using a hash instead of the version (as we've previously used in other CI steps)? Is it so that we can use a minor+patched version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also see this repeated in other steps like for docker@setup-buildx-action
and Wandalen/wretry.action
.
timeout 30 flwr run ${PROJECT_DIR} local-deployment-tls | ||
timeout 120 docker compose logs -f superexec | grep -q 'Run finished' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, does it makes sense to do this in 2 steps?
- Resolve and merge this PR.
- Migrate the tests to
pytest
.
on: | ||
push: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something, can the workflow be triggered if we don't specify on:
?
# parameters: | ||
# if: github.repository == 'adap/flower' | ||
# name: Collect docker build parameters | ||
# runs-on: ubuntu-22.04 | ||
# timeout-minutes: 10 | ||
# outputs: | ||
# pip-version: ${{ steps.versions.outputs.pip-version }} | ||
# setuptools-version: ${{ steps.versions.outputs.setuptools-version }} | ||
# flwr-version-ref: ${{ steps.versions.outputs.flwr-version-ref }} | ||
# matrix: ${{ steps.versions.outputs.matrix }} | ||
# steps: | ||
# - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1 | ||
|
||
# - uses: ./.github/actions/bootstrap | ||
# id: bootstrap | ||
|
||
# - id: versions | ||
# run: | | ||
# echo "pip-version=${{ steps.bootstrap.outputs.pip-version }}" >> "$GITHUB_OUTPUT" | ||
# echo "setuptools-version=${{ steps.bootstrap.outputs.setuptools-version }}" >> "$GITHUB_OUTPUT" | ||
# echo "flwr-version-ref=git+${{ github.server_url }}/${{ github.repository }}.git@${{ github.sha }}" >> "$GITHUB_OUTPUT" | ||
# python dev/build-docker-image-matrix.py --flwr-version unstable --simple > matrix.json | ||
# echo "matrix=$(cat matrix.json)" >> $GITHUB_OUTPUT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I've the same response as above.
"""Generates a matrix comprising Ubuntu and Alpine images. For Alpine, the matrix | ||
includes only the latest supported Python version, whereas for Ubuntu, it includes all | ||
supported Python versions. | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Robert-Steiner Got it. This is a very clear explanation - let's document it somewhere (e.g. in build-docker-image-matrix.py
?).
Issue
Description
This PR adds a new reusable workflow to run a simple e2e test in docker compose. This workflow ensures that our images are tested and we are not releasing broken images.
Test Execution:
e2e-test
tag.complete
with-TLS Docker Compose setup.Related issues/PRs
Proposal
Explanation
Checklist
#contributions
)Any other comments?