Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] Speed up repository builds in the test suite #1202

Merged
merged 4 commits into from
Dec 3, 2020

Conversation

consideRatio
Copy link
Member

@consideRatio consideRatio commented Nov 3, 2020

@consideRatio consideRatio added the maintenance Under the hood improvements and fixes label Nov 3, 2020
@consideRatio consideRatio requested a review from minrk November 4, 2020 00:04
@consideRatio
Copy link
Member Author

The thing to consider about this PR is about the removal of the gitlab build. I think it is far more valuable to save any local developer or someone waiting for the tests to complete to save time than to do what repo2docker already does in its tests.

@consideRatio consideRatio force-pushed the pr/ci-test-build-speedup branch from 2495818 to 492a728 Compare November 4, 2020 00:58
@betatim
Copy link
Member

betatim commented Nov 5, 2020

As I wrote in the issue: I disagree about removing the GitLab test. For local development you can simply not run tests you don't care about/only run tests you care about. For CI runs saving a few minutes is not worth it because CI minutes are something we have a lot of compared to human time required to react to downtime on mybinder.org. The BinderHub repo is special in the sense that we have to have a very high probability that master will "just work" on mybinder.org so that we can constantly deploy it without people wondering "will deploying this break mybinder.org or not? I am not sure and I don't have 1hr to watch and react to a a breakage so I'd rather not deploy now". This is a problem because this leads to the "diff" between deployments getting larger, which in turn increases the chances of something breaking and the time to then debug it because the diff is so big. Which in turn reduces the likelihood that people choose to deploy changes. So it starts a vicious spiral. We've been in that position before where changes were not deployed for weeks or months. Let's not go back there.

We can speed up the GitLab test by modifying the contents of the repository to for example something like https://github.com/binder-examples/minimal-dockerfile which is about as fast to build as it gets.


Why is there an issue and this PR? It feels like living in a parallel universe because there are two places to discuss the same thing. I think there is no need to create an issue if a PR for that issue is created straight away. Let's either only make a PR or make an issue which is only followed up with a PR after discussion.

@betatim betatim changed the title CI: optimize build speed [WIP] Speed up repository builds in the test suite Nov 5, 2020
@betatim
Copy link
Member

betatim commented Nov 5, 2020

Updated the title of the PR to something more descriptive. WDYT?

"gh/binderhub-ci-repos/requirements/d687a7f9e6946ab01ef2baa7bd6d5b73c6e904fd",
"git/{}/d687a7f9e6946ab01ef2baa7bd6d5b73c6e904fd".format(
"gh/binderhub-ci-repos/requirements/20c4fe55a9b2c5011d228545e821b1c7b1723652",
"git/{}/20c4fe55a9b2c5011d228545e821b1c7b1723652".format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use HEAD here to avoid manually changing the ref for the tests? That also tests the default behavior.

Copy link
Member Author

@consideRatio consideRatio Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the /master test was added, it was added with the motivation of testing a not already full git hash. Do we want to keep testing a full git hash? I suggest not as we compromise on needing to update this repo to avoid a performance degradation if we add another commit to the test repo.

I favor not using any full git commit reference when building images like we do here, but instead stick to using HEAD only. I suggest we retain a full git commit hash in other tests instead to capture that variation if it is relevant to capture.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding a test case for HEAD is a good idea, especially because that is the default now.

I'd also keep the explicit reference case. My reasoning is that it tests a case that isn't the default and infrequently used/developed for. As a developer you might test master/main/HEAD explicitly but rarely think "ah yes, let's make sure explicit commit hashes still work". This makes it a good task for robots because they don't get bored of testing the same stuff over and over again.

Maybe we can remove the manual work of having to update the commit hash by making the test find out the latest hash and then updating the build slug. Like a templated build slug. However we commit so rarely to the CI repos used here that I'd not spend a lot of time on building this automation and instead would add a comment to let performance improvers from the future know that there is a low hanging fruit here.

@consideRatio
Copy link
Member Author

Why is there an issue and this PR? It feels like living in a parallel universe because there are two places to discuss the same thing. I think there is no need to create an issue if a PR for that issue is created straight away. Let's either only make a PR or make an issue which is only followed up with a PR after discussion.

I understand the friction you have experienced @betatim, I mean to avoid that. I first created the issue, and an hour later or so I decided to invest my time to try address it in a PR.

We can speed up the GitLab test by modifying the contents of the repository to for example something like https://github.com/binder-examples/minimal-dockerfile which is about as fast to build as it gets.

Wieee yes! Let's do https://github.com/binder-examples/minimal-dockerfile builds! To not need to rebuild the entire repo2docker boilerplate is the biggest improvement we can get I think.

@betatim
Copy link
Member

betatim commented Nov 5, 2020

an hour later I decided to work on it

nods. It is tricky because you want to create an issue when you think of it and don't know if you'll have time to work on it or not. So when you then shortly after do have time, what to do? I don't know. Maybe we should then post in the issue saying "Please discuss in the PR"? No good ideas from my side :(

Wieee yes! Let's do binder-examples/minimal-dockerfile builds!

Lets keep using the GitLab repository but change its contents, not switch to the binder-examples repo.

@consideRatio
Copy link
Member Author

consideRatio commented Nov 5, 2020

Maybe we should then post in the issue saying "Please discuss in the PR"? No good ideas from my side :(

I think that was a good idea. Updating the issue to reflect there is now an open PR, and declaring where the discussion should go, and in general I figure it should go in the PR where some of the discussion always will need to go as it is where the changes are declared practically.

Action points

  • Define a minimal dockerfile build. I suggest the truly-minimal branch of binder-examples/minimal-dockerfile initially.
  • Reference a GitLab repo with the minimal dockerfile build chosen.
  • Reference a GitHub repo with the minimal dockerfile build chosen.
  • Establish exactly how many times we run test_build, with what arguments.

@consideRatio
Copy link
Member Author

consideRatio commented Nov 5, 2020

How many times do we run test_build, and with what arguments?

I think this is mutually agreed, so I'll make that explicit.

  • To build a minimalistic image with repo2docker, using a custom Dockerfile.
  • To not hardcoding a full git commit hash in this specific test

GitLab or not?

I accept building a GitLab repo alongside the GitHub repo if needed, but would like to have a motivation that it is still needed even though we have a GitLab resolver test in test_repoproviders.py that has been updated several times since test_build added the GitLab test.

Dedicated hash, HEAD, master?

Since we have this test trying to resolve various URLs etc, I think the only thing we need to test in test_build, is if we manage to handle weird URLs in practice, but not the resolution of references.

Due to this, I suggest the test_build function is run twice.

Currently in master

@pytest.mark.parametrize("slug", [
    "gh/binderhub-ci-repos/requirements/d687a7f9e6946ab01ef2baa7bd6d5b73c6e904fd",
    "git/{}/d687a7f9e6946ab01ef2baa7bd6d5b73c6e904fd".format(
        quote("https://github.com/binderhub-ci-repos/requirements", safe='')
    ),
    "git/{}/master".format(
        quote("https://github.com/binderhub-ci-repos/requirements", safe='')
    ),
    "gl/minrk%2Fbinderhub-ci/0d4a217d40660efaa58761d8c6084e7cf5453cca",
])

Suggested

I suggest we combine the dedicated commit hash, the URL quoting of the commit hash, and the master branch reference into one.

@pytest.mark.parametrize("slug", [
    "gh%2Fbinderhub-ci-repos/minimal/HEAD",
    "gl%2Fbinderhub-ci-repos/minimal/HEAD",
])

@consideRatio consideRatio force-pushed the pr/ci-test-build-speedup branch from 492a728 to e5ea660 Compare November 15, 2020 08:45
@consideRatio
Copy link
Member Author

Changes

We keep testing builds with gh/gl/git, but we now use a minimalistic Dockerfile instead.

Outcome

Before: 15m14 + 3m25s + 18m04s
After: 7m30s + 3m11s + 10m23s

Almost twice as fast runs now.

@consideRatio consideRatio changed the title [WIP] Speed up repository builds in the test suite Speed up repository builds in the test suite Nov 15, 2020
@consideRatio consideRatio changed the title Speed up repository builds in the test suite [MRG] Speed up repository builds in the test suite Nov 15, 2020
@consideRatio consideRatio force-pushed the pr/ci-test-build-speedup branch from 4b49b6d to 2ad44de Compare November 17, 2020 15:56
@consideRatio consideRatio changed the title [MRG] Speed up repository builds in the test suite [WIP] Speed up repository builds in the test suite Nov 17, 2020
@consideRatio
Copy link
Member Author

consideRatio commented Nov 17, 2020

Performance testing

pytest --durations=0
I gave pytest the --durations=0 flag to time all tests run. They are then reported in an order where the slowest test are shown first at the end of the test run.

cached-minimal-dockerfile repo
I created a repo cached-minimal-dockerfile repo that is truly minimal as it just references a pre-built minimal-dockerfile in a FROM statement. This got us a docker build speedup from ~20 seconds to ~5 seconds or so on my local computer, which seem to lead to a ~20 second advantage in a GitHub actions runner.

First repo provider build takes ~65-85s (cached) or ~85-105s (non cached)
I found that the first repo provider's to make a build took took ~84 seconds, while the first build of other repo providers took ~67 seconds. I suggest this is because the image-builder pod had already been started once in the cluster.

Second repo provider build requests takes ~5s (assuming no image build pod is started)
For additional tests on each repo provider that doesn't trigger a new build, the test take only ~5 seconds.

Comparative examples

Conclusions

  • For each additional repo provider we add ~65 seconds
  • Using cached-minimal-dockerfile repo over minimal-dockerfile repo we gain ~20 seconds per repo-provider
  • For each variation of HEAD/SHA/Branch of the same commit we add ~5 seconds

Action points

  • Migrate consideratio/minimal-dockerfile and consideratio/cached-minimal-dockerfile to binderhub-ci-repos GitHub github org.
  • Implement one variant
  • Decide on what test_builds variations to run (ping @betatim / @minrk) and update this PR
  • Merge

@consideRatio consideRatio force-pushed the pr/ci-test-build-speedup branch from ae61547 to 413247b Compare November 17, 2020 23:56
@consideRatio
Copy link
Member Author

@betatim @minrk I have documented the new minimal-dockerfile and cached-minimal-dockerfile repos in https://github.com/binderhub-ci-repos/minimal-dockerfile. I want to highlight some outcomes of this setup.

  1. We have a daily automatic build/push of a minimal docker image in https://github.com/binderhub-ci-repos/minimal-dockerfile.
  2. We have two repos named cached-minimal-dockerfile on github/gitlab repo that are will be able to remain frozen in time forever as their Dockerfile only reference the pre-build minimal-dockerfile in a FROM statement and their readme delegate info to the https://github.com/binderhub-ci-repos/minimal-dockerfile repo's readme file.

Together, this makes us able to update the minimal-dockerfile with changes without thinking about commit changes, as we only reference cached-minimal-dockerfile, and by automatically updating the minimal-dockerfile image by daily rebuilds, we ensure that don't start to fail because the built image has become outdated. It also makes our build duration go from ~20s to ~5s.

I have now optimized as much as I can, and it is not a question on how much tests variations should be run. Each additional repo provider we add adds ~65+5 seconds if we do a HEAD+SHA test against a cached-minimal-dockerfile. In the PRs current state we test 3 repo providers with HEAD+SHA.

@consideRatio consideRatio changed the title [WIP] Speed up repository builds in the test suite Speed up repository builds in the test suite Nov 18, 2020
@consideRatio consideRatio changed the title Speed up repository builds in the test suite [MRG] Speed up repository builds in the test suite Nov 23, 2020
@consideRatio consideRatio requested a review from betatim November 23, 2020 11:13
@consideRatio consideRatio requested a review from minrk November 23, 2020 11:13
@consideRatio
Copy link
Member Author

@betatim okay to merge?

@betatim
Copy link
Member

betatim commented Dec 3, 2020

I've not had time to review this yet. Is there a reason to be in a hurry merging this optimisation?

.github/workflows/test.yml Outdated Show resolved Hide resolved
.github/workflows/test.yml Outdated Show resolved Hide resolved
.github/workflows/test.yml Outdated Show resolved Hide resolved
.github/workflows/test.yml Outdated Show resolved Hide resolved
@betatim betatim merged commit 2b921f1 into jupyterhub:master Dec 3, 2020
@betatim
Copy link
Member

betatim commented Dec 3, 2020

Thanks for working on this optimisation

consideRatio pushed a commit to jupyterhub/helm-chart that referenced this pull request Dec 3, 2020
jupyterhub/binderhub#1202 Merge pull request #1202 from consideRatio/pr/ci-test-build-speedup
@consideRatio
Copy link
Member Author

consideRatio commented Dec 3, 2020

Thank you for getting this merged @betatim! ❤️

I've not had time to review this yet. Is there a reason to be in a hurry merging this optimisation?

Its a personal benefit to a large degree. It improves my own focus to avoid context switching as becomes more needed if a PRs active life is longer rather than shorter. I also personally care about having this and #1184 merged before resuming work on #1101 relying on the features part of 0.10.X of the z2jh Helm chart. Of course I can start without them, but it just adds a bit more complexity to manage when having PRs build on other PRs which may be subject to change still etc.

Thank you for your work to maintain this repo Tim!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Under the hood improvements and fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce the image build tests to reduce CI times from ~13 minutes
3 participants