Bad concurrency for prod deploys #1245

Open
janbrasna opened this issue Apr 20, 2024 · 5 comments · May be fixed by #1250

Comments

@janbrasna
Contributor

With many PRs merged in quick succession, and given the time CI takes between the initial checkout and the push to gh-pages after building, several jobs running at the same time will obviously run into this issue:

[gh-pages d8c30f64] Deployed with mkdocs, version 1.1.2 from /home/circleci/.local/share/virtualenvs/code-6yRgnUSz/lib/python3.8/site-packages/mkdocs (Python 3.8)
 552 files changed, 859 insertions(+), 859 deletions(-)
To github.com:fastlane/docs.git
 ! [rejected]          gh-pages -> gh-pages (fetch first)
error: failed to push some refs to 'git@github.com:fastlane/docs.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

Exited with code exit status 1

So you effectively end up not having the merged changeset published at that point. You can only hope the next push to master won't take too long to happen, to incorporate all the previous (failed) deploys to prod with it… 🤷

I'm not a CircleCI expert so take this with a pinch of salt, but… it seems the earliest commit "wins" here trying to deploy to prod, whereas normally you'd have the most recent cancelling the previous ones and eventually "winning" in the priority to deploy, not being blocked by the previous ones running concurrently to cause conflicts at the end.

@rogerluan
Member

Interesting issue! Although it's less than ideal, in this case I think we could fix it by having the newest commit cancel any previous ongoing builds 👀 That'd effectively solve the problem and I don't see significant drawbacks.

Not sure how to achieve this with CircleCI though, and I won't have time to investigate this any time soon 😥 happy to review PRs or other changes in the meantime though!

@janbrasna
Contributor Author

janbrasna commented Jun 6, 2024

I'm used to the behaviour needed in GHA but it seems it's not exactly that straightforward in CircleCI:

So my take would simply be:

git commit -m "Deployed with $(mkdocs --version)"
git push origin gh-pages

with --force added to the push,

but only in the master context / publish CI, not when run otherwise (manually, locally, etc.), as there might be more users of the script. So I'm not confident enough to just propose -f there and call it a day. Leaving that to others to come up with something maybe more sophisticated ;]
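To illustrate what "only in the publish CI" could look like, here's a minimal sketch. It assumes the job runs on CircleCI, which sets `CIRCLECI=true` and `CIRCLE_BRANCH` for each job; the `push_flags` helper name is made up for this example and is not part of the existing script.

```shell
# Hypothetical helper: print "--force" only when running on CircleCI
# for the master branch; print nothing for local/manual runs.
push_flags() {
  if [ "${CIRCLECI:-}" = "true" ] && [ "${CIRCLE_BRANCH:-}" = "master" ]; then
    echo "--force"
  fi
}

# Possible usage in the deploy script (left unquoted on purpose, so an
# empty result adds no argument at all):
#   git push origin gh-pages $(push_flags)
```

Anyone running the script manually would keep the current non-forced push, since neither variable is set outside CI.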

(This would be still far from perfect, as that doesn't prefer the build that starts last, but one that finishes last, and that's a huge difference;)… throw in some timeout, connection/performance or cache woes like lately, and you can have an older commit overwriting the output of a newer one just by getting stuck for a bit longer in there…) 🤷‍♂️
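One way to narrow that "finishes last wins" window might be to check, right before pushing, that the commit being deployed is still the tip of master, and skip the deploy otherwise. A rough sketch follows; the `is_branch_tip` helper is hypothetical, and `CIRCLE_SHA1` is assumed to be available as CircleCI's built-in variable for the commit being built.

```shell
# Hypothetical helper: succeed only if COMMIT is still the tip of
# BRANCH on REMOTE (defaults: origin, master).
is_branch_tip() {
  commit=$1
  remote=${2:-origin}
  branch=${3:-master}
  # ls-remote prints "<sha><TAB><ref>"; keep only the sha
  tip=$(git ls-remote "$remote" "refs/heads/$branch" | cut -f1)
  [ -n "$tip" ] && [ "$commit" = "$tip" ]
}

# Possible usage just before the push:
#   if is_branch_tip "$CIRCLE_SHA1"; then
#     git push --force origin gh-pages
#   else
#     echo "A newer commit landed on master, skipping deploy"
#   fi
```

This doesn't remove the race entirely (a newer commit can still land between the check and the push), but a stuck older build would no longer overwrite a newer one after the fact.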

@rogerluan
Member

Thanks for digging up that info for CircleCI. It seems like they don't offer "auto cancel builds", which is kinda underwhelming 🤕 I wouldn't have expected that.

Some alternative solutions:

  • Do the deploy in a different CI (probably possible in their free tier, given that we barely deploy), even e.g. GHA.
  • Use -f but then also have a cron job that re-deploys once a day just in case 🤷
  • Restart the deployment in case it fails during that step? Basically catch the error, and treat it by retrying. Retry a given amount of times, e.g. 3, 5…
  • Only deploy when creating tags (I dislike this option as it actually decreases the deployment frequency and adds an extra step for us maintainers to deploy changes 🙈 )
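For the retry option, a minimal sketch of a generic wrapper could look like this. The `retry` function and the `deploy_docs` name in the usage comment are hypothetical, not something that exists in the repo today.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  max_attempts=$1
  shift
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt/$max_attempts failed: $*" >&2
    attempt=$((attempt + 1))
  done
  return 1
}

# Possible usage, where deploy_docs is a hypothetical function that
# fetches gh-pages, commits the build output and pushes:
#   retry 3 deploy_docs
```

Note that retrying only the `git push` wouldn't help: a rejected push stays rejected until the local gh-pages is updated, so the retried step would need to include the fetch and commit as well.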

Thoughts?

@janbrasna
Contributor Author

Yeah, we've had race conditions, e.g. where a workflow would need a Docker image built from the same SHA that might not have been published to the registry yet, so the cron fallback for failed pipelines sounds uncomfortably familiar ;]

The build is simple enough to be pushed straight to a deployment environment via GHA, getting rid of the gh-pages branch and its underlying git tree completely, and I'd welcome that. But I don't think you can make GHA run only if the previous checks, i.e. the CircleCI build & test, pass. The containerised fastlanetools/ci test image is just Docker anyway, so it shouldn't be too prohibitive to move that to GHA as well, keeping the whole CI just here… but it would mean disjoining the pipelines from fastlane/fastlane, which is kinda 💩…

@janbrasna janbrasna linked a pull request Jun 10, 2024 that will close this issue
@janbrasna
Contributor Author

But the problem is pretty trivial in this case. The bundler woes slowed down the CI and it took ~10mins and more from initial checkout to the actual switch & commit step, so before resorting to bigger changes or force pushing I'd just try #1250 adding an extra fetch — to check out fresh gh-pages tip instead of the head that's been lying around for minutes already… (at the same time the current bundler version resolves take only seconds, so that should help avoiding conflicts too…)
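As a rough illustration of the "extra fetch" idea (not the actual contents of #1250), the switch step could refresh gh-pages right before committing. The `checkout_fresh_gh_pages` name is made up for this sketch, and it assumes the working tree can be switched cleanly at that point.

```shell
# Hypothetical helper: fetch the current gh-pages tip from REMOTE
# (default: origin) and point the local gh-pages branch at it.
checkout_fresh_gh_pages() {
  remote=${1:-origin}
  # FETCH_HEAD is whatever the fetch just retrieved, so this works even
  # if no remote-tracking ref for gh-pages is configured.
  git fetch "$remote" gh-pages &&
    git checkout -B gh-pages FETCH_HEAD
}

# Possible usage right before the commit and push:
#   checkout_fresh_gh_pages
#   git commit -m "Deployed with $(mkdocs --version)"
#   git push origin gh-pages
```

This shrinks the window for a conflict from minutes to the seconds between the fetch and the push, but it doesn't close it completely.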

2 participants