RFC: Investigate Concourse Alternatives #146

Open
kallisti5 opened this issue Dec 27, 2024 · 11 comments

Comments

@kallisti5
Contributor

kallisti5 commented Dec 27, 2024

Looking forward, Concourse is kind of dying as a project. A lot of the issues we see are either Concourse limitations (multiple workers, for example) or bugs resulting in random DNS resolution issues or even random segfaults.

We should investigate alternatives.

Big picture requirements:

  • Container based builds
    • It gives us consistent artifacts, and offloads storage of toolchain binaries to docker.io, ghcr.io, etc.
  • Support for multiple branches / architectures / nightly / release
  • Ability to push container artifacts, as well as artifacts to s3 buckets
  • Ability for developers to see build results
  • Multiple, reasonably sized workers at remote locations
    • It's a lot easier to do cloud builds, but on-prem builders are substantially cheaper.

Nice to have requirements:

  • Dashboard users can gawk at
@kallisti5
Contributor Author

kallisti5 commented Dec 27, 2024

To start, Concourse which we use today:

Pro:

  • yaml based pipelines
  • good remote worker support (when we have one worker 🥴 )
  • containers, s3 uploads, etc.
  • SSO support
  • dashboard
  • free and open source
  • golang
  • The devil we know

Con:

  • Poor flexibility. The concerns we raised about issues with multiple workers were shot down (they expect a bunch of workers in a VPC, not isolated locations)
  • Poor reliability. Garden, which manages the containers, is a dead project. containerd support is a bit sketchy and was introduced at the last minute before the Concourse project entered a maintenance-mode phase.
  • We had to script out creating pipelines since we needed a fixed pipeline with modified variables.
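The "script out creating pipelines" point can be illustrated with a minimal sketch: one fixed pipeline template, registered once per branch/architecture pair with different variables. The branch and architecture lists, the `ci` target name, the `pipeline.yml` file name, and the `branch` / `arch` variable names are all assumptions made for illustration, not the project's actual setup. The sketch only builds the `fly set-pipeline` command lines; it does not run them.

```python
from itertools import product
import shlex

# Hypothetical values for illustration only; not the project's real lists.
BRANCHES = ["master", "r1beta5"]
ARCHES = ["x86_64", "x86_gcc2"]


def set_pipeline_commands(target, template, branches=BRANCHES, arches=ARCHES):
    """Return one `fly set-pipeline` argv list per branch/architecture pair."""
    commands = []
    for branch, arch in product(branches, arches):
        commands.append([
            "fly", "-t", target, "set-pipeline",
            "--non-interactive",          # do not prompt for confirmation
            "-p", f"{branch}-{arch}",     # one pipeline per combination
            "-c", template,               # the single fixed pipeline template
            "-v", f"branch={branch}",     # variables substituted into the template
            "-v", f"arch={arch}",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in set_pipeline_commands("ci", "pipeline.yml"):
        print(shlex.join(cmd))
```

Any replacement CI system would need an equivalent of this fan-out, either built in (a build matrix) or scripted in the same way.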

To sum this one up, here's the GitHub profile of one of the founding developers of Concourse:

[Image: Screenshot From 2024-12-27 11-48-35]

@kallisti5
Contributor Author

kallisti5 commented Dec 27, 2024

Tekton:

Pros:

  • Container based builds
  • Runs in Kubernetes, with a CRD for build pipelines
  • Seems flexible enough for our complex use cases
  • Can test locally with minikube
  • Completely removes the worker abstraction from the equation. We just add nodes / node pools for builds
  • "It's just kubernetes", so easy to learn.
  • Easy deployment. Components are deployed in chunks with one k8s yaml for each.

Cons:

  • Young project. I feel like the market hasn't embraced it given the limited need for complex CI/CD systems.
    • The same could be said about buildbot though... these complex CI/CD systems in general see limited use since most people aren't building "an entire operating system"
  • Turn-key in Kubernetes... but we'll likely need to figure out a way to plug kubernetes nodes at various remote locations into a central control plane node... adds complexity.
    • Our use-case of remote workers for cost reasons might be a blocker.
  • Dashboard needs operator access to view builds on the local cluster, coupling visibility to the cluster running the builds.

@kallisti5
Contributor Author

Noteworthy: Argo. It's slightly more popular than Tekton, but doesn't really add anything we need AND is substantially more complex to implement. It also suffers the same remote worker issue.

@kallisti5
Contributor Author

kallisti5 commented Dec 27, 2024

CDS:

Pros:

  • Made by OVH for their internal usage.
  • Supports multiple workers including Kubernetes, docker, and local
  • Looks extremely flexible.
  • Nice command line tooling

Cons:

  • Complexity to host it looks high with multiple systems to sustain the environment. (receipts). 12 things make up the stack.. including ElasticSearch.
  • The learning curve looks a bit steep with lots of "cds-isms"
  • Since it's managed by OVH for their own internal usage, might be resistant to bug reports unimportant to them.
  • If OVH stopped using it... the project would likely suffer a rapid death.

@kallisti5
Contributor Author

Buildbot:

Pros:

  • Custom build pipelines, it's just python.
  • Supports complex designs and remote workers with a single python server
  • Keeps build stuff out of kubernetes

Cons:

  • They did a major overhaul of the Python API a few years ago, back when we used it (beyond Python 2 -> 3). Given the complexity of our pipelines, we would have needed to pretty much start over on thousands of lines of Python if we wanted to upgrade. This is why we originally moved away from it. I have some major trust issues with buildbot.
  • Project is active, but not active. I don't actually see it used in many places anymore (I think a lot of people got burned by the API changes and migrated away, tbh)

@kallisti5
Contributor Author

kallisti5 commented Dec 27, 2024

Jenkins: Because I'm sure it will come up. No.

Pros:

  • Plugins for everything. Enterprise Java shops love it.
  • Massive Enterprise userbase
  • It won't die, no matter how many people hate it.

Cons:

  • Build pipelines built in a web ui, export to massive XML documents.
  • Pipelines are poorly repeatable
  • I've seen a lot of Jenkins build systems historically, and I don't think I've ever seen a good design that doesn't involve piles of shell scripts hitting Jenkins SOAP APIs with XML.

Jenkins 2.0 doesn't change much.

@kallisti5
Contributor Author

Github Actions:

Pros:

  • Wide Industry acceptance, lots of people can wrangle it.
  • Build pipelines live in source code under .github/workflows/ (e.g. .github/workflows/pipeline.yaml)
  • Does whatever we want pretty much within the limitations of the github action system.

Cons:

  • Free workers are small and may not be big enough to build Haiku. Custom workers are a thing, but they cost more.
  • A lot of people in our community dislike GitHub.
  • GitHub isn't our source of git truth; it's only a mirror.
  • Pipelines will likely be a bit messy, with lots of duplicated code.

@waddlesplash
Member

12 things make up the stack.. including ElasticSearch.

Some of those we can skip or reuse; e.g. we already have a Postgres setup (1 container), I don't think we need Elasticsearch (presumably for log search; it's optional, I hope?) (another 2 containers), and we don't need local builders (another 2 containers.) Some of the others might be avoidable, too, but that's 5 containers of the 12 that we shouldn't need immediately.

@waddlesplash
Member

Another to look into: https://man.sr.ht/builds.sr.ht/

@kallisti5
Contributor Author

A small note: I've automated the build of some critical containers with GitHub Actions (general-worker so far).

https://github.com/haiku/infrastructure/tree/master/containers/general-worker
https://github.com/haiku/infrastructure/blob/master/.github/workflows/general-worker.yaml

Actions are nice for some things since I can build arm64 and amd64 containers. We could build our toolchain container there and offload that work for free (and focus on pipelines / CI/CD for actual Haiku builds only), but we would probably need to move the toolchain-container stuff into the buildtools repository.

@kallisti5
Contributor Author

kallisti5 commented Dec 28, 2024

https://review.haiku-os.org/c/buildtools/+/8727 opened to try out building the toolchain containers in github actions until we figure out the rest of our CI/CD

Of note, the code is pulled from review.haiku-os.org. The mirrored repo on GitHub is only used for firing off build triggers and the container code.

The fun part is we can build arm64 and amd64 toolchain-worker containers meaning we could build Haiku on arm64 or amd64 hosts. (assuming x86_gcc2 likes being compiled under arm64)
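For readers unfamiliar with multi-architecture container builds, here is a rough sketch of what building one image for both amd64 and arm64 can look like with `docker buildx`. This is an assumption about one possible approach, not a description of what the workflow linked above actually does; the image name `ghcr.io/example/toolchain-worker:latest` is hypothetical. The function only assembles the command line and does not run it.

```python
import shlex


def buildx_command(image, context=".",
                   platforms=("linux/amd64", "linux/arm64"), push=False):
    """Return the argv for a multi-platform `docker buildx build`."""
    cmd = [
        "docker", "buildx", "build",
        "--platform", ",".join(platforms),  # build every listed platform in one go
        "--tag", image,
    ]
    if push:
        cmd.append("--push")  # push the resulting multi-arch image to the registry
    cmd.append(context)       # build context directory goes last
    return cmd


if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    print(shlex.join(buildx_command("ghcr.io/example/toolchain-worker:latest",
                                    push=True)))
```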

[Image: toolchain-automation]
