
Clean up pipelines associated with old / merged branches #153

Closed
michaelsauter opened this issue Aug 16, 2021 · 12 comments · Fixed by #429
Labels: enhancement (New feature or request), good first issue (Good for newcomers)
@michaelsauter (Member)

No description provided.

@michaelsauter added this to the 0.2.0 milestone Aug 16, 2021
@michaelsauter added the "enhancement" and "good first issue" labels Aug 16, 2021
@michaelsauter (Member, Author)

I'm wondering what approach we should take here.

A few possible approaches:

I am not sure how complicated we want to make this for ODS Pipeline. The main goal of pipeline pruning is, at least in my eyes, twofold: to keep the UI from becoming too crowded, and to reduce the number of resources in the K8s cluster. With that in mind, keeping the last n pipelines could be a viable option until we need something better. But maybe there is a smart solution that isn't complicated?

Ideas @gerardcl @kuebler @henninggross @renedupont @stitakis?

@renedupont (Member)

How difficult do you think it would be to mirror the behaviour of the Jenkins webhook proxy? To me, the way the Jenkins webhook proxy handles it makes sense.

IMO just keeping the last n does not take into account how granularly a team creates branches. Team A might create 10 branches a day while Team B only creates 1 per week, so the preferred n for Team A and Team B might differ. I know that might be a very theoretical case, but I know you like nitpicking 😄.

@gerardcl (Member)

IMHO, for now I would just try to keep the list automatically cleaned up, and see whether specific actions per event (like in the Jenkins webhook proxy) are really required. In the end, the goal is for the list not to grow infinitely; basically one is interested in the latest runs and the currently active ones (and if required, recovering an old one is possible anyhow). I don't see the need to delete only in response to events (which, in my experience, might not even always work).

The tricky part, to me, is deciding on a nice default value (as per @renedupont's nitpicking scenario :) ).

@michaelsauter (Member, Author)

I have a new idea how to approach this. Let's see if you like it :)

The main idea is to keep n branches per environment. All branches mapped to one environment end up in one "bucket". All remaining branches (not mapped to an env) end up in one bucket as well. Then we keep n branches per bucket.

Additionally I would offer a second configuration parameter, a "removal delay". For example, this could be set to 48 hours, and that would have the effect that no branch is cleaned up before 48 hours pass.

Both knobs work well together, should be relatively simple to implement, and do not require listening for webhook events like the Jenkins webhook cleanup does. I think this also provides what one wants: access to recent pipeline runs, plus access to the last n pipeline runs that deployed into an env (e.g. the last 5 deployments into prod).

@renedupont it addresses your "nitpick" somewhat I believe: since n is not global, many dev branches do not affect pipeline runs that deploy into QA/PROD. It doesn't fully address the different amount of, say, feature branches. But maybe the "removal delay" is good enough for that problem?

Thoughts? :)

@renedupont (Member) commented Nov 30, 2021

Not sure if I understand. How is a branch mapped to an environment? Isn't it often like this:

prod env -> 1 branch (master)
test and/or qa env -> a couple release branches
dev -> dev branch and millions of feature branches

That way only the dev env "bucket" would mainly be of interest. Or do you mean something else when speaking of environment?

@michaelsauter (Member, Author)

@renedupont The cleanup would actually remove PipelineRun resources.

Example:
prod env -> 1 branch (master)
test and/or qa env -> release/*
dev -> *

Also, let's say we keep only 5 pipeline runs, and we have the following situation:

  • 6 pipeline runs deploying into prod
  • 9 pipeline runs across release/* branches, for example release/1.0.0 (3 pipeline runs), release/2.0.0 (2 pipeline runs) and release/3.0.0 (4 pipeline runs)
  • many pipeline runs for different branches all deploying to dev

When cleanup kicks in, we would remove:

  • the oldest pipeline run into prod (keeping 5 runs overall for PROD)
  • no pipeline run for release/3.0.0, 1 pipeline run for release/2.0.0, and all pipeline runs, including the pipeline itself, for release/1.0.0 (keeping 5 runs overall for QA)
  • and similar for dev: keep the 5 latest pipeline runs, and delete the rest. All pipelines which end up with 0 runs after this, get removed, keeping 5 runs overall for DEV.

Does that help?
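The branch-to-environment "bucket" assignment from the example could be sketched like this. This is a minimal Go sketch with a hypothetical `branchToEnv` helper and hard-coded patterns taken from the example above; in reality the mapping comes from the project configuration, so names and pattern handling here are illustrative assumptions only.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// branchToEnv assigns a branch to its environment bucket using the example
// mapping: master -> prod, release/* -> qa, everything else -> dev.
// Patterns are evaluated in order; the first match wins.
func branchToEnv(branch string) string {
	mappings := []struct{ pattern, env string }{
		{"master", "prod"},
		{"release/*", "qa"},
	}
	for _, m := range mappings {
		if ok, _ := filepath.Match(m.pattern, branch); ok {
			return m.env
		}
	}
	return "dev" // unmapped branches fall into the dev bucket
}

func main() {
	for _, b := range []string{"master", "release/1.0.0", "feature/foo"} {
		fmt.Printf("%s -> %s\n", b, branchToEnv(b))
	}
}
```

With keep-n applied per bucket, many short-lived feature branches only compete with each other in the dev bucket and never push out the runs that deployed to qa or prod.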


@ManuelFeller self-assigned this Jan 13, 2022
@gerardcl self-assigned this Jan 13, 2022
@michaelsauter (Member, Author)

One issue I realized with my proposal in the meantime is that it depends on the ods.yaml within a branch. This means that if you modify the branch mapping in one branch, you automatically trigger cleanup based on that change ... which may not be what we want.

@michaelsauter (Member, Author) commented Jan 25, 2022

I've had some more thoughts on this as it is in the critical path towards resolving #394.

I do not consider my proposal above ideal anymore, as cleanup would depend on state in a branch, which might lead to unwanted side effects, plus the logic wasn't as simple as I'd like it to be.

I have since realized that all pipeline runs target a certain stage, which is either dev, qa or prod. Even branches not mapped to an environment have a stage set (dev). We can use this to simplify the proposal, and just keep n pipeline runs per stage. For this to work, we need to apply a new label to each pipeline run which identifies the stage (e.g. org.opendevstack.pipeline/stage=qa). When cleanup is triggered, we simply remove the oldest pipeline runs per stage until we have only n pipeline runs left.

In addition, I would also keep the "removal delay" described above (so that all pipeline runs are there for at least 48 hours or so).

This strategy should lead to rather aggressive cleanup for the dev stage (which should contain pipeline runs for feature branches and integrated branches) and quite conservative cleanup for qa/prod, which is in line with what I'd like to achieve.

@ManuelFeller @gerardcl You've assigned this issue to yourselves. Could you review my proposal? Also, I am not sure if you intended to implement it. If so, please let me know if you need pointers to get started and when you would be able to get to it. Otherwise I would take this over, since I think it is best to solve this issue before working on #394, and that issue needs to be dealt with soon.

@gerardcl (Member)

Hi @michaelsauter, we have not met yet, so we have not checked the proposals so far.
I see your point and I agree with the latest proposal; so far it looks good.
Since I will focus on other topics in the short term, feel free to jump on it so we do not block this. Also, feel free to pull me in when you start developing this, I am happy to be an extra pair of eyes 👀

@ManuelFeller (Member) commented Jan 25, 2022

Hi @michaelsauter. I see it pretty similarly, if we do not want to make things too complex. The approach of "have n runs per environment but keep them at least 48 hours" sounds like a good and not too complex solution. 👍

@michaelsauter (Member, Author)

@gerardcl @ManuelFeller Thanks for the feedback! I am going to assign this to myself now ...

michaelsauter added a commit that referenced this issue Feb 2, 2022
michaelsauter added a commit that referenced this issue Feb 2, 2022
michaelsauter added a commit that referenced this issue Feb 8, 2022
michaelsauter added a commit that referenced this issue Feb 8, 2022
michaelsauter added a commit that referenced this issue Feb 8, 2022
michaelsauter added a commit that referenced this issue Feb 9, 2022
4 participants