Finalization of pods not run when CR is deleted #212

Open
aroemen opened this issue Mar 11, 2021 · 17 comments
Labels
bug Something isn't working

Comments

@aroemen

aroemen commented Mar 11, 2021

Running kubectl apply -f .\gh-runners-linux.yaml creates the runners as expected in my GitHub organization. When I delete them though (using kubectl delete -f .\gh-runners-linux.yaml), the pods that contained the runners get stuck in a "Terminating" status.

NAMESPACE                        NAME                                              READY   STATUS        RESTARTS   AGE
github-action-runners            runner-pool-pod-fhthp                             0/3     Terminating   0          4m50s
github-action-runners            runner-pool-pod-wfs62                             0/3     Terminating   0          4m50s
github-actions-runner-operator   github-actions-runner-operator-59b9d486b5-t2p62   1/1     Running       0          5m26s

If I edit the pod and remove the finalizer (garo.tietoevry.com/runner-registration), the pod is deleted as soon as I save that change. However, the runner is not removed from my list of GitHub self-hosted runners, as I would have expected. Am I missing something here?
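
To confirm that a pod is held back by a finalizer rather than by something else, it can help to list each pod's deletion timestamp and finalizers side by side. The following is a minimal sketch using kubectl's custom-columns output. The function name show_finalizers is hypothetical, and the namespace in the usage line is the one from the listing above.

```shell
# Hypothetical helper: print name, deletion timestamp and finalizers for
# every pod in a namespace. A pod stuck in Terminating shows a timestamp
# in DELETED and a non-empty FINALIZERS column.
show_finalizers() {
  namespace="$1"
  kubectl get pods -n "$namespace" \
    -o custom-columns='NAME:.metadata.name,DELETED:.metadata.deletionTimestamp,FINALIZERS:.metadata.finalizers'
}

# Usage (namespace taken from the listing above):
#   show_finalizers github-action-runners
```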

@davidkarlsen
Collaborator

Then there is a problem with unregistration. Please provide the logs from the operator so that I can help you.

@aroemen
Author

aroemen commented Mar 11, 2021

@davidkarlsen I don't see any mention of the delete in the operator log. The delete command was issued at 12:34:23 (apparently 18:34:23Z in the log timestamps), which is about the last time anything appears in the operator log:

2021-03-11T18:30:34.050Z	INFO	controller-runtime.metrics	metrics server is starting to listen	{"addr": ":8080"}
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	controller-runtime.injectors-warning	Injectors are deprecated, and will be removed in v0.10.x
2021-03-11T18:30:34.051Z	INFO	setup	starting manager
I0311 18:30:34.052860       1 leaderelection.go:243] attempting to acquire leader lease github-actions-runner-operator/4ef9cd91.tietoevry.com...
2021-03-11T18:30:34.052Z	INFO	controller-runtime.manager	starting metrics server	{"path": "/metrics"}
I0311 18:30:51.471375       1 leaderelection.go:253] successfully acquired lease github-actions-runner-operator/4ef9cd91.tietoevry.com
2021-03-11T18:30:51.471Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"ConfigMap","namespace":"github-actions-runner-operator","name":"4ef9cd91.tietoevry.com","uid":"830a98c7-1d79-4fd4-8b16-27048338c333","apiVersion":"v1","resourceVersion":"156761"}, "reason": "LeaderElection", "message": "github-actions-runner-operator-59b9d486b5-hbsrz_a1bc3d27-328e-490c-86e3-4e6033887fbf became leader"}
2021-03-11T18:30:51.472Z	INFO	controller-runtime.manager.controller.githubactionrunner	Starting EventSource	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2021-03-11T18:30:51.573Z	INFO	controller-runtime.manager.controller.githubactionrunner	Starting EventSource	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2021-03-11T18:30:51.674Z	INFO	controller-runtime.manager.controller.githubactionrunner	Starting EventSource	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "source": "kind source: /, Kind="}
2021-03-11T18:30:51.775Z	INFO	controller-runtime.manager.controller.githubactionrunner	Starting Controller	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner"}
2021-03-11T18:30:51.775Z	INFO	controller-runtime.manager.controller.githubactionrunner	Starting workers	{"reconciler group": "garo.tietoevry.com", "reconciler kind": "GithubActionRunner", "worker count": 1}
2021-03-11T18:30:51.775Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:30:52.172Z	INFO	controllers.GithubActionRunner	Scaling up	{"githubactionrunner": "github-action-runners/runner-pool", "numInstances": 2}
2021-03-11T18:30:52.182Z	INFO	controllers.GithubActionRunner	Creating a new Pod	{"githubactionrunner": "github-action-runners/runner-pool", "Pod.Namespace": "github-action-runners", "Pod.Name": "runner-pool-pod-4ts8j", "result": "created"}
2021-03-11T18:30:52.182Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"GithubActionRunner","namespace":"github-action-runners","name":"runner-pool","uid":"377cc688-b76c-4862-b268-3e306e2dc484","apiVersion":"garo.tietoevry.com/v1alpha1","resourceVersion":"156732"}, "reason": "Scaling", "message": "Created pod github-action-runners/runner-pool-pod-4ts8j"}
2021-03-11T18:30:52.186Z	INFO	controllers.GithubActionRunner	Creating a new Pod	{"githubactionrunner": "github-action-runners/runner-pool", "Pod.Namespace": "github-action-runners", "Pod.Name": "runner-pool-pod-779pp", "result": "created"}
2021-03-11T18:30:52.186Z	DEBUG	controller-runtime.manager.events	Normal	{"object": {"kind":"GithubActionRunner","namespace":"github-action-runners","name":"runner-pool","uid":"377cc688-b76c-4862-b268-3e306e2dc484","apiVersion":"garo.tietoevry.com/v1alpha1","resourceVersion":"156732"}, "reason": "Scaling", "message": "Created pod github-action-runners/runner-pool-pod-779pp"}
2021-03-11T18:30:52.256Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:30:52.401Z	INFO	controllers.GithubActionRunner	Pods and runner API not in sync, returning early	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:31:52.256Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:32:52.502Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:33:52.687Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:34:22.734Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:34:23.141Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}
2021-03-11T18:34:52.876Z	INFO	controllers.GithubActionRunner	Reconciling GithubActionRunner	{"githubactionrunner": "github-action-runners/runner-pool"}

@aroemen
Author

aroemen commented Mar 11, 2021

Sorry, I just noticed I put this on the wrong project. This should probably be on the github-actions-runner-operator project rather than here. Let me know if you want me to move it.

@davidkarlsen
Collaborator

davidkarlsen commented Mar 11, 2021

That's strange. What version of the operator are you running?
Can you provide the CR for the runner pool?

@aroemen
Author

aroemen commented Mar 11, 2021

I'm running the latest version from the Helm charts, 2.5.10. I'm just testing locally in my Kubernetes environment in Docker on Windows 10.

apiVersion: garo.tietoevry.com/v1alpha1
kind: GithubActionRunner
metadata:
  name: runner-pool
  namespace: github-action-runners
spec:
  minRunners: 2                # minimum running pods, required
  maxRunners: 6                # max number of pods, required
  reconciliationPeriod: 1m     # How often it will reconcile, optional, default 1m
  organization: MYORG  # the github org, required
  # repository: "theRepoName"  # if runner for repo, optional
  tokenRef:
    key: GH_TOKEN
    name: actions-runner
  podTemplateSpec:
    metadata:
      annotations:
        "prometheus.io/scrape": "true"
        "prometheus.io/port": "3903"
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchExpressions:
                    - key: garo.tietoevry.com/pool
                      operator: In
                      values:
                        - runner-pool
      containers:
        - name: runner
          env:
            - name: RUNNER_DEBUG
              value: "true"
            - name: DOCKER_TLS_CERTDIR
              value: /certs
            - name: DOCKER_HOST
              value: tcp://localhost:2376
            - name: DOCKER_TLS_VERIFY
              value: "1"
            - name: DOCKER_CERT_PATH
              value: /certs/client
            - name: ACTIONS_RUNNER_INPUT_LABELS
              value: linux,x64
            - name: ACTIONS_RUNNER_INPUT_RUNNERGROUP
              value: "Internal"
            - name: GH_ORG
              value: MYORG
            # if runner for repo:
            # - name: GH_REPO
            #   value: theRepoName
          envFrom:
            - secretRef:
                name: runner-pool-regtoken
          # find the fixed-in-time tags at https://quay.io/repository/evryfs/github-actions-runner?tab=tags if you want to avoid pulling on a moving tag
          # due to https://github.com/actions/runner/issues/246 the runner sw needs to be recent
          # you can subscribe to release-feeds at https://github.com/evryfs/github-actions-runner/releases.atom
          image: quay.io/evryfs/github-actions-runner:latest
          imagePullPolicy: Always
          resources: {}
          volumeMounts:
            - mountPath: /certs
              name: docker-certs
            - mountPath: /home/runner/_diag
              name: runner-diag
            - mountPath: /home/runner/_work
              name: runner-work
            # - mountPath: /home/runner/.m2
            #   name: mvn-repo
            # - mountPath: /home/runner/.m2/settings.xml
            #   name: settings-xml
        - name: docker
          env:
            - name: DOCKER_TLS_CERTDIR
              value: /certs
          image: docker:stable-dind
          imagePullPolicy: Always
          args:
            # See linked issues from: https://github.com/evryfs/github-actions-runner-operator/issues/39
            - --mtu=1430
          resources: {}
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /var/lib/docker
              name: docker-storage
            - mountPath: /certs
              name: docker-certs
            - mountPath: /home/runner/_work
              name: runner-work
        - name: exporter
          image: quay.io/evryfs/github-actions-runner-metrics:v0.0.3
          ports:
            - containerPort: 3903
              protocol: TCP
          volumeMounts:
            - name: runner-diag
              mountPath: /_diag
              readOnly: true
      volumes:
        - emptyDir: {}
          name: runner-work
        - emptyDir: {}
          name: runner-diag
        - emptyDir: {}
          name: mvn-repo
        - emptyDir: {}
          name: docker-storage
        - emptyDir: {}
          name: docker-certs
        # - configMap:
        #     defaultMode: 420
        #     name: settings-xml
        #   name: settings-xml

@davidkarlsen davidkarlsen transferred this issue from evryfs/github-actions-runner Mar 11, 2021
@davidkarlsen
Collaborator

davidkarlsen commented Mar 11, 2021

I was able to reproduce it. It's an edge case that occurs when you delete the CR itself. In that case the CR is already gone, so the cleanup step that handles the finalization (https://github.com/evryfs/github-actions-runner-operator/blob/master/controllers/githubactionrunner_controller.go#L116) is never reached.


@davidkarlsen davidkarlsen added the "bug" label Mar 11, 2021
@davidkarlsen davidkarlsen changed the title from "Deleting runners results in a stuck terminated state" to "Finalization of pods not run when CR is deleted" Mar 11, 2021
@aroemen
Author

aroemen commented Mar 11, 2021

What would be another way to tear down these resources then?

@duyhenryer

Hi there,
I have the same issue here

NAME                    READY   STATUS        RESTARTS   AGE
runner-pool-pod-7qhqc   0/3     Terminating   0          4d6h
runner-pool-pod-d96bw   0/3     Terminating   0          4h38m
runner-pool-pod-w278v   0/3     Terminating   0          4h38m
runner-pool-pod-xbmww   0/3     Terminating   0          4h47m

I can't remove them.
Thank you.

@gabriellemadden

@aroemen @duyhenryer I was able to delete them by removing the finalizers field. Patch the finalizers list to be null:

kubectl patch pod <POD_NAME> -n <NAMESPACE> -p '{"metadata":{"finalizers":null}}'
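
Setting the finalizers list to null removes every finalizer on the pod. A narrower variant, sketched below, removes only the finalizer named earlier in this thread (garo.tietoevry.com/runner-registration) with a JSON Patch. It assumes that finalizer is the first entry in the list; the "test" operation makes the patch fail if it is not. The function name is hypothetical. As noted at the top of the thread, removing the finalizer by hand skips deregistration, so the runner may still be listed in GitHub afterwards.

```shell
# Hypothetical helper: remove only the runner-registration finalizer from a pod.
# Assumes the finalizer is the first entry in metadata.finalizers; the JSON
# Patch "test" op aborts the whole patch if it is not.
remove_runner_finalizer() {
  pod="$1"
  namespace="$2"
  kubectl patch pod "$pod" -n "$namespace" --type=json \
    -p '[{"op":"test","path":"/metadata/finalizers/0","value":"garo.tietoevry.com/runner-registration"},{"op":"remove","path":"/metadata/finalizers/0"}]'
}

# Usage:
#   remove_runner_finalizer <POD_NAME> <NAMESPACE>
```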

@davidkarlsen
Collaborator

Yes, and that's what the operator does after deregistering them from GitHub, which is why I am curious what the operator logs show.

@aroemen
Author

aroemen commented Jun 14, 2021

@davidkarlsen I posted the operator logs back in March. Do you need additional data?

@davidkarlsen
Collaborator

@aroemen Sorry, I commented on the wrong issue. I was thinking of #232, which was fixed recently. This one (deleting the CR) still needs a fix.

@davidkarlsen
Collaborator

@aroemen #264 will solve this, as you can scale the pool to zero, then delete the CR.
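
The scale-to-zero-then-delete order can be sketched as follows. This is a sketch under several assumptions, not a documented procedure: it assumes an operator and CRD version that accept minRunners: 0, that lowering minRunners alone is enough for the operator to remove idle runners, and that the pods carry the garo.tietoevry.com/pool label used in the CR's anti-affinity selector earlier in the thread. The function name is hypothetical.

```shell
# Hypothetical helper: drain a runner pool before deleting its CR, so the
# operator can deregister each runner while the CR still exists.
drain_and_delete_pool() {
  pool="$1"
  namespace="$2"

  # Ask the operator to scale the pool down (assumes minRunners: 0 is accepted).
  kubectl patch githubactionrunners.garo.tietoevry.com "$pool" -n "$namespace" \
    --type=merge -p '{"spec":{"minRunners":0}}' || return 1

  # Poll until no pods of the pool remain; give up after 60 attempts.
  # The label selector is an assumption taken from the CR in this thread.
  tries=0
  while [ -n "$(kubectl get pods -n "$namespace" -l "garo.tietoevry.com/pool=$pool" -o name)" ]; do
    tries=$((tries + 1))
    [ "$tries" -ge 60 ] && return 1
    sleep 5
  done

  # Only now remove the CR itself.
  kubectl delete githubactionrunners.garo.tietoevry.com "$pool" -n "$namespace"
}

# Usage:
#   drain_and_delete_pool runner-pool <NAMESPACE>
```

Polling with kubectl get keeps the sketch independent of how a given kubectl version treats an empty selection; the function returns non-zero instead of deleting the CR if pods are still present after the last attempt.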

@zhsj

zhsj commented Jul 12, 2021

Maybe the CR should have a finalizer as well.

@tonywildey-valstro

I'm trying to make this work on the latest build but can't seem to get it working:
$ kubectl patch githubactionrunners.garo.tietoevry.com runner-pool --namespace actions-runner --patch '{"spec":{"minRunners":0}}' --type=merge
This results in:
The GithubActionRunner "runner-pool" is invalid: spec.minRunners: Invalid value: 0: spec.minRunners in body should be greater than or equal to 1

I suspect that either the operator image I'm pulling is not the latest, or I'm pulling it the wrong way. I'm installing the operator using the published Helm charts:
helm upgrade --install --wait github-actions-runner-operator evryfs-oss/github-actions-runner-operator --namespace actions-runner-operator --set githubapp.existingSecret=github-runner-app --set githubapp.enabled=true

The runner image is this one:
quay.io/evryfs/github-actions-runner:latest

What am I missing?

Thx
Tony

@davidkarlsen
Collaborator

davidkarlsen commented Dec 30, 2021

@tonywildey-valstro you probably don't have the latest CRD: https://raw.githubusercontent.com/evryfs/github-actions-runner-operator/v0.10.0/config/crd/bases/garo.tietoevry.com_githubactionrunners.yaml

@tonywildey-valstro

tonywildey-valstro commented Jan 3, 2022

Ah, there we go. I installed using the Helm chart (https://github.com/evryfs/helm-charts/blob/master/charts/github-actions-runner-operator/crds/garo.tietoevry.com_githubactionrunners.yaml), which does not have the minRunners change.

Thx
Tony
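
Helm 3 installs CRDs from a chart's crds/ directory only on first install and does not upgrade them afterwards, so a newer CRD generally has to be applied separately. A minimal sketch that applies the CRD from the operator repository at a given release tag; the URL pattern is the one from the link above, while the function name and the choice of tag are assumptions.

```shell
# Hypothetical helper: apply the GithubActionRunner CRD published in the
# operator repository at a given release tag.
apply_operator_crd() {
  tag="$1"
  kubectl apply -f "https://raw.githubusercontent.com/evryfs/github-actions-runner-operator/${tag}/config/crd/bases/garo.tietoevry.com_githubactionrunners.yaml"
}

# Usage (tag taken from the link above):
#   apply_operator_crd v0.10.0
```

After applying it, retrying the minRunners patch from earlier in the thread is a quick way to see whether the new validation is in effect.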

