This repository has been archived by the owner on Oct 22, 2024. It is now read-only.

deploy: reschedule PVCs on nodes with no PMEM-CSI driver
When a PVC gets assigned to a node on which PMEM-CSI is not running,
whether because scheduler extensions are not enabled or because of a
race while changing where the driver runs, the new "rescheduler"
(a stripped-down external provisioner) unsets the "selected node"
annotation so that the scheduler picks a node again.

For the user, this looks as follows:

Events:
  Type     Reason                 Age              From                                                                                     Message
  ----     ------                 ----             ----                                                                                     -------
  Normal   ExternalProvisioning   2s (x2 over 2s)  persistentvolume-controller                                                              waiting for a volume to be created, either by external provisioner "pmem-csi.intel.com" or manually created by system administrator
  Normal   Provisioning           2s               pmem-csi.intel.com_pmem-csi-intel-com-controller-0_9e16d74d-c645-4478-9f0d-50db58a962ce  External provisioner is provisioning volume for claim "latebinding-7887/pvc-g7p97"
  Warning  ProvisioningFailed     2s               pmem-csi.intel.com_pmem-csi-intel-com-controller-0_9e16d74d-c645-4478-9f0d-50db58a962ce  failed to provision volume with StorageClass "pmem-pmem-csi-sc-ext4-latebinding-7887": reschedule PVC latebinding-7887/pvc-g7p97 because it is assigned to node pmem-csi-pmem-govm-master which has no PMEM-CSI driver
  Normal   WaitForPodScheduled    2s               persistentvolume-controller                                                              waiting for pod pvc-volume-tester-writer-latebinding-4pkpv to be scheduled
  Normal   Provisioning           2s               pmem-csi.intel.com_pmem-csi-intel-com-node-4gblz_23b46c1d-b5c3-4cfe-a2f9-b22c77e666c6    External provisioner is provisioning volume for claim "latebinding-7887/pvc-g7p97"
  Normal   ProvisioningSucceeded  2s               pmem-csi.intel.com_pmem-csi-intel-com-node-4gblz_23b46c1d-b5c3-4cfe-a2f9-b22c77e666c6    Successfully provisioned volume pvc-66a4a486-5782-4cf2-8b13-f13f66e30e19

For the sake of simplicity, RBAC rules and resource allocation are
based on what they would have to be when running both webhook and
rescheduler.
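The rescheduling decision described in the commit message can be sketched as follows. This is a minimal illustration, not the code from this commit: the `claim` type, the `reschedule` function and the `driverNodes` set are hypothetical stand-ins, and the real rescheduler works on Kubernetes API objects. The annotation key is the one Kubernetes uses for late binding, and the message text mirrors the `ProvisioningFailed` event shown above.

```go
package main

import "fmt"

// annSelectedNode is the annotation the Kubernetes scheduler sets on a PVC
// with late binding to record which node it picked.
const annSelectedNode = "volume.kubernetes.io/selected-node"

// claim is a hypothetical, simplified stand-in for a PersistentVolumeClaim.
type claim struct {
	Namespace, Name string
	Annotations     map[string]string
}

// reschedule removes the selected-node annotation when the chosen node is
// not in driverNodes (assumed: the set of nodes running the driver).
// It reports whether the claim was changed and, if so, the reason.
func reschedule(c *claim, driverNodes map[string]bool) (bool, string) {
	node, ok := c.Annotations[annSelectedNode]
	if !ok || driverNodes[node] {
		// Nothing selected yet, or the node can provision the volume.
		return false, ""
	}
	delete(c.Annotations, annSelectedNode)
	return true, fmt.Sprintf(
		"reschedule PVC %s/%s because it is assigned to node %s which has no PMEM-CSI driver",
		c.Namespace, c.Name, node)
}

func main() {
	pvc := &claim{
		Namespace:   "latebinding-7887",
		Name:        "pvc-g7p97",
		Annotations: map[string]string{annSelectedNode: "pmem-csi-pmem-govm-master"},
	}
	// Hypothetical worker name; only this node runs the driver.
	driverNodes := map[string]bool{"worker-1": true}
	if changed, reason := reschedule(pvc, driverNodes); changed {
		fmt.Println(reason)
	}
}
```

Once the annotation is gone, the pod is scheduled again, which matches the `WaitForPodScheduled` event followed by successful provisioning on a node that runs the driver.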
pohly committed Jan 20, 2021
1 parent 7c4ee86 commit 2b24963
Showing 55 changed files with 1,572 additions and 209 deletions.
32 changes: 16 additions & 16 deletions deploy/bindata_generated.go

Large diffs are not rendered by default.

5 changes: 0 additions & 5 deletions deploy/common/pmem-storageclass-cache.yaml
@@ -1,10 +1,5 @@
 # Generated with "make kustomize", do not edit!

-allowedTopologies:
-- matchLabelExpressions:
-  - key: storage
-    values:
-    - pmem
 apiVersion: storage.k8s.io/v1
 kind: StorageClass
 metadata:
5 changes: 0 additions & 5 deletions deploy/common/pmem-storageclass-default.yaml
@@ -1,10 +1,5 @@
 # Generated with "make kustomize", do not edit!

-allowedTopologies:
-- matchLabelExpressions:
-  - key: storage
-    values:
-    - pmem
 apiVersion: storage.k8s.io/v1
 kind: StorageClass
 metadata:
5 changes: 0 additions & 5 deletions deploy/common/pmem-storageclass-ext4.yaml
@@ -1,10 +1,5 @@
 # Generated with "make kustomize", do not edit!

-allowedTopologies:
-- matchLabelExpressions:
-  - key: storage
-    values:
-    - pmem
 apiVersion: storage.k8s.io/v1
 kind: StorageClass
 metadata:
5 changes: 0 additions & 5 deletions deploy/common/pmem-storageclass-late-binding.yaml
@@ -1,10 +1,5 @@
 # Generated with "make kustomize", do not edit!

-allowedTopologies:
-- matchLabelExpressions:
-  - key: storage
-    values:
-    - pmem
 apiVersion: storage.k8s.io/v1
 kind: StorageClass
 metadata:
5 changes: 0 additions & 5 deletions deploy/common/pmem-storageclass-xfs.yaml
@@ -1,10 +1,5 @@
 # Generated with "make kustomize", do not edit!

-allowedTopologies:
-- matchLabelExpressions:
-  - key: storage
-    values:
-    - pmem
 apiVersion: storage.k8s.io/v1
 kind: StorageClass
 metadata:
26 changes: 25 additions & 1 deletion deploy/kubernetes-1.17/direct/pmem-csi.yaml
@@ -170,6 +170,15 @@ metadata:
pmem-csi.intel.com/deployment: direct-production
name: pmem-csi-intel-com-webhooks-runner
rules:
- apiGroups:
- ""
resources:
- persistentvolumes
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
@@ -178,10 +187,24 @@ rules:
- get
- list
- watch
- patch
- update
- apiGroups:
- ""
resources:
- events
verbs:
- get
- list
- watch
- patch
- update
- create
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- csinodes
verbs:
- get
- list
@@ -319,11 +342,12 @@ spec:
- -v=3
- -logging-format=text
- -mode=webhooks
- -schedulerListen=:8000
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -nodeSelector={"storage":"pmem"}
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -schedulerListen=:8000
- -metricsListen=:10010
env:
- name: TERMINATION_LOG_PATH
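
The new `-nodeSelector={"storage":"pmem"}` argument passes the node labels as a JSON object. How the driver consumes this value is not shown in this diff; the following is a minimal sketch, assuming the value is decoded into a label map and a node qualifies when it carries every listed label. The function names `parseSelector` and `matches` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseSelector decodes a JSON object such as {"storage":"pmem"} into a
// label map. Hypothetical helper, not taken from the PMEM-CSI sources.
func parseSelector(s string) (map[string]string, error) {
	var sel map[string]string
	if err := json.Unmarshal([]byte(s), &sel); err != nil {
		return nil, fmt.Errorf("parse node selector %q: %w", s, err)
	}
	return sel, nil
}

// matches reports whether labels contains every key/value pair of sel.
// An empty selector matches every node.
func matches(sel, labels map[string]string) bool {
	for key, want := range sel {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	sel, err := parseSelector(`{"storage":"pmem"}`)
	if err != nil {
		panic(err)
	}
	// Hypothetical node labels for illustration.
	worker := map[string]string{"storage": "pmem", "kubernetes.io/os": "linux"}
	master := map[string]string{"kubernetes.io/os": "linux"}
	fmt.Println(matches(sel, worker), matches(sel, master))
}
```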
26 changes: 25 additions & 1 deletion deploy/kubernetes-1.17/direct/testing/pmem-csi.yaml
@@ -170,6 +170,15 @@ metadata:
pmem-csi.intel.com/deployment: direct-testing
name: pmem-csi-intel-com-webhooks-runner
rules:
- apiGroups:
- ""
resources:
- persistentvolumes
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
@@ -178,10 +187,24 @@ rules:
- get
- list
- watch
- patch
- update
- apiGroups:
- ""
resources:
- events
verbs:
- get
- list
- watch
- patch
- update
- create
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- csinodes
verbs:
- get
- list
@@ -335,11 +358,12 @@ spec:
- -v=3
- -logging-format=text
- -mode=webhooks
- -schedulerListen=:8000
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -nodeSelector={"storage":"pmem"}
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -schedulerListen=:8000
- -metricsListen=:10010
- -v=5
- -coverprofile=/var/lib/pmem-csi-coverage/pmem-csi-driver-controller-*.out
26 changes: 25 additions & 1 deletion deploy/kubernetes-1.17/lvm/pmem-csi.yaml
@@ -170,6 +170,15 @@ metadata:
pmem-csi.intel.com/deployment: lvm-production
name: pmem-csi-intel-com-webhooks-runner
rules:
- apiGroups:
- ""
resources:
- persistentvolumes
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
@@ -178,10 +187,24 @@ rules:
- get
- list
- watch
- patch
- update
- apiGroups:
- ""
resources:
- events
verbs:
- get
- list
- watch
- patch
- update
- create
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- csinodes
verbs:
- get
- list
@@ -319,11 +342,12 @@ spec:
- -v=3
- -logging-format=text
- -mode=webhooks
- -schedulerListen=:8000
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -nodeSelector={"storage":"pmem"}
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -schedulerListen=:8000
- -metricsListen=:10010
env:
- name: TERMINATION_LOG_PATH
26 changes: 25 additions & 1 deletion deploy/kubernetes-1.17/lvm/testing/pmem-csi.yaml
@@ -170,6 +170,15 @@ metadata:
pmem-csi.intel.com/deployment: lvm-testing
name: pmem-csi-intel-com-webhooks-runner
rules:
- apiGroups:
- ""
resources:
- persistentvolumes
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
@@ -178,10 +187,24 @@ rules:
- get
- list
- watch
- patch
- update
- apiGroups:
- ""
resources:
- events
verbs:
- get
- list
- watch
- patch
- update
- create
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- csinodes
verbs:
- get
- list
@@ -335,11 +358,12 @@ spec:
- -v=3
- -logging-format=text
- -mode=webhooks
- -schedulerListen=:8000
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -nodeSelector={"storage":"pmem"}
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -schedulerListen=:8000
- -metricsListen=:10010
- -v=5
- -coverprofile=/var/lib/pmem-csi-coverage/pmem-csi-driver-controller-*.out
26 changes: 25 additions & 1 deletion deploy/kubernetes-1.17/pmem-csi-direct-testing.yaml
@@ -170,6 +170,15 @@ metadata:
pmem-csi.intel.com/deployment: direct-testing
name: pmem-csi-intel-com-webhooks-runner
rules:
- apiGroups:
- ""
resources:
- persistentvolumes
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
@@ -178,10 +187,24 @@ rules:
- get
- list
- watch
- patch
- update
- apiGroups:
- ""
resources:
- events
verbs:
- get
- list
- watch
- patch
- update
- create
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- csinodes
verbs:
- get
- list
@@ -335,11 +358,12 @@ spec:
- -v=3
- -logging-format=text
- -mode=webhooks
- -schedulerListen=:8000
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -nodeSelector={"storage":"pmem"}
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -schedulerListen=:8000
- -metricsListen=:10010
- -v=5
- -coverprofile=/var/lib/pmem-csi-coverage/pmem-csi-driver-controller-*.out
26 changes: 25 additions & 1 deletion deploy/kubernetes-1.17/pmem-csi-direct.yaml
@@ -170,6 +170,15 @@ metadata:
pmem-csi.intel.com/deployment: direct-production
name: pmem-csi-intel-com-webhooks-runner
rules:
- apiGroups:
- ""
resources:
- persistentvolumes
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
@@ -178,10 +187,24 @@ rules:
- get
- list
- watch
- patch
- update
- apiGroups:
- ""
resources:
- events
verbs:
- get
- list
- watch
- patch
- update
- create
- apiGroups:
- storage.k8s.io
resources:
- storageclasses
- csinodes
verbs:
- get
- list
@@ -319,11 +342,12 @@ spec:
- -v=3
- -logging-format=text
- -mode=webhooks
- -schedulerListen=:8000
- -drivername=$(PMEM_CSI_DRIVER_NAME)
- -nodeSelector={"storage":"pmem"}
- -caFile=/certs/ca.crt
- -certFile=/certs/tls.crt
- -keyFile=/certs/tls.key
- -schedulerListen=:8000
- -metricsListen=:10010
env:
- name: TERMINATION_LOG_PATH