Before proceeding, the Descheduler Operator must be installed.
WARNING
Do not run this on a live cluster. Use a cluster dedicated to these tests, because the Descheduler evicts running pods every minute once they are older than 5 minutes.
WARNING
RemoveDuplicates: This strategy ensures that only one pod associated with a given ReplicaSet (RS), ReplicationController (RC), StatefulSet, or Job runs on the same node. If there are more, the duplicate pods are evicted so that pods spread better across the cluster. Duplicates can appear when some nodes go down for any reason and their pods are moved to other nodes, leaving more than one pod of the same RS or RC, for example, on a single node. Once the failed nodes are ready again, this strategy can be enabled to evict those duplicate pods.
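The core idea of the strategy can be illustrated with a small shell sketch. This is not the Descheduler's implementation: the helper name `duplicate_candidates` and the three-column input format (node, owner, pod) are assumptions made for illustration, and the real strategy also takes the number of feasible nodes into account before evicting anything.

```shell
# Hypothetical helper: reads "NODE OWNER POD" rows on stdin and prints every
# pod after the first one for each (node, owner) pair.
# Simplification: the real strategy also weighs the number of feasible nodes.
duplicate_candidates() {
  sort | awk '{ key = $1 "/" $2; if (seen[key]++) print $3 }'
}

# Example with the two pods from this walkthrough, both on the same worker:
printf '%s\n' \
  'worker-0.rdr-rhop.sslip.io ReplicaSet/ua ua-54pmt' \
  'worker-0.rdr-rhop.sslip.io ReplicaSet/ua ua-8mfx8' | duplicate_candidates
```

With both pods of the same ReplicaSet on one node, the sketch flags the second pod as the eviction candidate.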
- Update the SoftTopologyAndDuplicates Policy
$ oc apply -n openshift-kube-descheduler-operator -f files/4_SoftTopologyAndDuplicates_RemoveDuplicates.yml
kubedescheduler.operator.openshift.io/cluster created
- Check the configmap to see the Descheduler Policy.
$ oc -n openshift-kube-descheduler-operator get cm cluster -o=yaml
This ConfigMap should show the excluded namespaces, that strategies.RemoveDuplicates is configured, and includeSoftConstraints: false.
- Check the descheduler logs
$ oc -n openshift-kube-descheduler-operator logs -l app=descheduler
This log should show a started Descheduler.
- Create a test namespace
$ oc get namespace test || oc create namespace test
namespace/test created
- Cordon one of the workers so we unbalance the number of assigned pods.
a. List the nodes
$ oc get nodes
b. Select a worker node, such as worker-1.rdr-rhop.sslip.io
c. Cordon the node
$ oc adm cordon worker-1.rdr-rhop.sslip.io
node/worker-1.rdr-rhop.sslip.io cordoned
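Before creating the workload, it can help to confirm that the cordon took effect. The sketch below assumes the default `oc get nodes` table, where a cordoned node shows `SchedulingDisabled` in its STATUS column; the helper name `cordoned_nodes` is hypothetical.

```shell
# Hypothetical helper: reads `oc get nodes` output on stdin and prints the
# names of nodes whose STATUS column contains SchedulingDisabled.
cordoned_nodes() {
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}

# Usage against the cluster:
# oc get nodes | cordoned_nodes
```

If the cordon succeeded, the selected worker should be the only name printed.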
- Create a ReplicaSet
$ oc -n test apply -f files/4_SoftTopologyAndDuplicates_rs.yml
replicaset.apps/ua created
- Check that the pods are all on the other node
$ oc -n test get pods -o=custom-columns='Name:metadata.name,NodeName:spec.nodeName' -lapp=ua
Name NodeName
ua-54pmt worker-0.rdr-rhop.sslip.io
ua-8mfx8 worker-0.rdr-rhop.sslip.io
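To make the imbalance explicit, the pod listing can be reduced to a count per node. This is a minimal sketch that assumes the two-column custom-columns output shown above; the helper name `pods_per_node` is hypothetical.

```shell
# Hypothetical helper: reads the two-column "Name NodeName" table on stdin,
# skips the header row, and prints "<node> <pod count>" for each node.
pods_per_node() {
  awk 'NR > 1 && NF >= 2 { count[$2]++ } END { for (n in count) print n, count[n] }' | sort
}

# Usage against the cluster (same oc command as in the step above):
# oc -n test get pods -o=custom-columns='Name:metadata.name,NodeName:spec.nodeName' -lapp=ua | pods_per_node
```

While the other worker is cordoned, a single node should carry the full count. The same helper can be run again after the eviction to confirm the spread.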
- Uncordon the worker.
$ oc adm uncordon worker-1.rdr-rhop.sslip.io
node/worker-1.rdr-rhop.sslip.io uncordoned
- Check the logs until you see the duplicates being processed
$ oc -n openshift-kube-descheduler-operator logs -l app=descheduler --tail=200
I0512 19:35:45.106891 1 duplicates.go:199] "Adjusting feasible nodes" owner={namespace:test kind:ReplicaSet name:unbalanced-6d757874c4 imagesHash:docker.io/ibmcom/pause-ppc64le:3.1} from=5 to=2
I0512 19:35:45.106915 1 duplicates.go:207] "Average occurrence per node" node="worker-1.rdr-rhop.sslip.io" ownerKey={namespace:test kind:ReplicaSet name:ua-8mfx8 imagesHash:docker.io/ibmcom/pause-ppc64le:3.1} avg=1
I0512 19:35:45.126413 1 evictions.go:160] "Evicted pod" pod="test/ua-8mfx8" reason="RemoveDuplicatePods"
I0512 19:35:45.126547 1 descheduler.go:287] "Number of evicted pods" totalEvicted=1
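Rather than scanning the log by eye, the eviction lines can be filtered out. The sketch below assumes the log line format shown above, with `pod="..."` followed directly by `reason="RemoveDuplicatePods"`; the helper name `evicted_duplicates` is hypothetical.

```shell
# Hypothetical helper: reads descheduler log lines on stdin and prints the
# namespace/name of each pod evicted with reason RemoveDuplicatePods.
evicted_duplicates() {
  sed -n 's/.*"Evicted pod" pod="\([^"]*\)" reason="RemoveDuplicatePods".*/\1/p'
}

# Usage against the cluster (same oc command as in the step above):
# oc -n openshift-kube-descheduler-operator logs -l app=descheduler --tail=200 | evicted_duplicates
```

An empty result means no duplicate has been evicted yet, so the check can be repeated after the next descheduling cycle.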
- Check that the pods are now redistributed.
$ oc -n test get pods -o=custom-columns='Name:metadata.name,NodeName:spec.nodeName' -lapp=ua
Name NodeName
ua-54pmt worker-1.rdr-rhop.sslip.io
ua-8mfx8 worker-0.rdr-rhop.sslip.io
- Delete the ReplicaSet
$ oc -n test delete replicaset.apps/ua
replicaset.apps "ua" deleted
This profile demonstrates SoftTopologyAndDuplicates and how it causes descheduling and scheduling. It is essentially a duplicate of TopologyAndDuplicates.