Custom self-hosted cluster running on Ubuntu 22.04.2 LTS nodes
Describe the bug
After I install vSphere CPI (102.0.0+up1.4.2) on a new cluster, rebooting any worker or control-plane node causes that node to be deleted from the cluster, and it is then unable to rejoin.
Corresponding rke2-server log:
May 03 15:03:54 testserver01 rke2[857]: time="2023-05-03T15:03:54Z" level=error msg="error syncing 'testagent03': handler node: Operation cannot be fulfilled on nodes \"testagent03\": StorageError: invalid object, Code: 4, Key: /registry/minions/testagent03, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: efd660aa-50e2-4f9a-8483-64f43cd2204e, UID in object meta: , requeuing"
After uninstalling vSphere CPI (102.0.0+up1.4.2), the node rejoins without issues.
Log:
May 03 15:07:28 testserver02 rke2[857]: time="2023-05-03T15:07:28Z" level=info msg="certificate CN=testagent03 signed by CN=rke2-server-ca@1683122526: notBefore=2023-05-03 14:02:06 +0000 UTC notAfter=2024-05-02 15:07:28 +0000 UTC"
May 03 15:07:28 testserver02 rke2[857]: time="2023-05-03T15:07:28Z" level=info msg="certificate CN=system:node:testagent03,O=system:nodes signed by CN=rke2-client-ca@1683122526: notBefore=2023-05-03 14:02:06 +0000 UTC notAfter=2024-05-02 15:07:28 +0000 UTC"
May 03 15:07:31 testserver02 rke2[857]: time="2023-05-03T15:07:31Z" level=info msg="Handling backend connection request [testagent03]"
To Reproduce
Provision new rke2 cluster
# kubectl get nodes
NAME           STATUS   ROLES                       AGE   VERSION
testagent01    Ready    <none>                      72m   v1.25.9+rke2r1
testagent02    Ready    <none>                      72m   v1.25.9+rke2r1
testagent03    Ready    <none>                      10m   v1.25.9+rke2r1
testserver01   Ready    control-plane,etcd,master   74m   v1.25.9+rke2r1
testserver02   Ready    control-plane,etcd,master   72m   v1.25.9+rke2r1
testserver03   Ready    control-plane,etcd,master   73m   v1.25.9+rke2r1
Install vSphere CPI (102.0.0+up1.4.2) via the Rancher UI with the "Define vSphere Tags" option enabled. See settings in screenshot.
Reboot any cluster node.
Check rke2-server logs.
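For anyone reproducing the log check in the last step, here is a minimal sketch of how the rke2-server log lines could be filtered for the error shown above. The function name `find_uid_mismatches` and the regular expression are my own illustration, not part of rke2 or the vSphere CPI. The sketch assumes the error message keeps the exact wording quoted in this report.

```python
import re

# Matches the "Precondition failed" node sync error quoted in this report.
# The pattern is an assumption based on that single log line.
PATTERN = re.compile(
    r"error syncing '(?P<node>[^']+)'.*?"
    r"UID in precondition: (?P<expected>[0-9a-f-]+), "
    r"UID in object meta: (?P<actual>[0-9a-f-]*)"
)


def find_uid_mismatches(lines):
    """Return (node, expected_uid, actual_uid) for each matching log line."""
    hits = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append(
                (match.group("node"), match.group("expected"), match.group("actual"))
            )
    return hits


# The error message from the rke2-server log in this report.
sample = (
    "level=error msg=\"error syncing 'testagent03': handler node: "
    "Operation cannot be fulfilled on nodes \\\"testagent03\\\": "
    "StorageError: invalid object, Code: 4, "
    "Key: /registry/minions/testagent03, ResourceVersion: 0, "
    "AdditionalErrorMsg: Precondition failed: "
    "UID in precondition: efd660aa-50e2-4f9a-8483-64f43cd2204e, "
    "UID in object meta: , requeuing\""
)

for node, expected, actual in find_uid_mismatches([sample]):
    # An empty "actual" UID means the stored object carried no UID at sync time.
    print(f"{node}: expected UID {expected}, found {actual or '<empty>'}")
```

In practice the lines would come from the journal on a server node rather than a hard-coded string, for example by reading the output of `journalctl -u rke2-server --no-pager` (assuming the default systemd unit name). An empty UID in the object metadata suggests the Node object was already gone when the handler tried to update it, which would be consistent with the node having been deleted.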
Result
Expected Result
Nodes should not be removed when rebooting.
Screenshots
Additional context
All nodes are provisioned with the vSphere parameter "disk.enableUUID=TRUE" set before installing rke2.