[CPM] Restoration of cluster fails if its Infrastructure resource on the source Seed was annotated with migration.azure.provider.extensions.gardener.cloud/zone #827
How to categorize this issue?
/area control-plane-migration
/kind bug
/platform azure
What happened:
During control plane migration of an HA shoot cluster (using zones `z1`, `z2`, and `z3`) whose infrastructure resource is annotated with `migration.azure.provider.extensions.gardener.cloud/zone`, the infrastructure resource is not successfully restored and fails with the following error:

Basically, during the `restore` phase of control plane migration for the infrastructure resource, the `provider-azure` extension tried to delete the `<vnet-name>-nodes` subnet and create `<vnet-name>-nodes-z3`. This seems to have happened because the infrastructure resource in the destination seed did not have a `migration.azure.provider.extensions.gardener.cloud/zone: "3"` annotation.

How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
The `migration.azure.provider.extensions.gardener.cloud/zone` annotation is put on the infrastructure resource via a mutating webhook here:
gardener-extension-provider-azure/pkg/webhook/infrastructure/layout.go
Lines 132 to 141 in b859d7b
In this case, this mutating code did not get executed because of the following:
1. The `.status.providerStatus` field is saved in the `.status.state.providerStatus`.
2. During the `migrate` phase of CPM, `gardenlet` takes this `.status.state.savedProviderStatus` and saves it in the `ShootState`.
3. During the `restore` phase of CPM, `gardenlet` creates an infrastructure resource in the destination seed, then copies the `.status.state.savedProviderStatus` from the `ShootState` and adds it to the infrastructure's `.status.state.savedProviderStatus` field.
4. `gardenlet` annotates the infrastructure resource with `gardener.cloud/operation: restore` to trigger restoration.

During the updates to the infrastructure resource in steps 3 and 4, the mutating webhook does not make any changes because it exits early due to these checks:
gardener-extension-provider-azure/pkg/webhook/infrastructure/layout.go
Lines 117 to 130 in b859d7b
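The sequence described above can be sketched as a small simulation. This is only an illustration of the reported behavior, not the actual `gardenlet` or `provider-azure` code: the `infrastructure` and `shootState` types, their fields, and the `migrate` and `restore` functions are hypothetical stand-ins. The mutating webhook is left out because, per this report, it makes no changes during the restore updates.

```go
package main

import "fmt"

const (
	zoneAnnotation      = "migration.azure.provider.extensions.gardener.cloud/zone"
	operationAnnotation = "gardener.cloud/operation"
)

// infrastructure is a hypothetical, heavily simplified stand-in for the
// Infrastructure extension resource. It is not the real API type.
type infrastructure struct {
	Annotations    map[string]string
	ProviderStatus string // stand-in for the provider status
	SavedState     string // stand-in for the state saved under .status.state
}

// shootState is a hypothetical stand-in for the ShootState resource.
type shootState struct {
	InfrastructureState string
}

// migrate models step 2: only the saved state is persisted in the ShootState.
// Annotations on the source resource are not part of what is persisted.
func migrate(src infrastructure) shootState {
	return shootState{InfrastructureState: src.SavedState}
}

// restore models steps 3 and 4: a fresh resource is created in the
// destination seed, the saved state is copied into it, and the restore
// operation annotation is set.
func restore(ss shootState) infrastructure {
	dst := infrastructure{Annotations: map[string]string{}}
	dst.SavedState = ss.InfrastructureState
	dst.Annotations[operationAnnotation] = "restore"
	return dst
}

func main() {
	src := infrastructure{
		Annotations:    map[string]string{zoneAnnotation: "3"},
		ProviderStatus: "provider-status-from-source-seed",
	}
	// Step 1: the provider status is saved as part of the resource state.
	src.SavedState = src.ProviderStatus

	dst := restore(migrate(src))

	_, hasZone := dst.Annotations[zoneAnnotation]
	fmt.Println("state carried over:", dst.SavedState == src.SavedState)
	fmt.Println("zone annotation on destination:", hasZone)
}
```

In this sketch the state survives the round trip while the zone annotation does not, which matches the observation that the infrastructure resource in the destination seed lacked the annotation.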
Even if the `status.providerState` is patched with the one from the `status.state.providerState`, the mutating webhook would still not perform any changes because the `status.providerState` would contain the following:

Hence `nil` is returned here:
gardener-extension-provider-azure/pkg/webhook/infrastructure/layout.go
Lines 128 to 130 in b859d7b
What you expected to happen:
Cluster to be restored successfully.
Environment: