-
Notifications
You must be signed in to change notification settings - Fork 236
WIP certrotation: don't let multiple controllers rewrite metadata of the shared resource #1693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP certrotation: don't let multiple controllers rewrite metadata of the shared resource #1693
Conversation
/hold Testing this in openshift/cluster-kube-apiserver-operator#1655 |
/hold cancel Seems to work as expected |
@@ -64,10 +64,11 @@ func (c CABundleConfigMap) EnsureConfigMapCABundle(ctx context.Context, signingC | |||
} | |||
needsMetadataUpdate = c.AdditionalAnnotations.EnsureTLSMetadataUpdate(&caBundleConfigMap.ObjectMeta) || needsMetadataUpdate | |||
if needsMetadataUpdate && len(caBundleConfigMap.ResourceVersion) > 0 { | |||
_, _, err := resourceapply.ApplyConfigMap(ctx, c.Client, c.EventRecorder, caBundleConfigMap) | |||
actualCABundleConfigMap, _, err := resourceapply.ApplyConfigMap(ctx, c.Client, c.EventRecorder, caBundleConfigMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mind adding a unit test ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, not sure if it really ensures hotlooping is not happening (imo fuzzing would confirm it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR doesn't solve the hotloop issue. The hotloop issue was resolved by openshift/cluster-kube-apiserver-operator#1661.
There were 4
controllers fighting over kube-control-plane-signer-ca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this by allowing metadata change to the first controller only (whoever has set the first ownership ref).
I'll run this a few time to make sure this won't make TLS metadata jitter from run to run
In the future we could refactor the controller to only issue a single update when meta and content require updating. |
7b0793e
to
6df5e95
Compare
@vrutkovs a wrong update ? I don't see a unit test. |
6df5e95
to
9f8acda
Compare
This prevents http 409 errors when both labels and content need updating in one sync
9f8acda
to
079ddac
Compare
For the record, the above failure was due to openshift/cluster-kube-apiserver-operator#1661 not due to the issue addressed by this PR. |
if !actions[2].Matches("get", "configmaps") { | ||
t.Error(actions[2]) | ||
} | ||
if !actions[3].Matches("update", "configmaps") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the second call is unnecessary, we should refactor the controller to issue only a single API call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in addition it is worth to also examine the payload of the request (updated configmap), please add it to your todo list.
/lgtm |
/assign @stlaz for approval |
New changes are detected. LGTM label has been removed. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: p0lyn0mial, vrutkovs The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
caBundleConfigMap
after labels have been appliedEnsure that multiple controllers managing single resource don't overwrite each other metadata changes. This is archived by checking ownership references - if current controller is set as first owner other metadata changes are being ignored. This ensures that multiple cert rotation controllers can handle single CA bundle configmap - but only one of them will be updating configmap metadata - whichever is started first
db12f27
to
21aeec1
Compare
…y metadata In this case cert-rotation won't convert secret type as `ensureSecretTLSTypeSet` would never run
@vrutkovs: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
needsMetadataUpdate := false | ||
if c.Owner != nil { | ||
needsMetadataUpdate = ensureOwnerReference(&caBundleConfigMap.ObjectMeta, c.Owner) | ||
if caBundleConfigMap.ObjectMeta.OwnerReferences[0] == *c.Owner { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we just have a single owner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CA bundle can contain multiple signers, controlled by different controllers. We could have a single ownerRef, but that wouldn't reflect reality. It also makes easier to find such CA bundles in the must-gathers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CA bundle can contain multiple signers, controlled by different controllers
This is wrong and should be fixed. Why can't we have a single signer ?
Could you point me to code that does this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two controllers:
InternalLoadBalancerServing
ExternalLoadBalancerServing
are managing single signer -loadbalancer-serving-signer
and CA bundleloadbalancer-serving-ca
(in order to produce two target certs). As a result, both are added as owners to the signer secret and configmap.
I'm planning to rewrite this as "NewCertRotationController should accept an array of RotatedSelfSignedCertKeySecret
s" later to avoid it, but in general it may happen accidentally and we'd need a way to prevent that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, there is a single signer and a CA bundle.
Also as far as I can tell kas-o
doesn't make use of the Owner
field for both signer and CA resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Owner denotes a controller, not the signer or CA. So yes, these are derived from single signer - but controlled by different controllers.
Do you think a better option would be implementing a controller which controls multiple target certs instead?
UPD: created #1722 to try this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we have to have multiple controllers controlling a single resource. I would prefer to have a single controller for that.
It seems to me that we would like to mark which components are using the resource, which could be done with a single controller.
Running one controller for a resource would fix this and few more problems, closing in favor of openshift/cluster-kube-apiserver-operator#1669 |
This prevents http 409 errors when both labels and content need updating in one sync. See this test failure when new library-go is being pulled in cluster-kube-apiserver-operator