Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorder API calls after successful detach #624

Merged
merged 2 commits into from
Feb 6, 2025

Conversation

jsafrane
Copy link
Contributor

@jsafrane jsafrane commented Jan 31, 2025

What type of PR is this?
/kind bug

What this PR does / why we need it:
Remove a finalizer on a VolumeAttachment (VA) first and then mark its state as detached (if the object still exists).

This ensures that ControllerUnpublish is called only once. It won't be called on VA without the finalizer, even if it has attached: true.

The other way around, i.e. marking the VA as detached first and removing the finalizer later, it can happen that the attacher calls ControllerUnpublish twice. A VA with the finalizer, deletionTimestamp and attached: false is considered as "uncertainly attached" - a previous ControllerPublish might have timed out and the attacher needs to confirm the volume is detached with ControllerUnpublish.

From outer perspective:

  • Before this PR, a detaching VolumeAttachment gets attached: false first and then the finalizer is removed. (And ControllerUnpublish may be called twice).
  • With this PR, there is a high chance that the finalizer is removed and the attachment is deleted in the API server before it's marked as attached: false. A deleted VolumeAttachment should be considered as detached. (And ControllerUnpublish is called once).

Which issue(s) this PR fixes:

Fixes #587

Does this PR introduce a user-facing change?:

Fixed repeated call to ControllerUnpublishVolume by reordering its finalizer removal and marking VolumeAttachment as detached. The VolumeAttachment can now be deleted in the API server before it's marked as `attached: false`.

Remove a finalizer on a VolumeAttachment (VA) first and then mark its state as
detached (if the object still exists).

This ensures that ControllerUnpublish is called only once. It won't be
called on VA without the finalizer, even if it has `attached: true`.

The other way around, i.e. marking the VA as detached first and removing
the finalizer later, it can happen that the attacher calls
ControllerUnpublish twice. A VA with the finalizer, deletionTimestamp and
`attacher: false` is considered as "uncertainly attached" - a previous
ControllerPublish might have timed out and the attacher needs to confirm
the volume is detached with ControllerUnpublish.
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 31, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 31, 2025
@jsafrane jsafrane changed the title Rework order of APi calls after successful detach Rework order of API calls after successful detach Jan 31, 2025
@jsafrane jsafrane changed the title Rework order of API calls after successful detach Reorder API calls after successful detach Jan 31, 2025
@gnufied
Copy link
Contributor

gnufied commented Feb 5, 2025

/assign

Copy link
Contributor

@AndrewSirenko AndrewSirenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

markAsDetached changes look good to me and indeed remove all duplicate ControllerUnpublish calls.

Validated pr by running 100 replica sts scale tests comparing attacher built from this commit vs v4.8.0. (On Karpenter auto-scalar managed nodes)

Before PR, 17 extra ControllerPublish calls are made. After PR no extra calls are made. See full metrics and attacher logs in this gist.

Thanks!

PS: In Feb 5 SIG Triage I mentioned Karpenter may rely on VA resource. On further look Karpenter only relies on existence of VA resource to stall node termination, not va.attached, so this PR is safe for Karpenter.

@gnufied
Copy link
Contributor

gnufied commented Feb 6, 2025

Non-existence of VA objects and attached=false, both should signal the fact that volume is detached and hence we should be fine I think.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 6, 2025
@k8s-ci-robot k8s-ci-robot merged commit 8073914 into kubernetes-csi:master Feb 6, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate ControllerUnpublish calls
4 participants