Don't track attachments of NFS volumes #937

Open
wants to merge 1 commit into master

Conversation

timebertt
Member

/kind bug

What this PR does / why we need it:

This PR filters out NFS volumes when draining nodes, so that MCM doesn't try to track their detachment and re-attachment.
There is no such thing as an attachment to a node for NFS volumes. NFS volumes only need to be mounted (by kubelet).

Accordingly, MCM always runs into the configured PV reattach timeout if there are pods with NFS-backed PVCs on the node to be drained.
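
For illustration, a minimal sketch of the filtering described above, assuming a hypothetical helper (the package and function names are placeholders, not the actual PR diff): NFS-backed PVs are dropped from the set of volumes whose detachment is tracked during drain.

```go
package drain

import (
	corev1 "k8s.io/api/core/v1"
)

// filterAttachmentTrackedPVs returns only the PVs whose detachment should be
// waited for during drain. NFS volumes are excluded: kubelet only mounts them,
// so no VolumeAttachment object ever exists for them.
func filterAttachmentTrackedPVs(pvs []*corev1.PersistentVolume) []*corev1.PersistentVolume {
	tracked := make([]*corev1.PersistentVolume, 0, len(pvs))
	for _, pv := range pvs {
		// pv.Spec.NFS is set only for in-tree NFS volume sources.
		if pv.Spec.NFS != nil {
			continue
		}
		tracked = append(tracked, pv)
	}
	return tracked
}
```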

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

cc @xoxys

Release note:

A bug has been fixed for draining nodes with mounted NFS volumes. With this fix, machine-controller-manager no longer tries to track their (non-existent) VolumeAttachments.

@timebertt requested a review from a team as a code owner on August 23, 2024, 10:31
@gardener-robot

@timebertt Thank you for your contribution.

@gardener-robot-ci-3
Contributor

Thank you @timebertt for your contribution. Before I can start building your PR, a member of the organization must set the required label(s) {'reviewed/ok-to-test'}. Once started, you can check the build status in the PR checks section below.

@gardener-robot added the size/xs label (Size of pull request is tiny; see gardener-robot robot/bots/size.py) on Aug 23, 2024
@timebertt
Member Author

We probably need to add tests for this PR. However, we wanted to open it to get some feedback on whether this is the correct approach/place to solve the issue.

@kon-angelo
Contributor

@timebertt do you think we should generalise this to also skip the check in case the CSIDriver is configured with attachRequired == false? (Not necessarily in this PR.)

@timebertt
Member Author

@timebertt do you think we should generalise this to also skip the check in case the CSIDriver is configured with attachRequired == false? (Not necessarily in this PR.)

Sounds good. I had the impression that the attachment tracking doesn't work at all if there are no VolumeAttachment objects, i.e., when no CSI driver is used; the operation then just runs into the configured timeout. Is this correct?
If so, we should probably change this PR from "skip tracking NFS volumes" to "skip tracking non-CSI volumes".

Another PR could then enhance the check for CSI drivers where no VolumeAttachments are used.
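
For reference, the "skip tracking non-CSI volumes" variant discussed above could hinge on a check roughly like the following sketch (the helper name is hypothetical, not taken from the MCM code):

```go
package drain

import (
	corev1 "k8s.io/api/core/v1"
)

// isCSIVolume reports whether the PV is backed by a CSI driver. Only CSI
// volumes get VolumeAttachment objects, so only these would be tracked.
func isCSIVolume(pv *corev1.PersistentVolume) bool {
	// pv.Spec.CSI is set only for CSI-provisioned/handled volumes.
	return pv.Spec.CSI != nil
}
```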

@timebertt
Member Author

@kon-angelo @gardener/mcm-maintainers can you provide feedback on the PR and the above discussion?

@unmarshall
Contributor

unmarshall commented Sep 3, 2024

@timebertt do you think we should generalise this to also skip the check in case the CSIDriver is configured with attachRequired == false? (Not necessarily in this PR.)

I agree with @kon-angelo's observation that this can be handled generically. @timebertt, since you have raised this PR, we can also do this in two steps: skip NFS volumes (as you have done in this PR) and generalise it later. The generalisation would require an additional lookup of the CSIDriver object and its AttachRequired property to decide whether the wait for detachment of such a PV can be skipped.
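
For the second step, the extra CSIDriver lookup mentioned above could look roughly like this sketch using client-go (the helper name and its placement are hypothetical; attachRequired is documented to default to true when unset):

```go
package drain

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// attachRequired reports whether the CSI driver backing the given PV requires
// an attach operation, i.e. whether a VolumeAttachment is expected at all.
func attachRequired(ctx context.Context, client kubernetes.Interface, pv *corev1.PersistentVolume) (bool, error) {
	if pv.Spec.CSI == nil {
		// In-tree volume (e.g. NFS): no VolumeAttachment is ever created.
		return false, nil
	}
	driver, err := client.StorageV1().CSIDrivers().Get(ctx, pv.Spec.CSI.Driver, metav1.GetOptions{})
	if err != nil {
		if apierrors.IsNotFound(err) {
			// Without a CSIDriver object we cannot tell; fall back to tracking.
			return true, nil
		}
		return false, err
	}
	// attachRequired defaults to true when unset.
	return driver.Spec.AttachRequired == nil || *driver.Spec.AttachRequired, nil
}
```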

@kon-angelo
Contributor

kon-angelo commented Sep 3, 2024

Aside from my previous point, I kind of understand @timebertt's suggestion. The MCM code relies on VolumeAttachments, which, to my knowledge, are only created for CSI volumes. We could probably update the logic in this PR to skip anything other than CSI volumes. (My experience is just from going through the code, so this should be verified by one of the actual MCM maintainers.)

One point, though: I am not particularly fond of "skipping" the reporting of these volumes' existence rather than simply not accounting for them when draining, if that makes sense. I find this can be somewhat troublesome to debug. But the function doing the actual work, evictPodsWithPVInternal, which also uses getVolIDsFromDriver, is used in many places, so I understand this implementation.

I just feel that at some point we will go down a debugging rabbit hole, because getVolIDsFromDriver now does a little more than its name suggests, namely the skipping part 🤷

Labels
kind/bug (Bug) · needs/review (Needs review) · size/xs (Size of pull request is tiny; see gardener-robot robot/bots/size.py)
5 participants