This repository was archived by the owner on Oct 22, 2024. It is now read-only.

volume leak during deployment rollout #733

Closed
@pohly

Description

The controller is designed such that it collects information about volumes from nodes as the nodes register themselves. This implies that the controller cannot know about existing volumes for nodes that haven't registered (yet).

This leads to the following problem:

  • DeleteVolume is called for an existing volume that the controller doesn't know about at the moment.
  • The controller cannot distinguish between "volume already deleted" (idempotency!) and "need to wait for some node with that volume".
  • It assumes that the volume is gone and returns success without doing anything, after a misleading log message about "Volume pvc-bd-adc62b1395a868c243a74ee138e313a19c72211c5fbc0d5f2706e486 not created by this controller".
  • external-provisioner then removes the PV, so nothing in Kubernetes references the volume anymore.

=> volume leak
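
The sequence above can be reproduced with a small model. This is a minimal sketch, not the driver's actual code: the `controller` type, `registerNode`, `deleteVolume`, and the IDs `vol-1` and `node-1` are all hypothetical names chosen for illustration. It only assumes what the description states, namely that the controller learns about volumes when nodes register and treats an unknown volume as already deleted.

```go
package main

import (
	"fmt"
	"sync"
)

// controller is a hypothetical stand-in for the driver's controller. It only
// knows about volumes reported by nodes that have already registered.
type controller struct {
	mu      sync.Mutex
	volumes map[string]string // volume ID -> ID of the node holding it
}

func newController() *controller {
	return &controller{volumes: map[string]string{}}
}

// registerNode records the volumes that a node reports when it registers.
func (c *controller) registerNode(nodeID string, volumeIDs []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, id := range volumeIDs {
		c.volumes[id] = nodeID
	}
}

// deleteVolume mimics the behavior described in this issue: an unknown volume
// is treated as already deleted and success (nil error) is returned. The
// boolean reports whether anything was actually removed.
func (c *controller) deleteVolume(volumeID string) (bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.volumes[volumeID]; !ok {
		// "Already deleted" and "node not registered yet" look the same here.
		return false, nil
	}
	delete(c.volumes, volumeID)
	return true, nil
}

func main() {
	c := newController()

	// The driver was just restarted: node-1 still holds vol-1 but has not
	// registered yet, so the controller does not know the volume.
	deleted, err := c.deleteVolume("vol-1")
	fmt.Println("deleted:", deleted, "err:", err)

	// The node registers afterwards and still reports the volume. By then the
	// caller has already been told the deletion succeeded.
	c.registerNode("node-1", []string{"vol-1"})
	_, stillThere := c.volumes["vol-1"]
	fmt.Println("volume still exists on node:", stillThere)
}
```

In the sketch, the only signal that nothing happened is the extra boolean, which a CSI `DeleteVolume` response does not carry. The caller sees plain success in both cases.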

This problem was triggered by the new version skew tests, which restart the driver while volumes exist and then perform some operations on those volumes (including removal) right after the driver deployment comes up again.

Labels

0.9, bug