Proposal: Provide publish context during ControllerUnpublishVolume #496
The controller-unpublish call does not relay state on purpose. Consider that the CO is braindead and loses context. It should still be possible to unpublish.
I'm sympathetic to the partitioning problem, and think that the CO should do more to help plugins deal with such situations.
-1 on this proposal.
On Fri, Nov 5, 2021, 4:36 PM Sumukh Shivaprakash wrote:
Problem Overview:
ControllerPublishVolume has a contract to return FAILED_PRECONDITION with a status indicating the node ID on which the volume is currently published. This helps detect stale ControllerPublishVolume API calls from the CO to the SP controller and take appropriate action.
However, during controller unpublish, consider the following ordering of events:
1. CO issues a ControllerUnpublishVolume(node1) to the Plugin.
2. The above TCP message is delayed indefinitely over the network.
3. The node on which the CO is running experiences a network partition and is no longer part of the federated cluster. As a result, the CO is now running on a different node and decides to place the volume on node2. NOTE that the old TCP connection from the CO to the Plugin is still alive.
4. CO issues ControllerPublishVolume(node2) to the Plugin.
5. CO then decides to place the volume on node1 again, so it issues ControllerUnpublishVolume(node2) followed by ControllerPublishVolume(node1).
6. Finally, the delayed TCP message from step 2 arrives, indicating a ControllerUnpublishVolume(node1).
Expectation:
The Plugin must be able to identify the stale API call that arrives in step 6 and ignore it.
Proposal:
If the ControllerUnpublishVolume request includes the original publish context that was returned as part of the ControllerPublishVolume response, the Plugin/SP can sequence these calls and detect staleness.
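For illustration, one way an SP could make such sequencing possible is to stamp an opaque token into the publish context it returns from ControllerPublishVolume, so that a later unpublish carrying that context can be matched against the current attachment. This is a sketch only, using the Go bindings from the CSI spec repo; the token key, the in-memory bookkeeping, and the `sp` package are hypothetical and not part of the spec or this proposal.

```go
package sp

import (
	"context"
	"fmt"
	"sync"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// Hypothetical key under which the SP stamps a per-attachment token.
const publishTokenKey = "example.com/publish-token"

type controllerServer struct {
	mu     sync.Mutex
	serial uint64 // bumped whenever a volume is (re)attached somewhere new

	// volume_id -> node it is attached to, and the token issued for that attachment.
	nodes  map[string]string
	tokens map[string]string
}

func (cs *controllerServer) ControllerPublishVolume(
	ctx context.Context, req *csi.ControllerPublishVolumeRequest,
) (*csi.ControllerPublishVolumeResponse, error) {
	volID, nodeID := req.GetVolumeId(), req.GetNodeId()

	// ... attach volID to nodeID here ...

	cs.mu.Lock()
	if cs.nodes == nil {
		cs.nodes = map[string]string{}
		cs.tokens = map[string]string{}
	}
	token, ok := cs.tokens[volID]
	if !ok || cs.nodes[volID] != nodeID {
		// New attachment (or the volume moved to a different node): issue a
		// fresh token. Re-publishing to the same node stays idempotent and
		// returns the same token.
		cs.serial++
		token = fmt.Sprintf("%s@%d", nodeID, cs.serial)
		cs.nodes[volID], cs.tokens[volID] = nodeID, token
	}
	cs.mu.Unlock()

	// The token rides back to the CO in the publish context; a stale unpublish
	// replaying an old context would carry a token that no longer matches.
	return &csi.ControllerPublishVolumeResponse{
		PublishContext: map[string]string{publishTokenKey: token},
	}, nil
}
```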
Isn't the CO storing/plumbing around the volume ID and volume context already? I didn't get how that is different from the publish context. If the CO is braindead and loses all its context, how would it be able to controller-unpublish anyway?
The only required field for controller-unpublish is volume_id.
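For context, here is a minimal sketch of what a plugin's controller-unpublish handler can rely on today, assuming the Go bindings published in the CSI spec repo; the `driver` package and `controllerServer` type are illustrative stand-ins, not part of the spec.

```go
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// controllerServer is a stand-in for an SP's controller service implementation.
type controllerServer struct{}

func (cs *controllerServer) ControllerUnpublishVolume(
	ctx context.Context, req *csi.ControllerUnpublishVolumeRequest,
) (*csi.ControllerUnpublishVolumeResponse, error) {
	volumeID := req.GetVolumeId() // REQUIRED: the only field the plugin can count on
	nodeID := req.GetNodeId()     // OPTIONAL: may be empty

	// Detach volumeID (from nodeID if given, otherwise from wherever it is
	// currently attached). Nothing in the request lets the plugin tell a
	// fresh unpublish from a stale one.
	_ = volumeID
	_ = nodeID
	return &csi.ControllerUnpublishVolumeResponse{}, nil
}
```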
Isn't the assumption that the volume_id is also lost, though? The CO would still have to go through the flow of all the idempotent calls: CreateVolume (which gives back the volume_id), then ControllerPublishVolume (which, being idempotent, can give back the publish context again), and finally ControllerUnpublishVolume(publish_context)?
Unless you are referring to manual intervention by a human who is trying to patch this up and wants to unpublish the volume - I see your point then.
Correct. A human should be able to provide volume ID.
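To make the replay flow discussed above concrete, here is a rough CO-side sketch. Function names and field choices are illustrative, required fields such as volume capabilities are trimmed for brevity, and the commented-out publish_context field is the hypothetical addition this issue proposes, not part of the current spec.

```go
package co

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
)

// recoverAndUnpublish shows how a CO that lost its state could replay the
// idempotent calls: CreateVolume returns the volume_id, ControllerPublishVolume
// returns the publish context, and only then is the volume unpublished.
func recoverAndUnpublish(ctx context.Context, conn *grpc.ClientConn, volumeName, nodeID string) error {
	ctrl := csi.NewControllerClient(conn)

	// 1. Idempotent CreateVolume recovers the volume_id.
	cv, err := ctrl.CreateVolume(ctx, &csi.CreateVolumeRequest{Name: volumeName})
	if err != nil {
		return err
	}
	volumeID := cv.GetVolume().GetVolumeId()

	// 2. Idempotent ControllerPublishVolume recovers the publish context.
	pub, err := ctrl.ControllerPublishVolume(ctx, &csi.ControllerPublishVolumeRequest{
		VolumeId: volumeID,
		NodeId:   nodeID,
	})
	if err != nil {
		return err
	}
	publishContext := pub.GetPublishContext()
	_ = publishContext // recovered, but today there is nowhere to send it

	// 3. Unpublish. The current spec only carries volume_id (and optionally
	// node_id); the proposal would let the recovered context ride along.
	_, err = ctrl.ControllerUnpublishVolume(ctx, &csi.ControllerUnpublishVolumeRequest{
		VolumeId: volumeID,
		NodeId:   nodeID,
		// PublishContext: publishContext, // hypothetical field per this issue
	})
	return err
}
```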
That makes sense. Any suggestions on how this can be tackled, or pointers to other approaches, are appreciated! Going back to your original response, what did you mean by "the CO should do more"? Wouldn't a CO that does anything more stop being CSI conformant, and hence lose the flexibility of plugging in different SPs?
CO must do more: there's another CSI proposal that deals with partitioning and attempts to implement a fencing strategy within CSI - I commented there as well. The CO bears more responsibility (for complexity) in such cases so that plugin implementations (and the CSI spec) remain simpler.
It sounds like you're trying to address an important problem with a minimal spec change, and in general I'm supportive of such surgical approaches. However, in this case the proposal violates a principle that we established early on re: braindead CO and volume cleanup operations.
Yes, for the APIs in CSI that I cared about (create, controller publish/unpublish, node stage/unstage, node publish/unpublish, delete), it seems like we can deal with a network partition in all cases except controller unpublish - hence the question here. I can take a look at the other proposal to see how the CO could do more to handle this situation - can you point me to the issue number?
The way I think of a CO: no matter how complex it is and however many intermediate states it persists, a stale API call arriving at a plugin during unpublish because of a network partition cannot be distinguished, since the spec only allows for a volume ID.
I can think of passing in an optional "etag": if absent, it is treated as the safety net/human-intervention path and the unpublish is honored; if it is set to something (which will be the case in the stale API call), the call is ignored due to the etag check. It is similar to database optimistic concurrency. Would this address the drawback you pointed out in my original proposal with respect to the principles of CSI?
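One way to read the etag rule above, as a tiny sketch: the etag rides in the request as an opaque string, and the plugin keeps the etag it last handed out per volume. The package, type, and method names are made up for illustration.

```go
package etag

import "sync"

// attachments maps volume_id to the etag issued with the most recent publish.
type attachments struct {
	mu      sync.Mutex
	current map[string]string
}

// shouldHonorUnpublish implements the rule from the comment above:
//   - no etag supplied: honor the call (safety net / human intervention);
//   - etag supplied and it matches the current one: honor the call;
//   - etag supplied but it does not match: the call is stale, ignore it.
//
// If nothing is recorded for the volume, a supplied etag cannot match, so the
// (harmless, already satisfied) unpublish is ignored as well.
func (a *attachments) shouldHonorUnpublish(volumeID, etag string) bool {
	if etag == "" {
		return true
	}
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.current[volumeID] == etag
}
```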
@jdef - gentle ping!
Not sure why I'm pinged here. AFAICT this issue/ticket has been settled.
Wondering what your take on the approach above is.
I think I misread the original proposal, mistakenly confusing volume-context for publish-context. That said... "etag" sounds like additional state, and it seems like "special publish volume context". Let's avoid special-case things like this.
So let's say that publish-context was added as an always-OPTIONAL field for controller-unpublish-volume:
```proto
message ControllerUnpublishVolumeRequest {
  ...
  map<string,string> publish_context = x; // OPTIONAL, always
}
```
... and the new RPC semantics become:
- when publish-context is empty: the plugin behaves exactly as it does today;
- when publish-context is non-empty: the plugin MAY use it to detect and ignore stale requests.
At first glance, this seems to be backwards compatible and the new behavior is completely opt-in from a plugin perspective.
I'm curious - did you actually run into this situation in a live cluster, or is this proposal born of hypotheticals?
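If the field were adopted, the opt-in behavior might look roughly like this on the plugin side. This is a sketch only: how the plugin tracks the context it previously returned, and how it answers a detected stale call (here, a silent no-op), are not defined anywhere in the spec or this thread.

```go
package driver

import "maps"

// handleUnpublish sketches the proposed opt-in semantics. publishContext is the
// new, always-OPTIONAL field from the unpublish request; stored is the publish
// context the plugin returned from the most recent ControllerPublishVolume for
// this volume; detach performs the actual controller-side detach.
func handleUnpublish(publishContext, stored map[string]string, detach func() error) error {
	// Empty context: behave exactly as today and unconditionally detach.
	if len(publishContext) == 0 {
		return detach()
	}
	// Non-empty context: the plugin MAY use it to detect staleness.
	if !maps.Equal(publishContext, stored) {
		// Stale request from a superseded publish; leave the current
		// attachment alone. Returning success here is just one option.
		return nil
	}
	return detach()
}
```

A CO that never populates the field takes the first branch and gets exactly today's behavior, which is what makes the change backwards compatible.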
Yes, given its optional behavior it would be a non-breaking change for existing plugins/COs. Is the expectation that the feature proposer sends out a PR, or would one of the maintainers take care of incorporating this in the next release?
We did not hit this in a live cluster, but we have a mock CO and CSI simulator that exercised this code path and ran into a coding assert.
Kindly submit a PR. If this is your first, you probably need to sign the CLA; otherwise the PR cannot be accepted.