Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sotw: Ordered ADS mode #544

Merged
merged 9 commits into from
Jun 5, 2023
Merged

sotw: Ordered ADS mode #544

merged 9 commits into from
Jun 5, 2023

Conversation

alecholmez
Copy link
Contributor

@alecholmez alecholmez commented Mar 7, 2022

introduce optional resource ordering for guaranteed eventual consistency in ADS mode. closes #526

@alecholmez
Copy link
Contributor Author

alecholmez commented Mar 7, 2022

This PR branches off the work found here: #461 and brings it up to speed with the current state of go-control-plane. Credit to the original algorithm author @fxposter (would be great if he could review)/ We've found quite a few resource ordering issues in upstream xDS conformance test suites and this will hopefully provide a remedy to some of those problems.

I tried to introduce the change in a non intrusive way, if no opts are provided to the server everything will execute as before. If the new opt sotw.WithOrderedADS() is passed to a server, the resources will be in guaranteed ordering as in the prior art PR. This provides guaranteed eventual consistency which we did not have in the previous implementation but gives up some of the pressure relief from immediate response sending.

@alecholmez alecholmez changed the title WIP: sotw: Ordered ADS mode sotw: Ordered ADS mode Mar 7, 2022
@@ -31,14 +31,11 @@ func (log logger) Warnf(format string, args ...interface{}) { log.t.Logf(format
func (log logger) Errorf(format string, args ...interface{}) { log.t.Logf(format, args...) }

func TestTTLResponse(t *testing.T) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran this file through a formatter hence these changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make a separate PR with this, and just land it?

@alecholmez alecholmez closed this Mar 7, 2022
@alecholmez alecholmez reopened this Mar 9, 2022
@alecholmez alecholmez changed the title sotw: Ordered ADS mode WIP: sotw: Ordered ADS mode Mar 17, 2022
@alecholmez alecholmez changed the title WIP: sotw: Ordered ADS mode sotw: Ordered ADS mode Mar 28, 2022
@alecholmez alecholmez force-pushed the ordered_ads branch 2 times, most recently from c52e874 to 1016a43 Compare March 31, 2022 19:14
@github-actions
Copy link

github-actions bot commented May 2, 2022

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@alecholmez
Copy link
Contributor Author

Haven’t had much bandwidth to work this recently will get back to it soon.

@marcosrmendezthd
Copy link

any progress on this?

valerian-roche
valerian-roche previously approved these changes May 30, 2023
pkg/server/sotw/v3/ads.go Show resolved Hide resolved
if w, ok := sw.watches.responders[typeURL]; ok {
// We've found a pre-existing watch, lets check and update if needed.
// If these requirements aren't satisfied, leave an open watch.
if w.nonce == "" || w.nonce == nonce {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get a mismatching nonce we just discard the new request. Shouldn't we close the stream then? This seems like a silent failure that could be very impactful if nodes are not aware their requests are ignored

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Um that's a good question, I left this as the previous behavior but maybe worth an issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think an issue would be good to avoid missing this

}
lastDiscoveryResponses[resp.GetRequest().TypeUrl] = lastResponse
// increment nonce and convert it to base10
out.Nonce = strconv.FormatInt(atomic.AddInt64(&s.nonce, 1), 10)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: do we need this atomic increase here? The rest of this method is not synchronized anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do need this since N number of "sends" can increment the nonce counter which is shared. I could be wrong though I ported that from the previous code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that all the "sends" run in the same goroutine on a stream, so I do not expect the need to synchronize, but might be wrong

pkg/server/sotw/v3/xds.go Show resolved Hide resolved
pkg/server/stream/v3/stream.go Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
pkg/cache/v3/order_test.go Outdated Show resolved Hide resolved
DebugFunc: log.Printf,
InfoFunc: log.Printf,
WarnFunc: log.Printf,
ErrorFunc: log.Printf,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this change? Can we make a separate PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the linter was complaining about this? Honestly can't remember haha

pkg/server/config/config.go Outdated Show resolved Hide resolved
// utilized through the functional opts pattern
type Opts struct {
// If true respond to ADS requests with a guaranteed resource ordering
Ordered bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I haven't looked at ordering for a long time. My recollection was that for ADS, the client was supposed to make requests in the order that was correct for it? If that's the case, do we really need to order anything in the server?

My other query is that, if ordering is a correctess issue, then shouldn't we be correct by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I thought, the bug this closes out shows that envoy sometimes requests things out of order even in ADS.

IMO for ADS we always want to be ordered so yes. If I recall correctly when a user configures the server to run in ADS mode it'll be ordered by default (but you can override that).

valerian-roche
valerian-roche previously approved these changes Jun 5, 2023
Copy link
Contributor

@valerian-roche valerian-roche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing blocking, just a comment for potential order issue would be nice

pkg/cache/v3/status.go Outdated Show resolved Hide resolved
pkg/server/sotw/v3/xds.go Show resolved Hide resolved
sample/bootstrap-xds.yaml Outdated Show resolved Hide resolved
Signed-off-by: Alec Holmes <[email protected]>
@alecholmez alecholmez merged commit 717ea22 into main Jun 5, 2023
@alecholmez alecholmez deleted the ordered_ads branch June 5, 2023 21:10
@jpeach jpeach mentioned this pull request Oct 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/sotw enhancement go Pull requests that update Go code no stalebot
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] send EDS before CDS, CDS will not resubscribe
9 participants