Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

meshregistrator #80

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft

meshregistrator #80

wants to merge 1 commit into from

Conversation

mks-m
Copy link
Contributor

@mks-m mks-m commented Apr 27, 2021

  • fetch pods out of kubelet
  • fetch puppet services
  • fetch AWS CloudMap registrations for current host
  • write updated AWS CloudMap registrations


type ServiceRegistrationsList struct {
sync.Mutex
IsDirty bool
Copy link
Contributor

@vkhromov vkhromov Apr 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIR bare bool isn't atomic in Golang, there is no guaranty that its changes will be visible in other execution units.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsDirty is only written wrapped in Lock/Unlock, is that still problematic? Stale value during reads shouldn't be an issue, might cause unnecessary updates to cloudmap, but w/e.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be an issue.

An execution unit doesn't know anything about other users of the variable, so it could assume that it's the only user of the variable. Assuming that, it could load the variable value into a CPU register, and then never read from or write to the variable's original memory location. As a result, execution units will use their own version of variable stored in their own copies of registers, and e.g. one of them will never see changes made by another one.
Lock/Unlock doesn't do anything with that issue, they can prevent concurrent changes of the same memory location and partial reads, but cannot prevent caching in registers and cannot enforce reading/writing from/to memory.

I'm not 100% sure that all of that is actually happen in Golang, but short googling of something like "golang atomic bool" shows both special atomic functions for bool and some discussions about that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intresting.. i tried reproducing this behavior and failed: https://play.golang.org/p/X3i4glvkpQF, it's sending an int around a channel loop between few goroutines and randomly increases the int and flips the value of bool such that bool always matches int parity. if any goroutine was caching the value of bool - we'd see it not matching the parity of an int, but so far it didn't happen. so maybe not an issue in golang?

also, i can't imagine the worst case being anything worse than missing an update or two, which would mean up to 10 seconds stale data in cloudmap - i guess we can live with that?

Copy link
Contributor

@vkhromov vkhromov Apr 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is an another one example of an issue with concurrent reading and writing of bool: https://news.ycombinator.com/item?id=17350776
Reading isn't protected by mutex in this code in the PR:

if !pods.IsDirty && !local.IsDirty && !aws.IsDirty {

Anyway, I'm really not ready to invest so much energy into this discussion. :)
If you think that this code is fine, let's move on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've replaced the whole thing with contexts and channels which is more idiomatic in go, thanks for your input!

@mks-m mks-m force-pushed the u/maksym/meshregistrator branch from 768bd26 to 2b235d9 Compare April 29, 2021 09:29
Copy link
Contributor

@Rob-Johnson Rob-Johnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have some existing functionality in paasta_tools for what you're building here (particularly the discovery for k8s, puppet, etc) - are there any other consumers of that which we need to be aware of to migrate them? are you just intending to replace that code totally? your v0 of this could just be to call those functions and push them to aws rather than implementing the fetching of backends to be registered yourself

return
case <-ticker.C:
startTime := time.Now().UnixNano()
resp, err := http.Get("http://127.0.0.1:10255/pods")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised you don't need creds to do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not for local kubelet api apparently, same code in paasta-tools

var port int32
for _, cont := range pod.Spec.Containers {
// TODO: use instance name?
if cont.Name != HacheckPodName {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the future, we'll probably also want to inspect the state of the readiness check to assert whether we should register?

}

// compare registration slices ignoring order
func registrationsEqual(x, y []ServiceRegistration) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the performance win from hashing really worthwhile (as opposed to just comparing each field in the ServiceRegistration)? I'm almost certain at somepoint we'll be debugging why something did/didn't change and we'll want to see which of the fields being compared was different and have to write some code to assist in debugging

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not actually about performance, just a set-equality algorithm, but since go doesn't have generics i need to have this =/ naive implementation is potentially quadratic

)

// check host for services configured via other means and update registration list
func fetchLocal(ctx context.Context, out chan []ServiceRegistration, servicesDir string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what other means are there for registering in a namespace other than either paasta or puppet (what's a local service)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i could have used "puppet" although services are actually defined in "nerve" folder and i wasn't sure there isn't or won't be another way to register something locally, so i went with most generic name i could think of

@mks-m
Copy link
Contributor Author

mks-m commented Apr 30, 2021

your v0 of this could just be to call those functions and push them to aws

@Rob-Johnson yeah, we can definitely save some time reusing paasta-tools code, but i wouldn't want to make this depend on nerve-tools.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants