Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comment line was breaking config comparison #208

Closed
wants to merge 3 commits into from

Conversation

davent
Copy link

@davent davent commented Aug 24, 2016

Due to the auto-generated comment line included on the first line of the HAProxy, a restart of HAProxy will occur even when the configuration itself is identical.

Here is an example diff when no configuration has actually changed:

1c1
< # auto-generated by synapse at 2016-08-24 10:20:55 +0000

---
> # auto-generated by synapse at 2016-08-24 10:22:47 +0000

As the logic to decide whether or not the config is different is just a simple string comparison (within the write_config function) it will never return true.

@jolynch
Copy link
Collaborator

jolynch commented Aug 24, 2016

@davent I don't see how this is a bug. As far as I know it is fine to write out the same HAProxy config because writing the config doesn't actually make anything restart. My understanding is that Synapse only tells HAProxy to restart when it detects (via the stats socket) that a restart is required, or when we first start Synapse.

When Synapse decided to restart HAProxy it usually logs a line saying why (a new backend or service), are there any such log lines before your suspect restarts?

@davent
Copy link
Author

davent commented Aug 24, 2016

Because write_config(new_config) always returns true then the following line:
restart if @opts['do_reloads'] && @restart_required is execute every time. Even when no actual config changes have been made.

@jolynch
Copy link
Collaborator

jolynch commented Aug 24, 2016

write_config doesn't determine if HAProxy needs to be restarted. That's @restart_required which Synapse sets whenever it detects (via the stats socket or state file) that a restart is required. Are you running into the first time Synapse starts it restart HAProxy regardless of if the config has changed?

@jolynch
Copy link
Collaborator

jolynch commented Aug 25, 2016

@davent I believe you're running into #78, but this PR isn't the right way to fix it. Now that I'm looking at the code though we really can't just rely on the config file changing to detect a reload as there is no guarantee that the HAProxy instance running is actually running with that config (imagine that synapse writes out the config and then right before it restarts HAProxy it get's restarted or killed). Our stat socket check at the beginning only guarantees that if we're missing backends we'll restart, but we can't detect unexpected backends or misconfigured backends.

Maybe we can get the pid of HAProxy out of the stat socket and then hash that and the config to determine a "HAProxy version" and if those match we can set @restart_required to false initially... What do you think?

@jolynch
Copy link
Collaborator

jolynch commented Aug 26, 2016

@davent are you interested in getting the patch in shape or should I close this and continue in a separate PR?

I think the algorithm that works is:

  1. Immediately after restarting HAProxy and first config generation, issue "show stat" to the stats socket and read out the "Pid" field.
  2. Hash this value with the contents of the HAProxy config to produce a "Running HAProxy version", store that in the state file in restart, but read it in write_config (so we'll have to either add another state file or refactor the current one, shouldn't be hard)
  3. Set @restart_required to be false in write_config iff the running haproxy version matches the last haproxy version

The state transition is (1) pending -> (2) config written -> (3) restarted -> (4) pid identified -> (5) pid + config hashed and written as a version. If we crash at any point before 5 we will identify a mismatch on Synapse startup (because we'll have the old PID) and guarantee a restart, which I believe is a safe default.

There is some subtlety in when we read the version out, when we compute it, and how we manage the state transition of @restart_required (right now it's just a single latch, always F -> T and then T -> F only on restart, and that's changing ... so ... there is some complexity there), but I think the patch is pretty straightforward.

@igor47
Copy link
Collaborator

igor47 commented Sep 1, 2016

oh using the pid is a great hack. i like the proposed approach.

@jolynch
Copy link
Collaborator

jolynch commented Oct 7, 2016

@davent I'm going to close this and continue with the work in a separate PR. If you've got the patch feel free to resubmit here or in a different PR :-)

@jolynch jolynch closed this Oct 7, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants