Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wait until actually stopped to tell windows we are #2040

Closed
wants to merge 47 commits into from

Conversation

tewinget
Copy link
Collaborator

@tewinget tewinget commented Oct 27, 2022

We should send STOP_PENDING rather than STOPPED while we aren't yet actually stopped; STOPPED is already sent when we actually finish stopping.

Also fixes some silly argc/argv shenanigans, I think.

TODO:

  • delay windows service notification until after we are fully initialized so that if we fail (e.g. can't start wintun, config error, etc.) we won't have told windows that we started (and it can properly report a failure to start the service).

daemon/lokinet.cpp Outdated Show resolved Hide resolved
@majestrate majestrate force-pushed the windows-service-issues branch from c2432bd to 3dbff19 Compare October 27, 2022 13:19
We should send STOP_PENDING rather than STOPPED while we aren't yet
actually stopped; STOPPED is already sent when we actually finish
stopping.

Also fixes some silly argc/argv shenanigans, I think.
@majestrate majestrate force-pushed the windows-service-issues branch from 3dbff19 to a9a2a11 Compare October 27, 2022 13:32
majestrate and others added 12 commits October 27, 2022 10:54
report status to window service manager when we get and iterogate
message from the service manager.
update comments to reflect these changes.
add an interface type for platform dependant system layer behavior.
this gives lokinet a platform agnostic way to inform the system layer
of its current state.
- win32_platform.cpp is dead
- win32_platform.hpp is useless

Style changes from clang-tidy warnings:
- remove `virtual` from some definitions that already have `override`
- remove virtual destructor from NetworkInterface because it already has
  a virtual destructor via the base type (and clang-tiny warns about it)
Fixes windows shutdown crashes:

- windivert wasn't handling an ERROR_NO_DATA, which it gets when
  finished handling everything after a shutdown.
- wintun ReadPacket still gets invoked after end_session is called, but
  shouldn't be.  This adds an atomic<bool> to early return.
- fixes up some settings we send for windows service manager notify
When unbound is in threaded async mode on Windows it crashes in
ub_ctx_delete during shutdown for unknown reasons, but only when stopped
via windows service control.

Switch it to process async mode which doesn't crash.
Copy link
Contributor

@majestrate majestrate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it works now it looks fine

jagerman and others added 3 commits October 28, 2022 11:51
If wintun fails it seems to take about 15s, so extend the startup
timeout so that it can fail gracefully (and let us clean up before
exiting).

Also refactors the timeouts to chrono constants.
* move nop service manager to its own compilation unit and remove old
compilation units (touches cmake)
* change service manager interface to also be able to report stats
* fixup service manager interface's method names
* move sd_notify to service manager
llarp/router/router.cpp Outdated Show resolved Hide resolved
llarp/router/router.cpp Outdated Show resolved Hide resolved
* remove old compilation unit
* move systemd specific parts of status to service manager
* make router status not systemd specific
llarp/router/router.cpp Outdated Show resolved Hide resolved
majestrate and others added 2 commits October 28, 2022 12:53
use `fmt::join` for lokinet version string in status

Co-authored-by: Jason Rhinelander <[email protected]>
jagerman and others added 26 commits October 28, 2022 21:45
Get rid of the --win32-daemon hack (which was removed from the service
itself earlier in this PR, by mistake) and replace it with detection of
the error code for "not running as a service" that windows gives us back
if we try to set up service controller dispatching but aren't a service.
we were calling llarp::Context::HandleSignal from a non mainloop
thread when running as a win32 service. this caused issues with a non
clean destruction.

call our signal handler instead of llarp::Context::HandleSignal
If already below info (e.g. debug) it should stay there; we only want to
*lower* it to info if above info.
We do and should be able to call this multiple times during shutdown to
signal that we are advancing through shutdown.
Occasionally during shutdown windivert will crash because a thread tries
sending after we've called wd::shutdown, which isn't allowed.  Add an
atomic bool to prevent this.
… win32 make sure we do not query when we are shutting down
…cause the query is kept alive by it (sometimes)
Query->Cancel() will remove the Query, but that introduces a race
condition where unbound may still try to invoke the callback (with a
no-longer-valid pointer) if we do it before the ub_ctx_delete call.

Move to it afterwards so that we only cancel things that unbound didn't
We need to make a copy here because (see comment).
@tewinget
Copy link
Collaborator Author

tewinget commented Nov 1, 2022

supplanted by #2045

@tewinget tewinget closed this Nov 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants