chore: manage mutagen daemon lifecycle #98

Merged (18 commits) on Mar 12, 2025

@ethanndickson (Member) commented Mar 7, 2025

Closes coder/internal#381.

  • Moves the VPN-specific app files into a VPN folder.
  • Adds an empty Resources folder whose contents are copied into the bundle at build time.
  • Adds a MutagenDaemon abstraction for managing the mutagen daemon lifecycle. This class:
    • Starts the mutagen daemon using mutagen daemon run, with a MUTAGEN_DATA_DIRECTORY in Application Support/Coder Desktop/Mutagen, to avoid collisions with a system mutagen using ~/.mutagen.
    • Maintains a gRPC connection to the daemon socket.
    • Stops the mutagen daemon over gRPC.
    • Relays stdout & stderr from the daemon, and watches for the process exiting unexpectedly.
    • Handles replacing an orphaned mutagen daemon run process if one exists.
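
A rough sketch of the launch step described in the list above, assuming Foundation's Process API. The helper name makeDaemonProcess, its parameters, and the callback shapes are hypothetical and not taken from this PR; only the daemon run arguments and the Application Support/Coder Desktop/Mutagen data directory come from the description.

```swift
import Foundation

// Hypothetical helper, not the PR's actual code: configures (but does not
// launch) a `mutagen daemon run` child process.
func makeDaemonProcess(
    mutagenBinary: URL,
    appSupport: URL,
    onOutput: @escaping @Sendable (String) -> Void,
    onExit: @escaping @Sendable (Int32) -> Void
) -> Process {
    // Private data directory, so we don't collide with a system mutagen in ~/.mutagen
    let dataDirectory = appSupport
        .appendingPathComponent("Coder Desktop")
        .appendingPathComponent("Mutagen")

    let process = Process()
    process.executableURL = mutagenBinary
    process.arguments = ["daemon", "run"]

    // Inherit the current environment and override only the data directory
    var environment = ProcessInfo.processInfo.environment
    environment["MUTAGEN_DATA_DIRECTORY"] = dataDirectory.path
    process.environment = environment

    // Relay stdout and stderr through a single pipe
    let pipe = Pipe()
    pipe.fileHandleForReading.readabilityHandler = { handle in
        let data = handle.availableData
        guard !data.isEmpty, let text = String(data: data, encoding: .utf8) else { return }
        onOutput(text)
    }
    process.standardOutput = pipe
    process.standardError = pipe

    // Called when the child exits, expectedly or not
    process.terminationHandler = { finished in
        onExit(finished.terminationStatus)
    }
    return process
}
```

Calling try process.run() on the result would start the daemon; deciding whether an exit was expected is left to the caller in this sketch.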

This PR does not embed the mutagen binaries within the bundle; it only handles the case where they're present.

Why is the file sync code in VPNLib?

When I had the FileSync code (namely protobuf definitions) in either:

  • The app target
  • A new FSLib framework target

Either the network extension crashed (in the first case) or the app crashed (in the second case) on launch.
The crash was super obtuse:

Library not loaded: @rpath/SwiftProtobuf.framework/Versions/A/SwiftProtobuf

especially considering SwiftProtobuf doesn't have a stable ABI and shouldn't be compiled as a framework.

At least one other person has run into this issue when importing SwiftProtobuf multiple times:
apple/swift-protobuf#1506 (comment)

Curiously, this also wasn't happening on local development builds (building and running via the Xcode GUI), only when exporting via our build script.

Solution

We're just going to overload VPNLib as the source of all our SwiftProtobuf & GRPC code. Since it's pretty big and we don't want to embed it twice, we'll embed it once within the System Extension and have the app look for it in that bundle (see LD_RUNPATH_SEARCH_PATHS). It's not exactly ideal, but I don't think it's worth going to war with Xcode over.

TODO

ethanndickson (Member Author) commented Mar 7, 2025

@ethanndickson force-pushed the ethan/mutagen-lifecycle branch from 6f638fd to e9fa904 on March 7, 2025 03:46
@@ -4,6 +4,7 @@ import KeychainAccess
import NetworkExtension
import SwiftUI

@MainActor
Member Author

drive-by fix: I believe this is implicit in the ObservableObject conformance, but can't hurt to make it explicit.

}

@MainActor
class MutagenDaemon: FileSyncDaemon {
@ethanndickson (Member Author) commented Mar 7, 2025

Open to suggestions on how to test this. We can easily mock the gRPC calls, but mocking the child process is ???.

I've just done a good amount of manual testing for now.

Comment on lines 53 to 55
// Stop an orphaned daemon, if there is one
try? await connect()
try? await stop()
@ethanndickson (Member Author) commented Mar 7, 2025

For context, mutagen daemon run acquires a file lock in the MUTAGEN_DATA_DIRECTORY. A second daemon instance will fail to start while that lock is held, so we need to make sure one isn't running before we start ours. If, for whatever reason, we're unable to stop the orphaned daemon, the new one will fail immediately and change the DaemonState.
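
To illustrate why a second instance fails while the first holds the lock, here is a small sketch using flock(2). This is an assumption for illustration only: flock is a stand-in, not necessarily the locking primitive mutagen uses, and the lock file name and the acquireExclusiveLock helper are hypothetical.

```swift
import Foundation

// Illustration only: flock(2) stands in for whatever locking mutagen uses.
// Tries to take an exclusive, non-blocking lock on `path`. Returns the open
// file descriptor on success, or nil if the file can't be opened or the lock
// is already held.
func acquireExclusiveLock(at path: String) -> Int32? {
    let fd = open(path, O_CREAT | O_RDWR, 0o644)
    guard fd >= 0 else { return nil }
    if flock(fd, LOCK_EX | LOCK_NB) != 0 {
        close(fd)
        return nil
    }
    return fd
}

// Hypothetical lock file in the temporary directory
let lockPath = FileManager.default.temporaryDirectory
    .appendingPathComponent("daemon-\(ProcessInfo.processInfo.processIdentifier).lock")
    .path

let first = acquireExclusiveLock(at: lockPath)   // the "orphaned" holder
let second = acquireExclusiveLock(at: lockPath)  // fails while `first` is held
if let fd = first { close(fd) }                  // closing releases the lock
let third = acquireExclusiveLock(at: lockPath)   // succeeds once released

// Clean up
if let fd = third { close(fd) }
try? FileManager.default.removeItem(atPath: lockPath)
```

The best-effort connect() and stop() calls play the role of the close here: they release the old holder before the new daemon tries to take the lock.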

@ethanndickson self-assigned this Mar 7, 2025
@ethanndickson marked this pull request as ready for review March 7, 2025 04:06
@ThomasK33 (Member) left a comment

The PR looks good on the surface; I'll have to test it locally later and look for edge cases.

@@ -0,0 +1,11 @@
syntax = "proto3";
Member

QQ: Haven't seen a discussion before, but have we considered using the edition2023 version instead of proto3?

@ethanndickson requested review from ThomasK33 and spikecurtis and removed request for deansheather March 11, 2025 05:28
@ThomasK33 (Member) left a comment

LGTM. I think there's an issue in the Coder_DesktopApp.swift file with quitting the VPN on quit.

@ThomasK33 (Member) left a comment

LGTM

ethanndickson (Member Author) commented Mar 12, 2025

Merge activity

  • Mar 12, 1:43 AM EDT: A user started a stack merge that includes this pull request via Graphite.
  • Mar 12, 1:43 AM EDT: A user merged this pull request with Graphite.

@ethanndickson merged commit 6f6049e into main Mar 12, 2025
4 checks passed
@ethanndickson deleted the ethan/mutagen-lifecycle branch March 12, 2025 05:43
ethanndickson added a commit that referenced this pull request Mar 24, 2025
This makes a few improvements to #98:
- The mutagen path & data directory can now be configured on the MutagenDaemon, to support overriding them in tests (coming soon).
- A mutagen daemon failure now kills the process, so that it can be restarted (TBC).
- Makes start & stop transitions mutually exclusive via a semaphore, to account for actor re-entrancy.
- The start operation now waits for the daemon to respond to a version request before completing.
- The daemon is always started on launch, but then immediately stopped if it doesn't manage any file sync sessions, so as not to run in the background unnecessarily.
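
A minimal sketch of the semaphore point above, under stated assumptions: the commit message doesn't say which semaphore implementation is used, so TransitionLock is a hand-rolled, hypothetical one-permit async lock, and FakeDaemon is a stand-in, not the real MutagenDaemon. It shows why an actor alone isn't enough: a transition that awaits in the middle can be interleaved with another call unless something serializes them.

```swift
// Minimal async mutex (a semaphore with one permit). Hypothetical; not the
// implementation used in the commit.
actor TransitionLock {
    private var held = false
    private var waiters: [CheckedContinuation<Void, Never>] = []

    func acquire() async {
        if !held {
            held = true
            return
        }
        // Suspend until release() hands the lock over to us.
        await withCheckedContinuation { continuation in
            waiters.append(continuation)
        }
    }

    func release() {
        if waiters.isEmpty {
            held = false
        } else {
            // Ownership passes directly to the oldest waiter; `held` stays true.
            waiters.removeFirst().resume()
        }
    }
}

// Stand-in for the daemon: each transition suspends in the middle, which is
// where another call could otherwise interleave on the same actor.
actor FakeDaemon {
    private let lock = TransitionLock()
    private(set) var log: [String] = []

    func start() async { await transition("start") }
    func stop() async { await transition("stop") }

    private func transition(_ name: String) async {
        await lock.acquire()
        log.append("\(name) begin")
        try? await Task.sleep(nanoseconds: 10_000_000) // simulated async work
        log.append("\(name) end")
        await lock.release()
    }
}
```

With the lock in place, concurrent start() and stop() calls may run in either order, but each begin entry in the log is immediately followed by its matching end.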
Development

Successfully merging this pull request may close these issues.

manage mutagen daemon lifecycle for coder-desktop-macos