macOS: peers inaccessible after long uptime #283

Open
mcginty opened this issue Aug 24, 2023 · 4 comments
Labels
bug Something isn't working mac macOS-specific

Comments

@mcginty
Collaborator

mcginty commented Aug 24, 2023

Unfortunately I haven't found the root cause yet, but occasionally innernet on macOS gets into a state where peers are no longer accessible, and innernet up doesn't fix the problem unless you run innernet down beforehand to reset all the state/tunnels.

The test innernet subnet is fd00:1337::/48.

I just hit this condition, so dumping some debug output here for later investigation.

tl;dr: at first glance, the routes and interfaces look normal, but I can't even ping my local WireGuard IP via the loopback interface (lo0). Something is weird.

netstat -rn output:

Internet6:
Destination                             Gateway                         Flags           Netif Expire
default                                 fe80::%utun0                    UGcIg           utun0
default                                 fe80::%utun1                    UGcIg           utun1
default                                 fe80::%utun2                    UGcIg           utun2
default                                 fe80::%utun3                    UGcIg           utun3
default                                 fe80::%utun4                    UGcIg           utun4
::1                                     ::1                             UHL               lo0
fd00:1337::/48                          fe80::fa4d:89ff:fe85:2c87%utun5 Uc              utun5
fd00:1337:0:1:1::1                      link#26                         UHL               lo0

...truncated...

WireGuard information

$ ps aux | grep wireguard-go
root             25514   0.0  0.1 409219344  10144   ??  S    Tue08AM  14:41.65 wireguard-go utun
$ wireguard-go --version
wireguard-go v0.0.20230223
$ sudo wg
interface: utun5
  public key: <redacted>
  private key: (hidden)
  listening port: 60774

peer: <redacted>
  endpoint: <redacted>:51820
  allowed ips: fd00:1337::1/128
  latest handshake: 21 hours, 16 minutes, 31 seconds ago
  transfer: 16.56 GiB received, 838.20 MiB sent
  persistent keepalive: every 25 seconds
$ ifconfig utun5
utun5: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> mtu 1280
        inet6 fe80::fa4d:89ff:fe85:2c87%utun5 prefixlen 64 scopeid 0x1a
        inet6 fd00:1337:0:1:1::1 prefixlen 48
        nd6 options=201<PERFORMNUD,DAD>
$ route -n get -inet6 fd00:1337:0:1:1::1
   route to: fd00:1337:0:1:1::1
destination: fd00:1337:0:1:1::1
  interface: lo0
      flags: <UP,HOST,DONE,LLINFO,LOCAL>
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0         0         0         0         0     16384         0

$ route -n get -inet6 fd00:1337::1
   route to: fd00:1337::1
destination: fd00:1337::1
  interface: utun5
      flags: <UP,HOST,DONE,WASCLONED,IFSCOPE,IFREF>
 recvpipe  sendpipe  ssthresh  rtt,msec    rttvar  hopcount      mtu     expire
       0         0         0         0         0         0      1280         0
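
One thing that stands out in the wg output above is the handshake age: 21 hours, even though persistent keepalive is set to 25 seconds. A minimal sketch for flagging that condition from a script follows. It assumes the human-readable `latest handshake: ... ago` line format shown above, and the 180-second cutoff is an assumption (a tunnel with keepalives normally re-handshakes every couple of minutes), not something innernet itself checks. The function names are hypothetical.

```python
import re

# Seconds per unit, assuming the units printed in wg's human-readable output.
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "year": 365 * 86400,
}


def handshake_age_seconds(line):
    """Parse a 'latest handshake: ... ago' line into seconds.

    Returns None if the line is not a handshake line or has no parsable age.
    """
    if "latest handshake:" not in line:
        return None
    parts = re.findall(r"(\d+) (year|day|hour|minute|second)s?", line)
    if not parts:
        return None
    return sum(int(count) * _UNIT_SECONDS[unit] for count, unit in parts)


def is_stale(line, threshold_seconds=180):
    """True if the handshake age exceeds the (assumed) threshold."""
    age = handshake_age_seconds(line)
    return age is not None and age > threshold_seconds


if __name__ == "__main__":
    sample = "  latest handshake: 21 hours, 16 minutes, 31 seconds ago"
    print(handshake_age_seconds(sample), is_stale(sample))
```

Piping the output of sudo wg through something like this would make it easier to notice the bad state before peers are reported as unreachable.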
@mcginty mcginty added the bug Something isn't working label Aug 24, 2023
@strohel
Member

strohel commented Aug 24, 2023

I think @goodhoko might have had a similar issue. Though for him "long uptime" was just minutes, so not sure if the same cause.

@goodhoko
Member

goodhoko commented Aug 28, 2023

My problem may be just me not using innernet correctly. IDK why (I think someone told me it's fine to) but I used to ctrl+c the innernet up command before it established connection with all peers and ran to completion. This leaves the network stopped (even if it was previously running).

Adding to the confusion, the network works fine while innernet up is still trying to establish connections. I tend to jump between terminal tabs a lot, and I often started using innernet before innernet up had finished in another tab. Thinking it was all set up, I jumped back and killed innernet up, stopping the network again.

Unrelated to this issue, but maybe we could either make innernet up handle SIGINT more gracefully while it is establishing connections with peers, or just print something like "Received ctrl+c. Stopping XXX network." so that it's clear what's happening. Shall I create an issue for that?
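
To make the second option concrete, here is a rough sketch of the idea in Python. It is only an illustration, not innernet's code (innernet is written in Rust); `install_interrupt_notice` and its `out` parameter are hypothetical names. The handler announces the interrupt and sets a flag that the caller can poll between steps, instead of the process dying silently.

```python
import signal


def install_interrupt_notice(network_name, out=print):
    """Install a SIGINT handler that announces the interrupt.

    Returns a dict whose "interrupted" flag the caller can poll between
    steps to decide when to stop. Must be called from the main thread,
    because signal.signal() only works there.
    """
    state = {"interrupted": False}

    def handler(signum, frame):
        # Record the interrupt and tell the user what is about to happen.
        state["interrupted"] = True
        out(f"Received ctrl+c. Stopping {network_name} network.")

    signal.signal(signal.SIGINT, handler)
    return state


if __name__ == "__main__":
    import time

    state = install_interrupt_notice("XXX")
    # Stand-in for the connection-establishing phase: poll the flag each step.
    for _ in range(50):
        if state["interrupted"]:
            break
        time.sleep(0.1)
```

Whether the right reaction is to stop the network or to leave it running and just skip the remaining connection attempts is a separate design question; the point is only that the user sees what ctrl+c did.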

I'm on mac with wireguard-go.

@strohel
Member

strohel commented Aug 28, 2023

> My problem may be just me not using innernet correctly. IDK why (I think someone told me it's fine to) but I used to ctrl+c the innernet up command before it established connection with all peers and ran to completion. This leaves the network stopped (even if it was previously running).

It definitely behaves differently for me: I can Ctrl+C it right after

strohel@thicky ~/work/portal $ innernet up
[*] fetching state for tonari from server...
[*]   peer dev-pablo-portal (9Wj1oUXCWW...) was modified.
[*]     Endpoint: 81.34.16.254:35626 => 2.138.197.161:49285
[*]   peer jen (gUwOAMVBQW...) was modified.
[*]     Endpoint: 192.168.1.1:52495 => 37.143.115.174:53063
[*]   peer taj (5iuERx/Z7v...) was modified.
[*]     Endpoint: 111.216.113.76:58107 => 133.201.82.64:32789

[*] updated interface tonari

[*] reporting 2 interface addresses as NAT traversal candidates

and the network stays up, with most/all peers connected (it probably just misses some NAT traversal opportunities, but that's a corner case).

Linux, kernel-space wireguard here.

@mcginty
Collaborator Author

mcginty commented Aug 31, 2023

@goodhoko that happens to me too. I think, like you said, we just need to handle SIGINT more intelligently. I'll open a separate issue for that, because this is a whole different beast...

@mcginty mcginty added the mac macOS-specific label Sep 2, 2023
3 participants