
Known issue: connection refused due to IPVS #138

Open
alexellis opened this issue Aug 6, 2021 · 5 comments

Comments


alexellis commented Aug 6, 2021


You may run into a known issue where the client deployment for the inlets tunnel says: connection refused

Why does this happen?

This is due to the way your Kubernetes cluster or networking driver is configured to use IPVS. In IPVS mode, traffic destined for the tunnel's public IP is intercepted and redirected to the node that the pod is running on, instead of being allowed to go out to your exit-server.

Most clusters use kube-proxy in iptables mode, which does not cause this problem.

If you've installed Calico, or configured Cilium in a certain way, then your cluster may be using IPVS.
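
You can confirm whether IPVS is in play before trying the workaround; a minimal sketch using the same ipvsadm check as the reproduction steps further down:

sudo apt update && sudo apt install ipvsadm
# List the IPVS virtual servers; if the tunnel's public IP appears as a
# TCP virtual server here, IPVS is intercepting traffic to it inside the
# cluster instead of letting it reach the exit-server
sudo ipvsadm -ln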

Possible Solution

There is a workaround, which is better for production use because the token and IP of the tunnel are deterministic, and the inlets-pro helm chart can be managed through a GitOps approach using Argo or FluxCD.

  • Provision an exit-server using inletsctl, terraform, or manually
  • Then deploy the inlets-pro client using its helm chart
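
A sketch of that flow, using real inletsctl flags but an illustrative region and token file; the exact chart values come from the blog post linked below:

# Provision the exit-server; the output prints the tunnel's public IP
# and auth token
inletsctl create --provider digitalocean \
  --region lon1 \
  --access-token-file ./do.txt
# Then install the inlets-pro client helm chart, passing that IP and
# token in as values (see the linked post for the chart details)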

If anyone has suggestions on how to improve the operator so that when an external-ip is set, it can be compatible with IPVS, I'd love to hear from you here or in some other way.

Full details: https://inlets.dev/blog/2021/07/08/short-lived-clusters.html

If you want to carry on using the operator for some reason, edit the service and remove its public IP. You'll be able to see the IPs using kubectl get tunnels -A -o wide.
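
A sketch of that edit; the service name and namespace are illustrative, and where the IP appears in the service depends on how the operator configured it:

kubectl edit svc/nginx -n default
# Delete the tunnel's public IP entry from the service, then recover the
# IP from the Tunnel CRD whenever you need it:
kubectl get tunnels -A -o wide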

Steps to Reproduce (for bugs)

Optionally create a multipass VM, or cloud VM:

multipass launch --cpus 2 --mem 4G -d 30G --name k3s-server
multipass exec k3s-server /bin/bash

curl -sLS https://get.arkade.dev | sudo sh
arkade get k3sup && sudo mv .arkade/bin/k3sup /usr/local/bin/
  1. Launch a cluster in IPVS mode: k3sup install --local --k3s-extra-args="--kube-proxy-arg proxy-mode=ipvs"
    • Or install a networking driver which uses IPVS.
  2. export KUBECONFIG=$(pwd)/kubeconfig
  3. Install IPVS tools: sudo apt update && sudo apt install ipvsadm
  4. Confirm IPVS is running: sudo ipvsadm -ln
  5. Install the inlets-operator
  6. Deploy and expose nginx (a sketch of steps 5 and 6 follows the log output below)
  7. Note the logs for the client saying connection refused when trying to connect to the remote IP address on port 8123 on DigitalOcean, Equinix Metal or whatever cloud is being used.
2021/08/03 10:13:03 Starting TCP client. Version 0.8.8 - 57580545a321dc7549a26e8008999e12cb7161de
2021/08/03 10:13:03 Licensed to: Zespre Schmidt <[email protected]>, expires: 2 day(s)
2021/08/03 10:13:03 Upstream server: my-service, for ports: 80
Error: unable to download CA from remote inlets server for auto-tls: Get "https://165.22.103.96:8123/.well-known/ca.crt": dial tcp 165.22.103.96:8123: connect: connection refused

Note: port 8123 isn't part of the LoadBalancer service, which makes this behaviour even more questionable.
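
A sketch of steps 5 and 6 above; the arkade flags mirror the example later in this thread, and the nginx names are illustrative:

arkade install inlets-operator \
 --region lon1 \
 --provider digitalocean \
 --token-file ./do.txt

kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=LoadBalancer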

Context

A few people have run into this recently, but it hasn't been widely reported by users.

alexellis commented Oct 19, 2022

In addition to just using the helm chart for inlets-pro and creating the tunnel server with inletsctl (https://inlets.dev/blog/2021/07/08/short-lived-clusters.html)

...there is another workaround. If you edit the LoadBalancer service and remove the public IP address from the service using kubectl edit, IPVS won't get in the way trying to re-route the traffic to the wrong place.

You can still get the public IP with the tunnels CRD:

kubectl get tunnels -A
NAMESPACE     NAME             SERVICE   TUNNEL   HOSTSTATUS   HOSTIP           HOSTID
kube-system   traefik-tunnel   traefik            active       165.22.119.191   321884435

I still haven't found a way to prevent / override IPVS from doing the wrong thing.

The reason that port 8123 gets connection refused is that IPVS takes over: traffic destined for the public IP on port 8123 is sent to the in-cluster endpoint IPs backing the service, instead of out to the tunnel server.
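
That also explains the earlier note about port 8123 not even being part of the LoadBalancer service: in IPVS mode, kube-proxy binds service IPs to a local dummy interface, so the node answers for the public IP itself. A sketch to verify this (kube-ipvs0 is kube-proxy's default interface name):

ip addr show kube-ipvs0
# If the tunnel's public IP is bound here, connections to it never leave
# the node, and ports with no IPVS virtual server are refused locally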

@danielmichaels

Chiming in here,

I am experiencing this issue.

When I follow this tutorial (i.e. using Kind) the ingress-nginx-controller-tunnel-client-xxxx pod will successfully make a connection.

However, when I use k3s (single, or multi node) I get the same connection refused error as described above.

Output from ingress-nginx-controller-tunnel-client-xxxx when using Kind:

2022/10/22 00:28:28 Licensed to: [email protected] (Gumroad subscription)
2022/10/22 00:28:28 Upstream server: ingress-nginx-controller, for ports: 80, 443
inlets-pro TCP client. Copyright OpenFaaS Ltd 2022
time="2022/10/22 00:28:30" level=info msg="Connecting to proxy" url="wss://157.245.146.81:8123/connect"
time="2022/10/22 00:28:30" level=info msg="Connection established" client_id=fbc1733076984a88a018ec00f1e4010b

Output when using k3s:

2022/10/22 00:43:05 Licensed to: [email protected] (Gumroad subscription)
2022/10/22 00:43:05 Upstream server: ingress-nginx-controller, for ports: 80, 443
Error: unable to download CA from remote inlets server for auto-tls: Get "https://165.232.173.183:8123/.well-known/ca.crt": dial tcp 165.232.173.183:8123: connect: connection refused

I don't know if this makes a difference, but I'll add it anyway: when I inspect the services, kind shows this:

# external-ip only shows digitalocean ip
z ❯ k get svc          
NAME                                 TYPE           CLUSTER-IP     EXTERNAL-IP                     PORT(S)                      AGE
ingress-nginx-controller             LoadBalancer   10.96.26.50    157.245.146.81,157.245.146.81   80:31785/TCP,443:32284/TCP   25m
ingress-nginx-controller-admission   ClusterIP      10.96.44.137   <none>                          443/TCP                      25m
kubernetes                           ClusterIP      10.96.0.1      <none>                          443/TCP                      28m
 

and k3s shows this:

# external-ip shows local ip and digitalocean ip
z ❯ k get svc                         
NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP                     PORT(S)                      AGE
kubernetes                           ClusterIP      10.43.0.1       <none>                          443/TCP                      10m
ingress-nginx-controller-admission   ClusterIP      10.43.63.142    <none>                          443/TCP                      9m35s
ingress-nginx-controller             LoadBalancer   10.43.155.203   192.168.20.87,165.232.173.183   80:31693/TCP,443:30926/TCP   9m35s

Deployment notes:

kind - I have followed the tutorial without issue.

k3s:

  • k3sup to provision a single node
  • my own terraform/ansible to build single and multinode k3s
  • nginx-ingress (traefik disabled)
  • using both inlets-operator and manually using inletsctl (following this guide)

Both kind and k3s are on digitalocean, and I've deployed across sgp1 and lon1.

I am a pro subscriber.

alexellis commented Oct 22, 2022

Hi @danielmichaels, thanks for using inlets.

I have absolutely no issues with K3s or KinD and inlets-operator and run them on a regular basis myself.

The problem in this issue is well described and is caused by IPVS. IPVS is not the default in Kubernetes; to enable it, you must install a special network driver or change the kube-proxy flags.
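
A quick way to check which mode kube-proxy is actually running in; the /proxyMode endpoint is served on kube-proxy's metrics port (10249 by default), assuming it's reachable from the node:

curl -s http://localhost:10249/proxyMode
# prints "iptables" or "ipvs"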

If you're doing neither of these things then I'd ask you to remove your comment and raise your own issue.

inlets-operator creates VMs for tunnel servers

  1. Check that the VM was created and that the inlets-pro service is running there.
  2. If you see a public IP, try connecting to it: curl -k https://165.232.173.183:8123/.well-known/ca.crt - if that fails from your own machine, then the issue is with the VM itself.
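
For step 1, a sketch assuming SSH access to the tunnel VM; inlets-pro is the systemd unit that inletsctl-provisioned hosts typically run, but verify the name on your VM:

ssh root@165.232.173.183 "systemctl status inlets-pro"
# The unit should be active, with the server listening on port 8123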

For IPVS users use: https://inlets.dev/blog/2021/07/08/short-lived-clusters.html

Whilst I don't want to hijack this issue about IPVS, let me show you that inlets-operator is working with K3s. Please follow the steps I left for you above (1 and 2).

multipass launch --cpus 2 --mem 4G -d 30G

multipass exec light-cephalopod /bin/bash

curl -sLS https://get.arkade.dev | sudo sh
arkade get k3sup
sudo mv /home/ubuntu/.arkade/bin/k3sup /usr/local/bin/

k3sup install --local --k3s-channel latest

# create do.txt
mkdir -p .inlets

# create .inlets/LICENSE

arkade install inlets-operator \
 --region lon1 \
 --provider digitalocean \
 --token-file ./do.txt

kubectl get tunnels -A -o wide -w

NAMESPACE     NAME             SERVICE   TUNNEL   HOSTSTATUS     HOSTIP           HOSTID
kube-system   traefik-tunnel   traefik            provisioning                    322330692
kube-system   traefik-tunnel   traefik            active         206.189.117.93   322330692

curl -i http://206.189.117.93:80
# Traefik answered

curl -i -k https://206.189.117.93:443
# Traefik answered

curl -k https://206.189.117.93:8123/.well-known/ca.crt

-----BEGIN CERTIFICATE-----
MIIDvDCCAqSgAwIBAgIRAL4BGx/MwtCUdcn4r2a8vZcwDQYJKoZIhvcNAQELBQAw
aDELMAkGA1UEBhMCR0IxFTATBgNVBAcTDFBldGVyYm9yb3VnaDEZMBcGA1UECRMQ
...
-----END CERTIFICATE-----

Alex

alexellis changed the title from "Known issue: connection refused" to "Known issue: connection refused due to IPVS" on Oct 22, 2022
danielmichaels commented Oct 22, 2022

Thank you for your prompt reply.

I can confirm that when running using Multipass or Kind everything works as expected.

It still fails to connect when I provision everything inside my Proxmox cluster on my LAN. The issue must be related to how I am provisioning it, or something other than inlets.

Keep up the good work! 👍

@alexellis

Daniel, can you try using the inlets-tcp-server chart instead of the operator? Create your tunnel server using inletsctl.
