TLS passthrough Gateway and TLSRoute to SSL-enabled PostgreSQL instance not working #4594

Open
ferdinandosimonetti opened this issue Oct 31, 2024 · 7 comments

@ferdinandosimonetti

ferdinandosimonetti commented Oct 31, 2024

Description:

What issue is being seen?
I expected to be able to connect from the outside to an exposed PostgreSQL instance through my newly defined TLS-passthrough Gateway and my TLSRoute.

Instead, I received a *server closed the connection unexpectedly* error immediately after trying to connect.

root@d31b66b2a97c:/# psql -h store.fsmn.xyz -p 6432 -U myuser mydatabase
psql: error: connection to server at "store.fsmn.xyz" (10.111.9.116), port 6432 failed: server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.

I'm connected through a VPN, so I can reach even Gateways with private IPs (my Kubernetes cluster runs on Azure AKS).

I can still reach the same PostgreSQL instance through a TCP-enabled Gateway and TCPRoute (their configurations are shown below).

root@d31b66b2a97c:/# psql -h store.fsmn.xyz -p 5432 -U us_contributor m9sweeper
Password for user us_contributor:
psql (15.8 (Debian 15.8-0+deb12u1), server 15.3 (OnGres 15.3-build-6.30))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

m9sweeper=>

Repro steps:

Include sample requests, environment, etc. All data and inputs
required to reproduce the bug.

Below you can find my EnvoyProxy, GatewayClass, Gateway and TLSRoute

---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: eg
  namespace: envoy-gateway-system
spec:
  mergeGateways: true
  routingType: Service
  logging:
    level:
      default: debug
  provider:
    type: Kubernetes
    kubernetes:
      envoyDeployment:
        replicas: 1
        container:
          resources:
            requests:
              cpu: 150m
              memory: 500Mi
            limits:
              cpu: 150m
              memory: 500Mi
      # so that Azure knows whether to create a "private" LoadBalancer or one with a public IP
      envoyService:
        annotations:
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
          service.beta.kubernetes.io/azure-load-balancer-ipv4: 10.111.9.116  
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: eg
    namespace: envoy-gateway-system
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-fsmn-xyz-stackgres-tls
  namespace: default
spec:
  gatewayClassName: eg
  listeners:
    - name: stackgres
      protocol: TLS
      port: 6432
      allowedRoutes:
        kinds:
        - kind: TLSRoute
        namespaces:
          from: All
      tls:
        mode: Passthrough
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TLSRoute
metadata:
  name: tlsroute-store
  namespace: stackgres-dev
spec:
  parentRefs:
    - name: test-fsmn-xyz-stackgres-tls
      namespace: default
  hostnames:
    - "store.fsmn.xyz"
  rules:
    - backendRefs:
        - group: ""
          kind: Service
          name: store
          port: 5432
          weight: 1

Note: If there are privacy concerns, sanitize the data prior to
sharing.

Environment:

Include the environment like gateway version, envoy version and so on.

I installed Envoy Gateway with Helm; the Helm chart version is v1.1.2.
My Kubernetes cluster runs on Azure AKS.

This other combination of TCP-enabled Gateway and TCPRoute allows me to reach the PostgreSQL instance correctly.

---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: test-fsmn-xyz-stackgres
  namespace: stackgres-dev
spec:
  gatewayClassName: eg
  listeners:
    - name: stackgres
      protocol: TCP
      port: 5432
      allowedRoutes:
        kinds:
        - kind: TCPRoute
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: stackgres-dev-store
  namespace: stackgres-dev
spec:
  parentRefs:
  - name: test-fsmn-xyz-stackgres
    sectionName: stackgres
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: store
      port: 5432

**I had exactly the same result (TLSRoute unreachable, server closed the connection unexpectedly) when I tried to follow the tutorial here**

Logs:

Include the access logs and the Envoy logs.

Each time I try to connect through the TLS-passthrough Gateway and TLSRoute, I see no log entries (even with log verbosity set to debug).

Here are the kubectl describe outputs for both my TLSRoute and TLS-passthrough enabled Gateway.

k describe tlsroute -n stackgres-dev
Name:         tlsroute-store
Namespace:    stackgres-dev
Labels:       <none>
Annotations:  <none>
API Version:  gateway.networking.k8s.io/v1alpha2
Kind:         TLSRoute
Metadata:
  Creation Timestamp:  2024-10-31T10:10:10Z
  Generation:          1
  Resource Version:    295799376
  UID:                 2364d608-b703-480c-bd9e-9b861cb5d3aa
Spec:
  Hostnames:
    store.fsmn.xyz
  Parent Refs:
    Group:      gateway.networking.k8s.io
    Kind:       Gateway
    Name:       test-fsmn-xyz-stackgres-tls
    Namespace:  default
  Rules:
    Backend Refs:
      Group:
      Kind:    Service
      Name:    store
      Port:    5432
      Weight:  1
Status:
  Parents:
    Conditions:
      Last Transition Time:  2024-10-31T10:10:10Z
      Message:               Route is accepted
      Observed Generation:   1
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-10-31T10:10:10Z
      Message:               Resolved all the Object references for the Route
      Observed Generation:   1
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Controller Name:         gateway.envoyproxy.io/gatewayclass-controller
    Parent Ref:
      Group:      gateway.networking.k8s.io
      Kind:       Gateway
      Name:       test-fsmn-xyz-stackgres-tls
      Namespace:  default
Events:           <none>

k describe gateway/test-fsmn-xyz-stackgres-tls
Name:         test-fsmn-xyz-stackgres-tls
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  gateway.networking.k8s.io/v1
Kind:         Gateway
Metadata:
  Creation Timestamp:  2024-10-31T10:06:26Z
  Generation:          2
  Resource Version:    295799375
  UID:                 a2b58ba0-ef3a-46ab-ad23-223850934a63
Spec:
  Gateway Class Name:  eg
  Listeners:
    Allowed Routes:
      Kinds:
        Group:  gateway.networking.k8s.io
        Kind:   TLSRoute
      Namespaces:
        From:  All
    Name:      stackgres
    Port:      6432
    Protocol:  TLS
    Tls:
      Mode:  Passthrough
Status:
  Addresses:
    Type:   IPAddress
    Value:  10.111.9.116
  Conditions:
    Last Transition Time:  2024-10-31T10:10:10Z
    Message:               The Gateway has been scheduled by Envoy Gateway
    Observed Generation:   2
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2024-10-31T10:10:10Z
    Message:               Address assigned to the Gateway, 1/1 envoy Deployment replicas available
    Observed Generation:   2
    Reason:                Programmed
    Status:                True
    Type:                  Programmed
  Listeners:
    Attached Routes:  1
    Conditions:
      Last Transition Time:  2024-10-31T10:10:10Z
      Message:               Sending translated listener configuration to the data plane
      Observed Generation:   2
      Reason:                Programmed
      Status:                True
      Type:                  Programmed
      Last Transition Time:  2024-10-31T10:10:10Z
      Message:               Listener has been successfully translated
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
      Last Transition Time:  2024-10-31T10:10:10Z
      Message:               Listener references have been resolved
      Observed Generation:   2
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Name:                    stackgres
    Supported Kinds:
      Group:  gateway.networking.k8s.io
      Kind:   TLSRoute
Events:       <none>

@arkodg
Contributor

arkodg commented Nov 1, 2024

The only difference between TCPRoute and TLSRoute is an additional TLS Inspector filter that performs an SNI check.


cc @cpakulski
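
To illustrate the SNI check mentioned above, here is a minimal Python sketch (not Envoy's actual implementation) of how a listener might classify the first bytes of a connection. The helper names and the sample ClientHello prefix are hypothetical. It assumes the standard PostgreSQL wire protocol, in which a client requesting SSL first sends an 8-byte cleartext SSLRequest message (length 8, code 80877103) instead of a TLS ClientHello, so a check that looks for a ClientHello finds no SNI to match against the TLSRoute hostnames.

```python
import struct

# SSLRequest code from the PostgreSQL wire protocol: (1234 << 16) | 5679
SSL_REQUEST_CODE = 80877103


def build_ssl_request() -> bytes:
    """Return the 8-byte SSLRequest a standard client sends before any TLS bytes."""
    return struct.pack("!II", 8, SSL_REQUEST_CODE)


def looks_like_tls_client_hello(first_bytes: bytes) -> bool:
    """Rough check: TLS handshake record (0x16), major version 3, ClientHello (0x01)."""
    return (
        len(first_bytes) >= 6
        and first_bytes[0] == 0x16   # record type: handshake
        and first_bytes[1] == 0x03   # record version, major byte
        and first_bytes[5] == 0x01   # handshake type: ClientHello
    )


ssl_request = build_ssl_request()
# Hypothetical prefix of a TLS record carrying a ClientHello, for comparison only.
sample_hello_prefix = bytes([0x16, 0x03, 0x01, 0x02, 0x00, 0x01])

print("SSLRequest bytes:", ssl_request.hex())
print("SSLRequest looks like ClientHello:", looks_like_tls_client_hello(ssl_request))
print("Sample prefix looks like ClientHello:", looks_like_tls_client_hello(sample_hello_prefix))
```

Under these assumptions, the first packet psql sends carries no server name at all, which would be consistent with the TCPRoute (no SNI check) working while the TLSRoute does not.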

@ferdinandosimonetti
Author

ferdinandosimonetti commented Nov 5, 2024 via email

@cpakulski

@ferdinandosimonetti yes, you are correct. I did not initially understand the problem you reported, but after your investigation it is clear why it does not work.
Some time ago I investigated the possibility of delaying upstream cluster selection (routing) until Envoy completes the STARTTLS negotiation with the client, but it was not trivial and the work was put on hold.
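
The ordering problem behind the STARTTLS negotiation can be sketched as follows. This is a simplified simulation, assuming the standard PostgreSQL SSL negotiation (the client sends SSLRequest, the server answers with a single byte, `S` or `N`, and only then does the TLS handshake begin). The `fake_postgres_server` and `client_negotiate` functions are hypothetical stand-ins, and a local socket pair replaces the real network path.

```python
import socket
import struct
import threading

# 8-byte SSLRequest: length 8, request code 80877103
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def recv_exact(conn: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, or fewer if the peer closes early."""
    data = b""
    while len(data) < size:
        chunk = conn.recv(size - len(data))
        if not chunk:
            break
        data += chunk
    return data


def fake_postgres_server(conn: socket.socket, seen: list) -> None:
    """Stand-in for the backend: read the SSLRequest, answer 'S' (TLS accepted)."""
    seen.append(recv_exact(conn, 8))
    conn.sendall(b"S")
    conn.close()


def client_negotiate(conn: socket.socket) -> bytes:
    """Stand-in for the client: it must receive 'S' before sending a ClientHello."""
    conn.sendall(SSL_REQUEST)
    return recv_exact(conn, 1)


client_sock, server_sock = socket.socketpair()
seen_by_server: list = []
worker = threading.Thread(target=fake_postgres_server, args=(server_sock, seen_by_server))
worker.start()
reply = client_negotiate(client_sock)
worker.join()
client_sock.close()

print("server saw SSLRequest first:", seen_by_server[0] == SSL_REQUEST)
print("server reply before any TLS bytes:", reply)
```

In this sketch the client only proceeds to TLS after the backend has replied, so a proxy that needs the ClientHello (and its SNI) to pick a backend would have to answer or relay that first exchange itself, which appears to be the "wait" described above.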

@arkodg
Contributor

arkodg commented Nov 5, 2024

Hey @cpakulski, is there an open issue in Envoy Proxy that we can link this issue to?

@cpakulski

Not an exact match, but somewhat related: envoyproxy/envoy#32954

@ferdinandosimonetti
Author

> @ferdinandosimonetti yes, you are correct. I did not initially understand the problem you reported, but after your investigation it is clear why it does not work. Some time ago I investigated the possibility of delaying upstream cluster selection (routing) until Envoy completes the STARTTLS negotiation with the client, but it was not trivial and the work was put on hold.

If there were a way to make PostgreSQL listen in TLS mode directly, even on a different port, then I could use several TLSRoutes with a single listener to reach the different PostgreSQL environments (dev, stage, prod) while exposing a single IP.

So far, I haven't been able to figure out how to do that, or whether it is possible at all.

@cpakulski
Copy link

I could configure the downstream transport socket to be TLS (not STARTTLS), so you could select the route based on SNI. But you would need a non-standard Postgres client, as the standard one uses STARTTLS. Alternatively, you could have your client send Postgres traffic in the clear and forward that traffic to a socket that implements TLS. In that scenario you would not even need the postgres filter, and forwarding in Envoy would only need tcp_proxy.
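
As a rough illustration of why a socket that speaks TLS from the first byte would make SNI-based routing possible, the Python sketch below generates a ClientHello in memory with the standard `ssl` module and checks that the server name is visible in it. No network connection is made. The `client_hello_bytes` helper is hypothetical, and the hostname is simply the one from the TLSRoute in this issue. This only shows what the first packet looks like; it is not a working client-side TLS wrapper.

```python
import ssl


def client_hello_bytes(server_name: str) -> bytes:
    """Start a TLS handshake against in-memory buffers and return the first flight."""
    ctx = ssl.create_default_context()
    incoming = ssl.MemoryBIO()   # bytes "from the server" (left empty here)
    outgoing = ssl.MemoryBIO()   # bytes the client wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello is written, then the client waits for a reply.
        pass
    return outgoing.read()


hello = client_hello_bytes("store.fsmn.xyz")

print("first byte is a TLS handshake record:", hello[0] == 0x16)
print("server name visible in first packet:", b"store.fsmn.xyz" in hello)
```

With direct TLS the very first bytes carry the server name, so a passthrough listener could match them against TLSRoute hostnames and route dev, stage, and prod through a single IP and port.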
