pods with attached security groups cannot reach Pod Identity Agent link local address #2797

Open
tmehlinger opened this issue Feb 18, 2024 · 13 comments


@tmehlinger

tmehlinger commented Feb 18, 2024

When using security groups for pods, pods with security groups attached cannot reach the Pod Identity agent on 169.254.170.23. Any pods without security groups can reach the agent without issue. The agent pods have no security groups associated and I've ensured that the security groups on failing pods permit egress traffic on TCP port 80, and my node security group permits ingress traffic on port 80 from cluster subnets. I've tried various combinations of egress/ingress from pod/node security groups, and even a blanket policy that permits traffic to/from 0/0 with no success.

I'm using the EKS Addon with the following configuration:

            {
                "enableNetworkPolicy": "true",
                "env": {
                    "ENABLE_POD_ENI": "true"
                },
                "init": {
                    "env": {
                        "DISABLE_TCP_EARLY_DEMUX": "true"
                    }
                }
            }

I've tried running the CNI with POD_SECURITY_GROUP_ENFORCING_MODE set to standard, but this causes traffic to peered VPCs to be denied in addition to Pod Identity traffic being dropped (and I want strict enforcement regardless).

Reading the documentation for standard mode behavior:

inbound/outbound traffic from another pod on the same host or another service on the same host(such as kubelet/nodeLocalDNS) won't be enforced by security group rules.

My totally wild guess about what's happening is strict mode requires enforcement of security group rules and the node security group is dropping traffic destined for a link local address as invalid.

Could someone point me the right direction? Thanks!

Environment:
Kubernetes version (use kubectl version):
Server Version: version.Info{Major:"1", Minor:"28+", GitVersion:"v1.28.5-eks-5e0fdde", GitCommit:"e78a4be9da4c375a87a109e0f4a5f4a8d2bc17c0", GitTreeState:"clean", BuildDate:"2024-01-02T20:34:46Z", GoVersion:"go1.20.12", Compiler:"gc", Platform:"linux/amd64"}

CNI Version:
v1.16.2-eksbuild.1

OS (e.g: cat /etc/os-release):
AWS EKS 1.28.5 AMI.

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"

Kernel (e.g. uname -a):
Linux ip-10-0-128-192.us-west-2.compute.internal 5.10.205-195.807.amzn2.aarch64 #1 SMP Tue Jan 16 18:29:00 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

@tmehlinger
Author

Not sure why I thought the node security group would come into play here... traffic should never leave the node!

However, doing a bit more digging, I've found that traffic isn't being routed correctly on pods with security groups attached. Failing pod:

# traceroute -n -T 169.254.170.23 -m 5
traceroute to 169.254.170.23 (169.254.170.23), 5 hops max, 60 byte packets
 1  * * *
 2  10.0.1.194  0.135 ms  0.106 ms  0.141 ms
 3  * * *
 4  * * *
 5  * * *

Compared to a pod with no security groups:

# traceroute -n -T 169.254.170.23 -m 5
traceroute to 169.254.170.23 (169.254.170.23), 5 hops max, 60 byte packets
 1  169.254.170.23  0.047 ms  0.008 ms  0.008 ms

So something about security group attachment also alters the route table for the pod.
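
For anyone reproducing this, the same reachability check can be scripted instead of run through traceroute. The following is a minimal Python sketch, assuming the agent answers on TCP port 80 at 169.254.170.23 as described in the original report. The function name `can_reach` and the 2-second timeout are illustrative choices, not part of any AWS tooling.

```python
import socket

AGENT_ADDR = "169.254.170.23"  # Pod Identity Agent link-local address from this issue
AGENT_PORT = 80                # port the report says the agent is reached on


def can_reach(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable routes.
        return False
```

Calling `can_reach(AGENT_ADDR, AGENT_PORT)` from a pod with a security group attached and from one without should mirror the two traceroute results above.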

@tmehlinger
Author

Having now re-read the documentation and a few issues that seemed related, I have a functioning setup and a better understanding of CNI behavior. I've added the following options to my CNI config:

AWS_VPC_K8S_CNI_EXTERNALSNAT: 'true'
POD_SECURITY_GROUP_ENFORCING_MODE: 'standard'
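
As a rough illustration, these two options could be merged into the add-on configuration from the original report like this. It is only a sketch: it assumes both options belong under the top-level `env` map next to `ENABLE_POD_ENI`, and the helper name `with_workaround` is hypothetical.

```python
import json

# Add-on configuration as posted in the original report.
base_config = {
    "enableNetworkPolicy": "true",
    "env": {"ENABLE_POD_ENI": "true"},
    "init": {"env": {"DISABLE_TCP_EARLY_DEMUX": "true"}},
}

# The two options named above; placing them under "env" is an assumption.
workaround_env = {
    "AWS_VPC_K8S_CNI_EXTERNALSNAT": "true",
    "POD_SECURITY_GROUP_ENFORCING_MODE": "standard",
}


def with_workaround(config: dict, extra_env: dict) -> dict:
    """Return a copy of config with extra_env merged into its "env" map."""
    merged = dict(config)
    merged["env"] = {**config.get("env", {}), **extra_env}
    return merged


merged_config = with_workaround(base_config, workaround_env)
print(json.dumps(merged_config, indent=2))
```

The printed JSON is what would be supplied as the add-on's configuration values. The original `base_config` is left untouched so the two can be compared.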

The key insight came from pulling on the thread in #1384, which seems to be more or less the same issue: pods with SGPP configured have their traffic routed over a branch ENI and thus bypass the primary interface on the node entirely. That traffic can't be routed to the link-local address used by the Pod Identity Agent without standard mode and external SNAT.

I think the documentation should be updated to make clear that external SNAT and standard mode are required when using Pod Identities with SGPP. It would also be helpful to clarify the reasons why external SNAT is necessary. I suspect inbound communication to your pods from external VPNs, direct connections, and external VPCs will become a far less common use case than simply expecting Pod Identities to work. I think this should also be explained in the Pod Identities documentation so I've left feedback there as well.

It might also help to consider a more general solution for routing/forwarding traffic to link-local addresses. However, I'm admittedly quite naive on the practical challenges and implications of such a solution.

These suggestions aside, thank you for the good documentation and engagement in issues. I was able to figure this out without much effort. Hopefully this will help someone else running into the same problem. 😄

@jdn5126
Contributor

jdn5126 commented Feb 19, 2024

@tmehlinger thank you for reporting this and for the impressive debugging! I went looking and it does not appear that the Pod Identity agent was covered with Security Groups for Pods, which is depressing to hear, so you are likely the first person to have tried this.

Your assessment is accurate: when a pod matches a Security Group Policy, it is associated with a branch ENI and all traffic from the pod is routed through the trunk ENI on the node to ensure that EC2 security group rules are enforced.

I will bring this up with our project manager to make sure we have this documented and a plan to come up with a more generalized solution.

As a side note, I see that you have enabled Network Policy support. Do you have plans to move away from Security Groups in favor of Kubernetes network policies? That may simplify this networking setup significantly.

@orsenthil
Member

I think the documentation should be updated to make clear that external SNAT and standard mode are required when using Pod Identities with SGPP.

Yes, this will be the first order that will provide clarity. Thanks for the detailed report and debugging.

@tmehlinger
Author

tmehlinger commented Feb 19, 2024

No problem, happy to help. :)

Do you have plans to move away from Security Groups in favor of Kubernetes network policies?

For intra-cluster communication, yes. However, I have pods that need to communicate with other AWS resources (RDS instances, for example) in peered/isolated VPC subnets so I'll still need SGPP.

@orsenthil orsenthil self-assigned this Feb 27, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale label Apr 28, 2024

Issue closed due to inactivity.

@github-actions github-actions bot closed this as not planned May 13, 2024
@taer

taer commented May 14, 2024

I ran into a very similar issue w/ EKS Pod Identity when enabling pod security groups, all while using Istio.

First, my app stopped spawning. The containerCredentialProvider was getting mad about not being able to access instance metadata.

The Istio proxy was complaining a lot about not getting to its certificate backend.

The solution above didn't work. I'm not sure what the SNAT parameter is supposed to do here, but all my network traffic from the pod was on the new ENI, and thus was not part of the node's security group. So it couldn't talk to

  • the node: Things like metadataservice for pod-identity breaks
  • the cluster: It couldn't talk to the other pods on the cluster

I added the node's security group to the pod's SecurityGroupPolicy, and that worked.
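
A SecurityGroupPolicy that lists both groups might look roughly like the following sketch. The policy name, namespace, labels, and both group IDs are hypothetical placeholders. The manifest is built as a plain dict and printed as JSON, which `kubectl apply -f` accepts.

```python
import json


def security_group_policy(name, namespace, match_labels, group_ids):
    """Build a SecurityGroupPolicy manifest as a plain dict."""
    return {
        "apiVersion": "vpcresources.k8s.aws/v1beta1",
        "kind": "SecurityGroupPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": match_labels},
            "securityGroups": {"groupIds": list(group_ids)},
        },
    }


# Hypothetical IDs: replace with the pod's own group and the node's group.
POD_SG = "sg-0aaaaaaaaaaaaaaaa"
NODE_SG = "sg-0bbbbbbbbbbbbbbbb"

manifest = security_group_policy(
    name="my-app-sgp",
    namespace="default",
    match_labels={"app": "my-app"},
    group_ids=[POD_SG, NODE_SG],
)
print(json.dumps(manifest, indent=2))
```

Attaching the node's group this way gives the pod's branch ENI the same group membership as the node, which is presumably why node-local and in-cluster traffic started working again.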

On this page:
https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html

it says this
Pod security group rules aren't applied to traffic between Pods or between Pods and services, such as kubelet or nodeLocalDNS, that are on the same node. Pods using different security groups on the same node can't communicate because they are configured in different subnets, and routing is disabled between these subnets.

This doesn't seem to be true. My pod was not able to hit the EKS Pod Identity service URL when the security group was applied w/out adding the node security group to the pod.

I ended up here with the aws-node init params, although I'm not sure which did anything useful

      - env:
        - name: DISABLE_TCP_EARLY_DEMUX
          value: "true"
        - name: ENABLE_IPv6
          value: "false"
        - name: ENABLE_POD_ENI
          value: "true"
        - name: POD_SECURITY_GROUP_ENFORCING_MODE
          value: standard
        - name: AWS_VPC_K8S_CNI_EXTERNALSNAT
          value: "true"

Version info:

$ kubectl describe daemonset aws-node --namespace kube-system | grep amazon-k8s-cni: | cut -d : -f 3
v1.18.1-eksbuild.1

@taer

taer commented May 14, 2024

@orsenthil could we get this re-opened? This use-case is 100% optional now, but might become more important soon for us.

@orsenthil orsenthil reopened this May 15, 2024
@github-actions github-actions bot removed the stale label Jul 26, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days

@github-actions github-actions bot added the stale label Sep 24, 2024
@bleggett

bleggett commented Oct 7, 2024

Our users have encountered this (we believe) in Istio ambient as well.

We SNAT kubelet health probes to link-local addresses, and if POD_SECURITY_GROUP_ENFORCING_MODE=strict (current default) those probes begin to fail.

Setting POD_SECURITY_GROUP_ENFORCING_MODE=standard resolves this, likely because of the following:

[in standard mode] inbound/outbound traffic from another pod on the same host or another service on the same host(such as kubelet/nodeLocalDNS) won't be enforced by security group rules.

We had a similar issue with Calico, and the resolution there was to change Calico to ignore link-local addresses.

Link-local addresses are by-RFC not routable outside the local link, so blocking them via pod-level security group enforcement seems wrong/unnecessary - the link-local CIDR should probably be completely ignored by AWS VPC CNI here.

Or at least, AWS VPC CNI should provide a flag or other override that allows pod security group enforcement to categorically ignore all link-local addresses.
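
To make the link-local point concrete, here is a small Python sketch using the standard library's `ipaddress` module. It classifies the Pod Identity Agent address from this issue next to the 10.0.1.194 hop from the earlier traceroute. The helper name `is_link_local` is illustrative only.

```python
import ipaddress

AGENT_ADDR = "169.254.170.23"  # Pod Identity Agent address from this issue


def is_link_local(addr: str) -> bool:
    """True if addr falls in a link-local range (169.254.0.0/16 or fe80::/10)."""
    return ipaddress.ip_address(addr).is_link_local


print(is_link_local(AGENT_ADDR))    # True
print(is_link_local("10.0.1.194"))  # False
```

A check of this shape is the kind of test an "ignore link-local" override would need to apply to a packet's destination.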

@orsenthil
Member

When POD_SECURITY_GROUP_ENFORCING_MODE is set to strict, we either ignore link-local addresses or provide an option to ignore link-local addresses. This can address the issue here.

@orsenthil orsenthil added this to the v1.18.8 milestone Oct 15, 2024
@bleggett

bleggett commented Oct 15, 2024

When POD_SECURITY_GROUP_ENFORCING_MODE is set to strict, we either ignore link-local addresses or provide an option to ignore link-local addresses. This can address the issue here.

Yes, I notice that the code removes the default route rule to force all packets through the trunked ENI via a lower-ordered rule. Adding back a higher-priority route rule that only handles link-local (similar to what that code already does for IPv6 gateway ICMP packets) would fix it.

Something like this: bleggett@dfbc3fb

If there's a strong case for capturing link-local traffic with SecurityGroups, a new flag is fine. I'm not sure there is a point in pushing link-local traffic through SG enforcement tho, by definition.

Given that there is already code there that excludes some kinds of traffic if POD_SECURITY_GROUP_ENFORCING_MODE=strict without having to specify an additional flag, it would be good to do the same here IMO.
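
The rule-ordering idea can be illustrated with a toy model in Python. This is not the CNI's code and is far simpler than real Linux policy routing: it matches on destination only, and the priorities and table names (`branch-eni`, `main`) are hypothetical. It only shows how a narrower rule evaluated earlier takes link-local traffic away from a catch-all rule.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int  # lower number is evaluated first, as with `ip rule`
    dest: str      # destination CIDR the rule matches
    table: str     # routing table the rule selects


def select_table(rules, dst):
    """Return the table of the first matching rule in priority order."""
    addr = ipaddress.ip_address(dst)
    for rule in sorted(rules, key=lambda r: r.priority):
        if addr in ipaddress.ip_network(rule.dest):
            return rule.table
    return None


# Hypothetical priorities and table names, chosen only for illustration.
strict_rules = [Rule(10, "0.0.0.0/0", "branch-eni")]
patched_rules = strict_rules + [Rule(9, "169.254.0.0/16", "main")]

print(select_table(strict_rules, "169.254.170.23"))   # branch-eni
print(select_table(patched_rules, "169.254.170.23"))  # main
print(select_table(patched_rules, "10.0.1.194"))      # branch-eni
```

With the extra rule in place, only the link-local destination changes tables. All other traffic still goes through the path where security groups are enforced.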
