Clarify guidance for pod network routing, add resources to Overview #1035

Open · wants to merge 1 commit into mainline
44 changes: 34 additions & 10 deletions latest/ug/nodes/hybrid-nodes-networking.adoc
@@ -18,26 +18,48 @@ image::images/hybrid-prereq-diagram.png[Hybrid node network connectivity.,scaled
[#hybrid-nodes-networking-on-prem]
== On-premises networking configuration

*Minimum network requirements*
[#hybrid-nodes-networking-min-reqs]
=== Minimum network requirements

For an optimal experience, {aws} recommends reliable network connectivity of at least 100 Mbps and a maximum of 200ms round trip latency for the hybrid nodes connection to the {aws} Region. The bandwidth and latency requirements can vary depending on the number of hybrid nodes and your workload characteristics such as application image size, application elasticity, monitoring and logging configurations, and application dependencies on accessing data stored in other {aws} services.
For an optimal experience, it is recommended to have reliable network connectivity of at least 100 Mbps and a maximum of 200ms round trip latency for the hybrid nodes connection to the {aws} Region. This is general guidance that accommodates most use cases but is not a strict requirement. The bandwidth and latency requirements can vary depending on the number of hybrid nodes and your workload characteristics, such as application image size, application elasticity, monitoring and logging configurations, and application dependencies on accessing data stored in other {aws} services. It is recommended to test with your own applications and environments before deploying to production to validate that your networking setup meets the requirements for your workloads.

*On-premises node and pod CIDRs*
[#hybrid-nodes-networking-on-prem-cidrs]
=== On-premises node and pod CIDRs

Identify the node and pod CIDRs you will use for your hybrid nodes and the workloads running on them. The node CIDR is allocated from your on-premises network and the pod CIDR is allocated from your Container Network Interface (CNI) if you are using an overlay network for your CNI. You pass your on-premises node CIDRs and optionally pod CIDRs as inputs when you create your EKS cluster with the `RemoteNodeNetwork` and `RemotePodNetwork` fields.
Identify the node and pod CIDRs you will use for your hybrid nodes and the workloads running on them. The node CIDR is allocated from your on-premises network and the pod CIDR is allocated from your Container Network Interface (CNI) if you are using an overlay network for your CNI. You pass your on-premises node CIDRs and pod CIDRs as inputs when you create your EKS cluster with the `RemoteNodeNetwork` and `RemotePodNetwork` fields. Your on-premises node CIDRs must be routable on your on-premises network. See the following section for information about on-premises pod CIDR routability.

The on-premises node and pod CIDR blocks must meet the following requirements:

1. Be within one of the following `IPv4` RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`.
2. Not overlap with each other, the VPC CIDR for your EKS cluster, or your Kubernetes service `IPv4` CIDR.
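
For reference, the following is a minimal sketch of creating a hybrid-enabled EKS cluster with remote node and pod networks. The cluster name, IAM role ARN, subnet IDs, and CIDR values are placeholders, and it assumes the `--remote-network-config` option accepts `remoteNodeNetworks` and `remotePodNetworks` as shown; adjust the values for your environment.

[source,bash]
----
# Placeholder names, ARNs, subnets, and CIDRs -- replace with your own values.
aws eks create-cluster \
  --name my-hybrid-cluster \
  --role-arn arn:aws:iam::111122223333:role/eks-cluster-role \
  --resources-vpc-config subnetIds=subnet-0abc1234,subnet-0def5678 \
  --access-config authenticationMode=API_AND_CONFIG_MAP \
  --remote-network-config '{
    "remoteNodeNetworks": [{"cidrs": ["10.200.0.0/16"]}],
    "remotePodNetworks": [{"cidrs": ["10.201.0.0/16"]}]
  }'
----

In this sketch, the node CIDR `10.200.0.0/16` and the pod CIDR `10.201.0.0/16` are both within the RFC-1918 ranges and do not overlap with each other.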

If your CNI performs Network Address Translation (NAT) for pod traffic as it leaves your on-premises hosts, you do not need to make your pod CIDR routable on your on-premises network or configure your EKS cluster with your _remote pod network_ for hybrid nodes to become ready to workloads. If your CNI does not use NAT for pod traffic as it leaves your on-premises hosts, your pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network for hybrid nodes to become ready to workloads.
[#hybrid-nodes-networking-on-prem-pod-routing]
=== On-premises pod network routing

There are several techniques you can use to make your pod CIDR routable on your on-premises network including Border Gateway Protocol (BGP), static routes, or other custom routing solutions. BGP is the recommended solution as it is more scalable and easier to manage than alternative solutions that require custom or manual route configuration. {aws} supports the BGP capabilities of Cilium and Calico for advertising hybrid nodes pod CIDRs, see <<hybrid-nodes-cni, Configure CNI for hybrid nodes>> for more information.
When using EKS Hybrid Nodes, it is generally recommended to make your on-premises pod CIDRs routable on your on-premises network to enable full cluster communication and functionality between cloud and on-premises environments.

If you are running webhooks on hybrid nodes, your pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network so the EKS control plane can directly communicate with the webhooks running on hybrid nodes. If you cannot make your pod CIDR routable on your on-premises network but need to run webhooks, it is recommended to run webhooks on cloud nodes in the same EKS cluster. For more information on running webhooks on cloud nodes, see <<hybrid-nodes-webhooks, Configure webhooks for hybrid nodes>>.
*Routable pod networks*

*Access required during hybrid node installation and upgrade*
If you are able to make your pod network routable on your on-premises network, follow the guidance below.

1. Configure your EKS cluster's `RemotePodNetwork` field, your VPC route tables, and your EKS cluster security group with your on-premises pod CIDR.
2. There are several techniques you can use to make your on-premises pod CIDR routable on your on-premises network, including Border Gateway Protocol (BGP), static routes, or other custom routing solutions. BGP is the recommended solution as it is more scalable and easier to manage than alternative solutions that require custom or manual route configuration. {aws} supports the BGP capabilities of Cilium and Calico for advertising pod CIDRs (see the example after this list). See <<hybrid-nodes-cni>> and <<hybrid-nodes-concepts-k8s-pod-cidrs>> for more information.
3. Webhooks can run on hybrid nodes because the EKS control plane is able to communicate with the Pod IP addresses assigned to the webhooks.
4. Workloads running on cloud nodes are able to communicate directly with workloads running on hybrid nodes in the same EKS cluster.
5. Other {aws} services, such as Application Load Balancers and Amazon Managed Service for Prometheus, are able to communicate with workloads running on hybrid nodes to balance network traffic and scrape pod metrics.
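
The following is a minimal sketch of advertising hybrid nodes pod CIDRs over BGP with Cilium's BGP control plane, assuming Cilium is installed with BGP enabled. The ASNs, the peer address, and the use of the `eks.amazonaws.com/compute-type: hybrid` node label as a selector are illustrative assumptions; Calico offers equivalent BGP resources.

[source,bash]
----
# Sketch only: assumes Cilium with the BGP control plane enabled.
# ASNs, peer address, and node label are illustrative placeholders.
kubectl apply -f - <<'EOF'
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: hybrid-nodes-bgp
spec:
  nodeSelector:
    matchLabels:
      eks.amazonaws.com/compute-type: hybrid   # select hybrid nodes only
  virtualRouters:
  - localASN: 64512
    exportPodCIDR: true                        # advertise each node's pod CIDR
    neighbors:
    - peerAddress: "10.80.0.1/32"              # on-premises router (placeholder)
      peerASN: 64513
EOF
----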

*Unroutable pod networks*

If you are _not_ able to make your pod networks routable on your on-premises network, follow the guidance below.

1. Webhooks cannot run on hybrid nodes because webhooks require connectivity from the EKS control plane to the Pod IP addresses assigned to the webhooks. In this case, it is recommended to run webhooks on cloud nodes in the same EKS cluster as your hybrid nodes. See <<hybrid-nodes-webhooks>> for more information.
2. Workloads running on cloud nodes are not able to communicate directly with workloads running on hybrid nodes when using the VPC CNI for cloud nodes and Cilium or Calico for hybrid nodes.
3. Use Service Traffic Distribution to keep traffic local to the zone it originates from. For more information on Service Traffic Distribution, see <<hybrid-nodes-service-traffic-distribution>>.
4. Configure your CNI to use egress masquerade or network address translation (NAT) for pod traffic as it leaves your on-premises hosts. This is enabled by default in Cilium. Calico requires `natOutgoing` to be set to `true` (see the example after this list).
5. Other {aws} services, such as Application Load Balancers and Amazon Managed Service for Prometheus, are not able to communicate with workloads running on hybrid nodes.
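
As one example of the NAT approach described above, the following sketch shows a Calico `IPPool` with `natOutgoing` enabled. The pool name, CIDR, and overlay mode are placeholders, and applying `projectcalico.org/v3` resources with `kubectl` assumes the Calico API server is installed; otherwise apply the resource with `calicoctl`.

[source,bash]
----
# Sketch only: pool name, CIDR, and overlay mode are placeholders.
# Applying projectcalico.org/v3 resources with kubectl requires the
# Calico API server; otherwise use calicoctl.
kubectl apply -f - <<'EOF'
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: hybrid-pod-pool
spec:
  cidr: 10.201.0.0/16   # on-premises pod CIDR (placeholder)
  natOutgoing: true     # masquerade pod traffic leaving the host
  vxlanMode: Always     # example overlay mode
EOF
----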

[#hybrid-nodes-networking-access-reqs]
=== Access required during hybrid node installation and upgrade

You must have access to the following domains during the installation process, when you install the hybrid nodes dependencies on your hosts. This can be done once when you are building your operating system images, or on each host at runtime. This applies both to the initial installation and to upgrades of the Kubernetes version of your hybrid nodes.

@@ -96,7 +118,8 @@ You must have access to the following domains during the installation process wh
^2^ Access to the {aws} IAM endpoints is only required if you are using {aws} IAM Roles Anywhere for your on-premises IAM credential provider.
====

*Access required for ongoing cluster operations*
[#hybrid-nodes-networking-access-reqs-ongoing]
=== Access required for ongoing cluster operations

The following network access for your on-premises firewall is required for ongoing cluster operations.

@@ -201,7 +224,8 @@ Depending on your choice of CNI, you need to configure additional network access
^1^ The IPs of the EKS cluster. See the following section on Amazon EKS elastic network interfaces.
====

*Amazon EKS network interfaces*
[#hybrid-nodes-networking-eks-network-interfaces]
=== Amazon EKS network interfaces

Amazon EKS attaches network interfaces to the subnets in the VPC you pass during cluster creation to enable the communication between the EKS control plane and your VPC. The network interfaces that Amazon EKS creates can be found after cluster creation in the Amazon EC2 console or with the {aws} CLI. The original network interfaces are deleted and new network interfaces are created when changes are applied on your EKS cluster, such as Kubernetes version upgrades. You can restrict the IP range for the Amazon EKS network interfaces by using constrained subnet sizes for the subnets you pass during cluster creation, which makes it easier to configure your on-premises firewall to allow inbound/outbound connectivity to this known, constrained set of IPs. To control which subnets network interfaces are created in, you can limit the number of subnets you specify when you create a cluster or you can update the subnets after creating the cluster.
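
For example, the following sketch lists the network interfaces that Amazon EKS created for a cluster with the {aws} CLI. The cluster name is a placeholder, and the filter assumes the `Amazon EKS <cluster-name>` description that EKS applies to its network interfaces.

[source,bash]
----
# Placeholder cluster name; the description filter assumes the
# "Amazon EKS <cluster-name>" convention used for EKS-managed ENIs.
aws ec2 describe-network-interfaces \
  --filters "Name=description,Values=Amazon EKS my-hybrid-cluster" \
  --query 'NetworkInterfaces[].{Id:NetworkInterfaceId,Ip:PrivateIpAddress,Subnet:SubnetId}' \
  --output table
----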

2 changes: 1 addition & 1 deletion latest/ug/nodes/hybrid-nodes-os.adoc
@@ -10,7 +10,7 @@ include::../attributes.txt[]
Prepare operating system for use with Hybrid Nodes
--

Bottlerocket, Ubuntu, Red Hat Enterprise Linux (RHEL), and Amazon Linux 2023 (AL2023) are validated on an ongoing basis for use as the node operating system for hybrid nodes. {aws} supports the hybrid nodes integration with these operating systems but, with the exception of Bottlerocket, does not provide support for the operating systems itself. AL2023 is not covered by {aws} Support Plans when run outside of Amazon EC2. AL2023 can only be used in on-premises virtualized environments, reference the link:linux/al2023/ug/outside-ec2.html[Amazon Linux 2023 User Guide,type="documentation"] for more information.
Bottlerocket, Amazon Linux 2023 (AL2023), Ubuntu, and RHEL are validated on an ongoing basis for use as the node operating system for hybrid nodes. Bottlerocket is supported by {aws} in VMware vSphere environments only. AL2023 is not covered by {aws} Support Plans when run outside of Amazon EC2. AL2023 can only be used in on-premises virtualized environments; see the link:linux/al2023/ug/outside-ec2.html[Amazon Linux 2023 User Guide,type="documentation"] for more information. {aws} supports the hybrid nodes integration with Ubuntu and RHEL operating systems but does not provide support for the operating systems themselves.

You are responsible for operating system provisioning and management. When you are testing hybrid nodes for the first time, it is easiest to run the Amazon EKS Hybrid Nodes CLI (`nodeadm`) on an already provisioned host. For production deployments, we recommend that you include `nodeadm` in your operating system images with it configured to run as a systemd service to automatically join hosts to Amazon EKS clusters at host startup. If you are using Bottlerocket as your node operating system on vSphere, you do not need to use `nodeadm` as Bottlerocket already contains the dependencies required for hybrid nodes and will automatically connect to the cluster you configure upon host startup.
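
As a rough sketch of the systemd approach, the unit below runs `nodeadm init` at host startup against a baked-in configuration file. The unit name, binary path, and configuration path are assumptions for illustration; adjust them to match where your image build places `nodeadm` and its configuration.

[source,bash]
----
# Sketch only: unit name, binary path, and config path are placeholders.
cat <<'EOF' | sudo tee /etc/systemd/system/nodeadm-init.service
[Unit]
Description=Join this host to an Amazon EKS cluster with nodeadm
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/nodeadm init -c file:///etc/nodeadm/nodeConfig.yaml
RemainAfterExit=true

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl enable nodeadm-init.service
----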

11 changes: 9 additions & 2 deletions latest/ug/nodes/hybrid-nodes-overview.adoc
@@ -12,7 +12,7 @@ Join nodes from your data centers to Amazon EKS Kubernetes clusters with Amazon

With _Amazon EKS Hybrid Nodes_, you can use your on-premises and edge infrastructure as nodes in Amazon EKS clusters. {aws} manages the {aws}-hosted Kubernetes control plane of the Amazon EKS cluster, and you manage the hybrid nodes that run in your on-premises or edge environments. This unifies Kubernetes management across your environments and offloads Kubernetes control plane management to {aws} for your on-premises and edge applications.

Amazon EKS Hybrid Nodes works with any on-premises hardware or virtual machines, bringing the efficiency, scalability, and availability of Amazon EKS to wherever your applications need to run. You can use a wide range of Amazon EKS features with Amazon EKS Hybrid Nodes including Amazon EKS add-ons, Amazon EKS Pod Identity, cluster access entries, cluster insights, and extended Kubernetes version support. Amazon EKS Hybrid Nodes natively integrates with {aws} services including {aws} Systems Manager, {aws} IAM Roles Anywhere, Amazon Managed Service for Prometheus, Amazon CloudWatch, and Amazon GuardDuty for centralized monitoring, logging, and identity management.
Amazon EKS Hybrid Nodes works with any on-premises hardware or virtual machines, bringing the efficiency, scalability, and availability of Amazon EKS to wherever your applications need to run. You can use a wide range of Amazon EKS features with Amazon EKS Hybrid Nodes including Amazon EKS add-ons, Amazon EKS Pod Identity, cluster access entries, cluster insights, and extended Kubernetes version support. Amazon EKS Hybrid Nodes natively integrates with {aws} services including {aws} Systems Manager, {aws} IAM Roles Anywhere, Amazon Managed Service for Prometheus, and Amazon CloudWatch for centralized monitoring, logging, and identity management.

With Amazon EKS Hybrid Nodes, there are no upfront commitments or minimum fees, and you are charged per hour for the vCPU resources of your hybrid nodes when they are attached to your Amazon EKS clusters. For more pricing information, see link:eks/pricing/[Amazon EKS Pricing,type="marketing"].

@@ -35,10 +35,17 @@ EKS Hybrid Nodes has the following high-level features:

* EKS Hybrid Nodes can be used with new or existing EKS clusters.
* EKS Hybrid Nodes is available in all {aws} Regions, except the {aws} GovCloud (US) Regions and the {aws} China Regions.
* EKS Hybrid Nodes must have a reliable connection between your on-premises environment and {aws}. EKS Hybrid Nodes is not a fit for disconnected, disrupted, intermittent or limited (DDIL) environments. If you are running in a DDIL environment, consider link:eks/eks-anywhere/[Amazon EKS Anywhere,type="marketing"].
* EKS Hybrid Nodes must have a reliable connection between your on-premises environment and {aws}. EKS Hybrid Nodes is not a fit for disconnected, disrupted, intermittent, or limited (DDIL) environments. If you are running in a DDIL environment, consider link:eks/eks-anywhere/[Amazon EKS Anywhere,type="marketing"]. See the link:eks/latest/best-practices/hybrid-nodes-network-disconnections.html[Best Practices for EKS Hybrid Nodes,type="documentation"] for information on how hybrid nodes behave during network disconnection scenarios.
* Running EKS Hybrid Nodes on cloud infrastructure, including {aws} Regions, {aws} Local Zones, {aws} Outposts, or in other clouds, is not supported. You will be charged the hybrid nodes fee if you run hybrid nodes on Amazon EC2 instances.
* Billing for hybrid nodes starts when the nodes join the EKS cluster and stops when the nodes are removed from the cluster. Be sure to remove your hybrid nodes from your EKS cluster if you are not using them.

[#hybrid-nodes-resources]
== Additional resources

* link:https://www.eksworkshop.com/docs/networking/eks-hybrid-nodes/[**EKS Hybrid Nodes workshop**]: Step-by-step instructions for deploying EKS Hybrid Nodes in a demo environment.
* link:https://www.youtube.com/watch?v=ZxC7SkemxvU[**AWS re:Invent: EKS Hybrid Nodes**]: AWS re:Invent session introducing the EKS Hybrid Nodes launch, including a customer walkthrough of how they use EKS Hybrid Nodes in their environment.
* link:https://repost.aws/articles/ARL44xuau6TG2t-JoJ3mJ5Mw/unpacking-the-cluster-networking-for-amazon-eks-hybrid-nodes[**AWS re:Post: Cluster networking for EKS Hybrid Nodes**]: Article explaining various methods for setting up networking for EKS Hybrid Nodes.
* link:https://aws.amazon.com/blogs/containers/run-genai-inference-across-environments-with-amazon-eks-hybrid-nodes/[**AWS blog: Run GenAI inference across environments with EKS Hybrid Nodes**]: Blog post showing how to run GenAI inference across environments with EKS Hybrid Nodes.

include::hybrid-nodes-prereqs.adoc[leveloffset=+1]
