
EKS Auto Mode - add access entry for auto mode node role #3241

Closed
1 task done
erezzarum opened this issue Dec 13, 2024 · 23 comments

Comments

@erezzarum

Description

  • ✋ I have searched the open/closed issues and my issue is not listed.

When using the built-in node pools, EKS will automatically create the appropriate access entry for the node role.
When not using any built-in node pools, no access entry is created and the NodeClass will fail, as the node role is not authorized to join nodes to the cluster.

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (ONLY if state is stored remotely, which hopefully is the best practice you are following!): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

  • Module version [Required]: 20.31.3

  • Terraform version: 1.7.3

  • Provider version(s): 5.81

Reproduction Code [Required]

Steps to reproduce the behavior:

Create an EKS Auto Mode cluster without using any built-in nodepools.

Expected behavior

EKS module will create the correct access entries.

Example with CLI

aws eks create-access-entry --cluster-name <CLUSTER NAME> --principal-arn <NODE ROLE ARN> --type EC2
aws eks associate-access-policy --cluster-name <CLUSTER NAME> --principal-arn <NODE ROLE ARN> --access-scope type=cluster --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSAutoNodePolicy

Actual behavior

No access entry is created for the node role; it has to be created manually (see the CLI example above).

Terminal Output Screenshot(s)

Additional context

@voidlily

Thanks for finding this; I was wondering why this wasn't being created. I had a lot of trouble figuring out why it wasn't happening unless I created a built-in NodePool, which I don't use in my cluster either.

@bryantbiggs
Member

If you are not using the built-in node pools, you will need to create an access entry of type "EC2".

We won't be able to support this in the module since it would cause conflicts. The easiest path would be to enable the built-in system node pool; you can then re-use the EKS Auto Mode node IAM role from the module with your custom node pools, as sketched below.
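A minimal sketch of that path, assuming the cluster_compute_config input shown later in this thread (attribute names should be checked against the module version in use):

module "eks" {
  source = "terraform-aws-modules/eks/aws"
  # ...

  cluster_compute_config = {
    enabled = true
    # Enabling the built-in "system" node pool makes EKS Auto Mode create and
    # associate the access entry for the module-managed node IAM role itself.
    node_pools = ["system"]
  }
}

# The module's node IAM role (module.eks.node_iam_role_name) can then be re-used
# as spec.role in custom NodeClasses.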

@voidlily

Hmm, okay. I was trying to avoid using the built-in node pools because I'm also using a custom NodeClass to add extra security groups to my instances. Here's the Terraform that got me going:

resource "aws_eks_access_entry" "auto_mode" {
  cluster_name  = module.eks_cluster.cluster_name
  principal_arn = module.eks_cluster.node_iam_role_arn
  type          = "EC2"
}

resource "aws_eks_access_policy_association" "auto_mode" {
  cluster_name  = module.eks_cluster.cluster_name
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAutoNodePolicy"
  principal_arn = module.eks_cluster.node_iam_role_arn

  access_scope {
    type = "cluster"
  }
}

@bryantbiggs
Member

bryantbiggs commented Dec 17, 2024

You only need the access entry, you do not need the policy association.

Disregard: you do need the policy association, per the docs:

If you change the Node IAM Role associated with a NodeClass, you will need to create a new Access Entry. EKS automatically creates an Access Entry for the Node IAM Role during cluster creation. The Node IAM Role requires the AmazonEKSAutoNodePolicy EKS Access Policy. For more information, see Grant IAM users access to Kubernetes with EKS access entries.

@erezo9

erezo9 commented Dec 19, 2024

@bryantbiggs @antonbabenko
So why create the IAM role for EC2 if it's not going to be used?
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/main.tf#L872

If I don't specify any node pools or a role (and I can't point to the automatically created role), you create a resource that is never used.

It doesn't make sense to create the role but not assign it; even if I create the node pools afterwards, I would expect the role to be attached to the cluster as a separate access policy resource.

I can make a PR that will either not create the IAM role when the node pool count is 0, or attach the default node IAM role that was created. My heart is set on option 2, but I can make a PR for option 1 if you want.

@erezzarum
Author

erezzarum commented Dec 19, 2024

When you use one of the built-in EKS Auto Mode node pools, either general-purpose or system, EKS Auto Mode will automatically create and associate an access entry for that node role.
If you don't want the node role at all, you can disable its creation, but if you want to use your own custom node pools, the module gives you the option to create it and use it.
The situation that causes a conflict here is that when EKS Auto Mode creates and associates the access entry, it doesn't remove it after the built-in node pools are removed.
There is no reason (or possibility) to "attach" the node role to empty node pools, since they don't exist; to use the node role in your own custom node pool, you assign it as the role in the NodeClass (the EKS Auto Mode NodeClass spec), as sketched below.

Hope this clears it up.
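For illustration, a minimal sketch of assigning the module's node role in a custom NodeClass (the kubectl_manifest resource and the trimmed-down spec are assumptions here; a fuller example appears later in this thread):

resource "kubectl_manifest" "custom_node_class" {
  yaml_body = <<-YAML
    apiVersion: eks.amazonaws.com/v1
    kind: NodeClass
    metadata:
      name: custom
    spec:
      # The node IAM role is assigned here, on the NodeClass, not on the cluster API.
      role: ${module.eks.node_iam_role_name}
  YAML
}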

@erezo9

erezo9 commented Dec 20, 2024

@erezzarum thanks for the answer.
We could still have a flag, say associate_iam_node_role, which defaults to false and which the user needs to set to true to assign it, with validation that fails the plan if it is true and the node pool count is greater than 0.

@bryantbiggs
Member

No, that will only cause conflicts, because an association will already exist in most cases.

@erezo9

erezo9 commented Dec 20, 2024

@bryantbiggs
What about a case like this?

associate_node_iam_role = var.create_node_iam_role && local.auto_mode_enabled && var.associate_node_iam_role && length(try(compute_config.value.node_pools, [])) == 0 && try(compute_config.value.node_role_arn, "") == ""

The only way it would fail is if someone has removed all node pools and set associate_node_iam_role to true.

@bryantbiggs
Member

No, you are missing the point. It's not a Terraform problem.

@erezo9

erezo9 commented Dec 20, 2024

@bryantbiggs
If I'm not mistaken, the association only happens when the compute config has a node pool and no role was given by the user; otherwise the role is created without being assigned to anything. What else am I missing?

If it's not possible, maybe this should be added to the README: if no node pools are selected, users should attach the role themselves?

@erezzarum
Author

erezzarum commented Dec 20, 2024

You can't use EKS Auto Mode with a built-in NodePool (general-purpose, system) without providing a role.
When you use one of the built-in NodePools, EKS Auto Mode automatically creates and associates that access entry.
If you don't use the built-in NodePools at all, it doesn't; and because you don't use any built-in NodePools, there is no reason to provide the node role to the EKS Auto Mode API. You use the node role when creating a custom NodePool (NodeClass).
If you already used a built-in NodePool and decide you no longer want it, EKS Auto Mode will not delete the access entry, so at some point it might create a conflict.
To keep the EKS module simple and free of conflicts, it's up to the user to create the access entry and associate it in that case.

I recommend trying EKS Auto Mode from the console to understand what I wrote here; it will make sense.

@erezo9

erezo9 commented Dec 20, 2024

@erezzarum
Yeah, I tried it, but because the role was created and not associated with the cluster it seemed odd to me.

I will update my Terraform to create another association if that's the intended approach.
I would recommend adding to the docs that the role is created but not attached to the cluster if no node pools are created.

Also, you have an awesome name 😁

@erezzarum
Author

There are two roles here: the cluster role and the node role.
It's the same concept as when you use Karpenter.

Great name indeed :)

@andromeda306

(Quoting @voidlily's comment and Terraform snippet above.)

Could you clarify how this can be implemented? I've set up a cluster using the below and hit an error about the aws_eks_access_entry of type EC2 already being taken by the default node pools. On the other hand, if I leave the node_pools list empty or omit it, the cluster ends up non-functional.

  source  = "terraform-aws-modules/eks/aws"
  version = "20.31.6"
  cluster_version = "1.31"

  ...
  cluster_compute_config = {
    enabled    = true
    node_pools = ["system", "general-purpose"]
  }

@erezo9

erezo9 commented Jan 16, 2025

@andromeda306
If you are using the default node pools, don't assign the access policy.
If you don't use the default node pools, assign the access policy.
If you created a cluster with default node pools and no custom ones, and then removed the default node pools, don't assign the access policy.
A sketch of gating this in Terraform follows below.
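A hedged sketch of the first two cases above (the var.builtin_node_pools variable is illustrative, not a module input, and this does not cover the third case where EKS already created the entry itself):

locals {
  # Only create the entry/association when no built-in node pools are enabled,
  # i.e. the case where EKS Auto Mode will not create them for you.
  manage_auto_mode_access_entry = length(var.builtin_node_pools) == 0
}

resource "aws_eks_access_entry" "auto_mode" {
  count = local.manage_auto_mode_access_entry ? 1 : 0

  cluster_name  = module.eks.cluster_name
  principal_arn = module.eks.node_iam_role_arn
  type          = "EC2"
}

resource "aws_eks_access_policy_association" "auto_mode" {
  count = local.manage_auto_mode_access_entry ? 1 : 0

  cluster_name  = module.eks.cluster_name
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAutoNodePolicy"
  principal_arn = module.eks.node_iam_role_arn

  access_scope {
    type = "cluster"
  }
}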

@andromeda306

andromeda306 commented Jan 16, 2025 via email

@erezo9

erezo9 commented Jan 16, 2025

@andromeda306
Well, the NodePool is for your node size and type, for example t2.large or t2.xlarge;
the NodeClass is what image you run or what role you want.

The best option is to create your own role outside the module itself,
and when you create the NodeClass, use the role you created.

Just set create_node_iam_role = false in the module so it won't be created (see the sketch below).
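A hedged sketch of that suggestion, assuming the create_node_iam_role input referenced in this thread (check it against your module version); the role name and trust policy are illustrative:

# Self-managed node role for custom NodeClasses (trust policy per the EKS Auto Mode node requirements).
resource "aws_iam_role" "custom_node" {
  name = "eks-auto-custom-node"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
      Action    = ["sts:AssumeRole", "sts:TagSession"]
    }]
  })
  # Attach the node policies required by EKS Auto Mode here (see the AWS docs).
}

module "eks" {
  source = "terraform-aws-modules/eks/aws"
  # ...

  # Skip the module-managed Auto Mode node role; the custom role above is
  # referenced from the NodeClass (spec.role) instead.
  create_node_iam_role = false
}

# A custom role still needs its own access entry of type EC2 plus the
# AmazonEKSAutoNodePolicy association, as shown earlier in this thread.
resource "aws_eks_access_entry" "custom_node" {
  cluster_name  = module.eks.cluster_name
  principal_arn = aws_iam_role.custom_node.arn
  type          = "EC2"
}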

@andromeda306

andromeda306 commented Jan 16, 2025 via email

@bryantbiggs
Member

bryantbiggs commented Jan 16, 2025

There isn't anything to customize within this module with respect to Auto Mode. The API gives you the ability to:

  1. Enable or disable Auto Mode (disabling after enabling is problematic due to an upstream issue that needs to be fixed: Disable EKS auto mode fails #3273)
  2. Enable the built-in node pools for system and/or general-purpose (this requires a role when you opt in, so we create a role for it, or you can provide your own)

That's it. Any customizations will come from users providing their own node classes and node pools, and with that, they will need to supply the IAM role that will be used by the nodes created.

tl;dr - there isn't anything to customize in this module for Auto Mode, it's just an opt-in or opt-out. Customizations happen through users implementing their own node pools and node classes, which are not managed by this module.

@andromeda306

I was able to sort this out in the end - thanks for the push in the right direction.
It's important to be aware that without an actual workload there aren't any visible nodes (which shows that Karpenter is running in the control plane). Also double-check that, once you've filled out the spec for the NodeClass and NodePool, they are actually created successfully on the cluster (check the respective event logs); only then will new nodes be created when a workload is scheduled. It's also worth taking a look at the default NodeClass and NodePool, as they demonstrate that some of the spec syntax differs from a typical Karpenter deployment (as has been mentioned before).

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  ...
  cluster_compute_config = {
    enabled    = true
  }
}

resource "aws_eks_access_entry" "auto_mode" {
  cluster_name  = module.eks.cluster_name
  principal_arn = module.eks.node_iam_role_arn
  type          = "EC2"
}

resource "aws_eks_access_policy_association" "auto_mode" {
  cluster_name  = module.eks.cluster_name
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSAutoNodePolicy"
  principal_arn = module.eks.node_iam_role_arn

  access_scope {
    type = "cluster"
  }
}

resource "kubectl_manifest" "karpenter_node_class" {
  yaml_body = <<-YAML
    apiVersion: eks.amazonaws.com/v1
    kind: NodeClass
    metadata:
      name: custom
      labels:
        app.kubernetes.io/managed-by: eks
    spec:
      role: ${module.eks.node_iam_role_name}
    ...
  YAML
}

resource "kubectl_manifest" "karpenter_node_pool" {
  yaml_body = <<-YAML
    apiVersion: karpenter.sh/v1
    kind: NodePool
    metadata:
      name: custom
      labels:
        app.kubernetes.io/managed-by: eks
    spec:
      template:
        spec:
          nodeClassRef:
            group: eks.amazonaws.com
            kind: NodeClass
            name: custom
          requirements:
          ...
  YAML
}

@erezo9

erezo9 commented Jan 17, 2025

@andromeda306
I agree, but that is the AWS behavior in the end, and this module just implements the IaC.
If you create a cluster today via the web console, you won't have nodes either.

@gagemillerlob

I know this issue is closed now, but it would be nice if the access entry configuration was officially documented somewhere, such as the examples directory. My org has tagging requirements on all EC2 instances, so the built-in node pool and node class can't be enabled. This configuration is very specific to EKS Auto Mode, given the EC2 type requirement and the AmazonEKSAutoNodePolicy policy.

Also, in case anyone else runs into this: do not modify the built-in NodeClass or NodePools. Your config will appear to work until it is overwritten by AWS.
