Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gavinbunney/kubectl provider used in vpc-cni-custom-networking is behaving inconsistently #1675

Closed
fbozic opened this issue Jul 6, 2023 · 11 comments · Fixed by #1901
Closed
Assignees
Labels
bug Something isn't working dependencies Pull requests that update a dependency file

Comments

@fbozic
Copy link

fbozic commented Jul 6, 2023

Description

There is a bug in gavinbunney/kubectl provider which is used in vpc-cni-custom-networking example. The bug seems to be happening only on K8s version 1.27. Also, it looks like that provider is not maintained anymore (example 1, example 2, example 3).

I was following vpc-cni-custom-networking example and I've made no changes to the relevant code, gavinbunney/kubectl provider and kubectl_manifest resources.

I've created a new EKS v1.27 cluster with terraform apply successfully. Then I ran kubectl get eniconfigs.crd.k8s.amazonaws.com to inspect created ENIConfigs and the manifests looked okay. Then I ran terraform plan to check if terraform state is in sync. Terraform state was not in sync, terraform wanted to create those ENIConfigs. I've attached terraform debug logs which state that ENIConfigs resources are being removed from the state because they couldn't be found. Trying to import back those resources fails. After some time terraform plan will produce no changes again (~20,30 minutes after the last interaction with eniconfigs.crd.k8s.amazonaws.com over kubectl).

I've found a similar solution which is using hashicorp/helm provider and an empty generic helm chart. I can open PR with this change for vpc-cni-custom-networking example if it makes sense to switch providers.

Old code:

resource "kubectl_manifest" "eni_config" {
  for_each = zipmap(var.azs, var.pods_subnet_ids)

  yaml_body = yamlencode({
    apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
    kind       = "ENIConfig"
    metadata = {
      name = each.key
    }
    spec = {
      securityGroups = [
        module.eks.cluster_primary_security_group_id,
        module.eks.node_security_group_id,
      ]
      subnet = each.value
    }
  })
}

New code:

resource "helm_release" "eni_configs" {
  name       = "eni-configs"
  repository = "https://dysnix.github.io/charts"
  chart      = "raw"
  version    = "v0.3.2"
  values = [
    yamlencode({
      resources = [
        for az, subnet_id in zipmap(var.azs, var.pods_subnet_ids) :
        {
          apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
          kind       = "ENIConfig"
          metadata = {
            name = az
          }
          spec = {
            securityGroups = [
              module.eks.cluster_primary_security_group_id,
              module.eks.node_security_group_id,
            ]
            subnet = subnet_id
          }
        }
      ]
    })
  ]
}
  • [x ] ✋ I have searched the open/closed issues and my issue is not listed.

Versions

Terraform v1.5.2
on darwin_arm64
  • Provider version(s):
Terraform v1.5.2
on darwin_arm64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/aws v5.3.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.3.2
+ provider registry.terraform.io/hashicorp/helm v2.10.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.21.1
+ provider registry.terraform.io/hashicorp/time v0.9.1
+ provider registry.terraform.io/hashicorp/tls v4.0.4

Reproduction Code [Required]

  • terraform apply
  • kubectl get eniconfigs.crd.k8s.amazonaws.com
  • terraform plan -> kubectl resources are dropped from the state

Expected behaviour

terraform plan is consistent every time and ENIConfigs do not disappear from the state.

Actual behaviour

terraform plan is not consistent every time and ENIConfigs disappear from the state.

Terminal Output Screenshot(s)

Terraform plan debug logs (just the interesting part):

2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1b) not found, removing from state
2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1c) not found, removing from state
2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1a) not found, removing from state
2023-07-04T18:42:27.207+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"] during refresh.
2023-07-04T18:42:27.207+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"] during refresh.
2023-07-04T18:42:27.207+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"] during refresh.
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1b Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1b] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-073fc7974ff152aa8]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1c Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1c] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0200bdef6e567c74f]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1b Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1b] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-073fc7974ff152aa8]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1c Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1c] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0200bdef6e567c74f]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1a Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1a] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0c3c1b5745dbdad8c]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1a Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1a] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0c3c1b5745dbdad8c]]
2023-07-04T18:42:27.210+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.210+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.210+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.211+0200 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/gavinbunney/kubectl/1.14.0/darwin_arm64/terraform-provider-kubectl_v1.14.0 pid=16928
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1c\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"]
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1a\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"]
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1b\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"]

Additional context

@fbozic fbozic changed the title gavinbunney/kubectl provider used in vpc-cni-custom-networking is behaving inconstantly gavinbunney/kubectl provider used in vpc-cni-custom-networking is behaving inconsistently Jul 6, 2023
@FernandoMiguel
Copy link
Contributor

I too was severely hit by this.
the recommended approach is to move away from that provider and instead use manifests.
i tried but had trouble to get ENIConfig to deploy

@askulkarni2 askulkarni2 added bug Something isn't working upstream issue labels Jul 6, 2023
@fbozic
Copy link
Author

fbozic commented Jul 10, 2023

I too was severely hit by this.
the recommended approach is to move away from that provider and instead use manifests.
i tried but had trouble to get ENIConfig to deploy

Have you tried helm_release? I've tested it on a new cluster and everything works as expected. Unlike official kubernetes, helm provider doesn't require CRDs to be available during plan time.

If this is an okay approach I would be happy to make a PR with changes.

@alekc
Copy link

alekc commented Jul 22, 2023

helm_release is somewhat limited (you either go with any-resource helm chart or you face the limitations around templating) and has some other bugs linked to it as well (sporadic random changes for example). Original linked issue has been adressed in my fork (https://github.com/alekc/terraform-provider-kubectl/releases/tag/v2.0.2), you can replace provider definition and have a go, it should be completely compatible (and as a side effect several other bugs like wait for deletion has been fixed and released as well).

@askulkarni2
Copy link
Contributor

askulkarni2 commented Oct 4, 2023

@fbozic I've been experimenting with helm_release and it has worked out quite well. There might be hidden issues I have not uncovered but so far so good. I am inclined to removing kubectl and replacing with helm_release and a local chart. Let me know if you are still open to a PR with a caveat that it would have to done across the board for all examples we have today in this repo.

@fbozic
Copy link
Author

fbozic commented Oct 5, 2023

@askulkarni2 I'll try to open a PR next week once I'm back from PTO.

@barryib
Copy link

barryib commented Oct 26, 2023

Just wondering... Anything preventing to use the kubernetes_manifest from the official kubernetes provider https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/manifest ? 🤔

@fbozic
Copy link
Author

fbozic commented Oct 26, 2023

Just wondering... Anything preventing to use the kubernetes_manifest from the official kubernetes provider https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/manifest ? 🤔

CRD has to be on a cluster during the terraform plan. That is not the case with ENIConfig during the first terraform run.

@barryib
Copy link

barryib commented Oct 26, 2023

Oh right. Thanks @fbozic.

@bryantbiggs
Copy link
Contributor

We will be working on updating the examples provided here shortly. The approach we will be using is as follows:

  • If the manifest is used simply as a demonstration, it will be converted into standard YAML file and the pattern's instruction updated to reflect a kubectl apply -f <file>.yaml
  • If the manifest is required for the pattern, it will be converted into a local Helm chart so that the helm_release resource can be utilized to provision. To be fair, this is a workaround, but it has shown to work fairly well and it fits in with the EKS Blueprints module which deploys via the helm_release resource

@bryantbiggs bryantbiggs added dependencies Pull requests that update a dependency file and removed upstream issue labels Oct 26, 2023
@fbozic
Copy link
Author

fbozic commented Oct 27, 2023

@bryantbiggs I've started working on a fix for this in the forked repo. Are you going to pick this up and should I stop working on that?

@zack-is-cool
Copy link

zack-is-cool commented Nov 17, 2023

We ended up just applying the cni configs in the inputs for the EKS addon
https://github.com/defenseunicorns/terraform-aws-eks/blob/main/main.tf#L79-L121

additional logic here to create them conditionally but the idea is you can feed them into cluster_addons.vpc_cni.configuration_values.eniconfig instead of doing it from terraform

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working dependencies Pull requests that update a dependency file
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

7 participants