
fix CNI plugin configuration issue #3087

Merged
merged 1 commit into tigera:master on Jan 11, 2024

Conversation

@Tamas-Biro1 (Contributor) commented on Jan 5, 2024

Description

This PR fixes the configuration issue introduced with #2708.
The problem: if a pod starts while sysctl tuning is configured, pod sandbox creation fails with:

 Warning  FailedCreatePodSandBox  9s                      kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "88e1a46f3075be984d7761de03df288697e9236858b961fa5844076e1b66cc43": plugin type="tuning" failed (add): failed to load netconf: json: cannot unmarshal array into Go struct field TuningConf.sysctl of type map[string]string

Reason: the tuning plugin expects the sysctl field to be a JSON object, i.e. map[string]string:

  "sysctl": {
          "net.core.somaxconn": "500",
          "net.ipv4.conf.IFNAME.arp_filter": "1"
  }

as described here: https://www.cni.dev/plugins/current/meta/tuning/#system-controls-operation
I will attach the proof of integration testing (on a real cluster) of this PR soon.

For PR author

  • Tests for change.
  • If changing pkg/apis/, run make gen-files
  • If changing versions, run make gen-versions

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • kind/bug if this is a bugfix.
    • kind/enhancement if this is a new feature.
    • enterprise if this PR applies to Calico Enterprise only.

@tmjd (Member) commented on Jan 8, 2024

This looks pretty good. Are you working on testing this out in a real cluster? Hopefully you can build the image with make image, then tag and push it to Docker, and use that operator image on a cluster to test it out.

@Tamas-Biro1 (Contributor, Author) commented on Jan 8, 2024

> This looks pretty good. Are you working on testing this out in a real cluster? Hopefully you can build the image with make image, then tag and push it to Docker, and use that operator image on a cluster to test it out.

Yes, I tested it out on a real cluster in our cloud environment, and the image was built by make image as you suggested.

Note: as v1.33.0 (which will contain the preceding PR #2708) has not been released yet, I tested these changes with v1.30.9. But I expect the same test results on the other branches as well.

Here are the results:

the cni-config configmap has the proper CNI configuration data:

kubectl get cm cni-config -ncalico-system -oyaml
apiVersion: v1
data:
  config: "{\n\t\t\t  \"name\": \"k8s-pod-network\",\n\t\t\t  \"cniVersion\": \"0.3.1\",\n\t\t\t
    \ \"plugins\": [{\"container_settings\":{\"allow_ip_forwarding\":true},\"datastore_type\":\"kubernetes\",\"ipam\":{\"assign_ipv4\":\"true\",\"assign_ipv6\":\"false\",\"type\":\"calico-ipam\"},\"kubernetes\":{\"k8s_api_root\":\"https://172.21.0.1:443\",\"kubeconfig\":\"__KUBECONFIG_FILEPATH__\"},\"log_file_max_age\":30,\"log_file_max_count\":10,\"log_file_max_size\":100,\"log_file_path\":\"/var/log/calico/cni/cni.log\",\"log_level\":\"Info\",\"mtu\":1480,\"nodename_file_optional\":false,\"policy\":{\"type\":\"k8s\"},\"type\":\"calico\"},{\"capabilities\":{\"bandwidth\":true},\"type\":\"bandwidth\"},{\"capabilities\":{\"portMappings\":true},\"snat\":true,\"type\":\"portmap\"},{\"sysctl\":{\"net.ipv4.tcp_keepalive_intvl\":\"15\",\"net.ipv4.tcp_keepalive_probes\":\"6\",\"net.ipv4.tcp_keepalive_time\":\"40\"},\"type\":\"tuning\"}]
    \n\t\t\t}"
kind: ConfigMap
metadata:
  creationTimestamp: "2024-01-08T18:51:27Z"
  name: cni-config
  namespace: calico-system
  ownerReferences:
  - apiVersion: operator.tigera.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: Installation
    name: default
    uid: 2b40b4ed-aa58-4dd5-904e-b78a15f6908c
  resourceVersion: "3715"
  uid: 9600b762-e549-4ac3-b38a-54c5aa7cbba3
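
One way to sanity-check the rendered config without eyeballing the escaped string (a sketch, assuming python3 on the PATH and the configmap name/namespace shown above) is to parse the config key and confirm the tuning plugin's sysctl value is a JSON object rather than an array:

```shell
# Extract the rendered CNI config and print the JSON type of the tuning
# plugin's sysctl field; "dict" means a JSON object, as the plugin expects.
kubectl get cm cni-config -n calico-system -o jsonpath='{.data.config}' \
  | python3 -c '
import json, sys
cfg = json.load(sys.stdin)
tuning = [p for p in cfg["plugins"] if p["type"] == "tuning"][0]
print(type(tuning["sysctl"]).__name__)  # "dict" after this fix, "list" before
'
```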

all Calico pods are running,

NAMESPACE         NAME                                                   READY   STATUS      RESTARTS   AGE
calico-system     calico-kube-controllers-f76bd7684-pst82                1/1     Running     0          25m
calico-system     calico-node-m79ls                                      1/1     Running     0          22m
calico-system     calico-node-p4rng                                      1/1     Running     0          21m
calico-system     calico-node-rhwsk                                      1/1     Running     0          22m
calico-system     calico-typha-764c7f559-6bhjf                           1/1     Running     0          25m
calico-system     calico-typha-764c7f559-cf4m9                           1/1     Running     0          22m
default           alpine                                                 1/1     Running     0          5s
...
tigera-operator   tigera-operator-79cbc78c96-kzrvh                       1/1     Running     0          25m

the desired sysctl parameters are set in the pod,

k exec -it alpine -- sh
/ # sysctl -a  | grep keepalive
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 6
net.ipv4.tcp_keepalive_time = 40

@Tamas-Biro1 Tamas-Biro1 marked this pull request as ready for review January 8, 2024 19:35
@Tamas-Biro1 Tamas-Biro1 requested a review from a team as a code owner January 8, 2024 19:35
@tmjd (Member) commented on Jan 8, 2024

/sem-approve

@tmjd (Member) left a review comment:

LGTM

@tmjd (Member) commented on Jan 9, 2024

There was an unrelated error in the CI; I've gotten the needed update into the master branch. Could you rebase your branch to pick up that change? That should allow CI to run successfully.

@Tamas-Biro1 Tamas-Biro1 force-pushed the fix-tuning-plugin-configuration branch from 0f8419b to 1d839d4 Compare January 9, 2024 17:02
@Tamas-Biro1 (Contributor, Author) commented

> There was an unrelated error in the CI; I've gotten the needed update into the master branch. Could you rebase your branch to pick up that change? That should allow CI to run successfully.

thanks, rebased

@tmjd (Member) commented on Jan 11, 2024

/sem-approve

@tmjd tmjd merged commit 41502d7 into tigera:master Jan 11, 2024
5 checks passed
@danudey danudey modified the milestones: v1.33.0, v1.34.0 Jan 31, 2024