
Add a shorter timeout instead of default 30 seconds for kindnetd listing nodes to speed up control-plane ready #2036

Closed
llhuii opened this issue Jan 26, 2021 · 16 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

@llhuii

llhuii commented Jan 26, 2021

What would you like to be added:
Add a timeout for

nodes, err = clientset.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})

Otherwise kindnetd's first request times out after 30 seconds, making the control-plane node roughly 30 seconds slower to become ready.
Why is this needed:
I have a CI pipeline that uses kind to deploy a Kubernetes environment, and I want the control-plane node to become ready as quickly as possible.

Environment:

  • kind version: (use kind version): kind v0.10.0-alpha go1.15.6 linux/amd64
@llhuii llhuii added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 26, 2021
@llhuii
Author

llhuii commented Jan 26, 2021

Hi, @BenTheElder, if this is OK with you, I would be happy to submit a pull request for this.

@aojea
Contributor

aojea commented Jan 26, 2021

@llhuii what do you mean exactly? That if the control-plane takes more than 30 seconds to start, this will add an additional 30 seconds?

@llhuii
Author

llhuii commented Jan 27, 2021

@llhuii what do you mean exactly? That if the control-plane takes more than 30 seconds to start, this will add an additional 30 seconds?

Hi, @aojea, because the default i/o timeout is 30 seconds, the idea here is to make it shorter (e.g. 1 second); ListOptions has a TimeoutSeconds option.
kindnet logs:
I0127 02:08:57.810964 1 main.go:111] Failed to get nodes, retrying after error: Get https://10.96.0.1:443/api/v1/nodes: dial tcp 10.96.0.1:443: i/o timeout
This timeout occurs very frequently in my environment.

@llhuii
Author

llhuii commented Jan 27, 2021

  1. kindnetd and kube-proxy are created simultaneously by kubelet.
  2. kindnetd tries to connect to the API server before the iptables rules are set by kube-proxy, so it gets stuck until the default socket I/O timeout expires (30 seconds; PS: I'm not sure whether this 30s timeout is set by client-go or by Go's net package).

--update--
This 30s timeout is the default set by Go's net/http package, code: https://github.com/golang/go/blob/c20bcb64882d1134770683d663ee9f82fea715e6/src/net/http/transport.go#L45

var DefaultTransport RoundTripper = &Transport{
	Proxy: ProxyFromEnvironment,
	DialContext: (&net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
	}).DialContext,
	ForceAttemptHTTP2:     true,
	MaxIdleConns:          100,
	IdleConnTimeout:       90 * time.Second,
	TLSHandshakeTimeout:   10 * time.Second,
	ExpectContinueTimeout: 1 * time.Second,
}

@llhuii llhuii changed the title Add a timeout in kindnet listing nodes firstly to speed up control-plane be ready Add a shorter timeout instead of default 30 seconds for kindnetd listing nodes to speed up control-plane ready Jan 27, 2021
@llhuii
Author

llhuii commented Jan 27, 2021

Another workaround I found is to add an nc initContainer to default-cni.yaml that probes the Kubernetes API:

cat > config.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  # the default CNI will not be installed
  #disableDefaultCNI: true
nodes:
- role: control-plane
  # add a mount from /path/to/my/files on the host to /files on the node
  extraMounts:
  - hostPath: /tmp/default-cni.yaml
    containerPath: /kind/manifests/default-cni.yaml
EOF

cat > /tmp/default-cni.yaml <<EOF
# kindnetd networking manifest
# would you kindly template this file
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kindnet
rules:
  - apiGroups:
    - policy
    resources:
    - podsecuritypolicies
    verbs:
    - use
    resourceNames:
    - kindnet
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - list
      - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kindnet
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kindnet
subjects:
- kind: ServiceAccount
  name: kindnet
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kindnet
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kindnet
  namespace: kube-system
  labels:
    tier: node
    app: kindnet
    k8s-app: kindnet
spec:
  selector:
    matchLabels:
      app: kindnet
  template:
    metadata:
      labels:
        tier: node
        app: kindnet
        k8s-app: kindnet
    spec:
      hostNetwork: true
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: kindnet
      initContainers:
      - name: check-kube-proxy
        image: busybox
        command: ["sh"]
        args:
        - "-c"
        - "echo before $(date); while ! timeout 1 nc -z $KUBERNETES_SERVICE_HOST $KUBERNETES_SERVICE_PORT_HTTPS; do date; done; echo done $(date)"
      containers:
      - name: kindnet-cni
        image: kindest/kindnetd:v20210119-d5ef916d
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: POD_SUBNET
          value: {{ .PodSubnet }}
        volumeMounts:
        - name: cni-cfg
          mountPath: /etc/cni/net.d
        - name: xtables-lock
          mountPath: /run/xtables.lock
          readOnly: false
        - name: lib-modules
          mountPath: /lib/modules
          readOnly: true
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_RAW", "NET_ADMIN"]
      volumes:
      - name: cni-cfg
        hostPath:
          path: /etc/cni/net.d
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
      - name: lib-modules
        hostPath:
          path: /lib/modules
---

EOF

kind create cluster --config=config.yaml

@llhuii
Author

llhuii commented Jan 27, 2021

Yet another way to cheat and speed up node readiness:

kind create cluster --name test
echo '
{
        "cniVersion": "0.3.1",
        "name": "kindnet",
        "plugins": [ 
            { "type": "portmap" }
        ]
}
' | docker exec -i test-control-plane  tee  /etc/cni/net.d/10-kindnet.conflist

@aojea
Contributor

aojea commented Jan 27, 2021

I see, thanks. So there are 3 points of improvement:

  1. kube-proxy dependency: kindnet depends on the apiserver virtual IP, which has to be configured by kube-proxy. If kindnet is deployed before kube-proxy, it has to time out and restart before it can use the apiserver virtual IP.
    I think we should implement the same logic as kube-proxy: allow configuring the apiserver host so we can use the apiserver-advertise-address, and fall back to in-cluster config if it is not present.
    https://github.com/kubernetes/kubernetes/blob/5310e4f30e212a3d58b37fd07633c3b249627b53/cmd/kube-proxy/app/server.go#L542-L557
// createClients creates a kube client and an event client from the given config and masterOverride.
// TODO remove masterOverride when CLI flags are removed.
func createClients(config componentbaseconfig.ClientConnectionConfiguration, masterOverride string) (clientset.Interface, v1core.EventsGetter, error) {
	var kubeConfig *rest.Config
	var err error

	if len(config.Kubeconfig) == 0 && len(masterOverride) == 0 {
		klog.Info("Neither kubeconfig file nor master URL was specified. Falling back to in-cluster config.")
		kubeConfig, err = rest.InClusterConfig()
	} else {
		// This creates a client, first loading any specified kubeconfig
		// file, and then overriding the Master flag, if non-empty.
		kubeConfig, err = clientcmd.NewNonInteractiveDeferredLoadingClientConfig(
			&clientcmd.ClientConfigLoadingRules{ExplicitPath: config.Kubeconfig},
			&clientcmd.ConfigOverrides{ClusterInfo: clientcmdapi.Cluster{Server: masterOverride}}).ClientConfig()
	}
  2. kubelet "network ready": this has to be done by kind when configuring the node, but we have to be friendly with other CNIs and only add the placeholder if DisableDefaultCNI is false, because containerd by default will only load one CNI config file.
    https://github.com/containerd/cri/blob/master/docs/config.md#cri-plugin-config-guide
  # 'plugins."io.containerd.grpc.v1.cri".cni' contains config related to cni
  [plugins."io.containerd.grpc.v1.cri".cni]
    # bin_dir is the directory in which the binaries for the plugin is kept.
    bin_dir = "/opt/cni/bin"

    # conf_dir is the directory in which the admin places a CNI conf.
    conf_dir = "/etc/cni/net.d"

    # max_conf_num specifies the maximum number of CNI plugin config files to
    # load from the CNI config directory. By default, only 1 CNI plugin config
    # file will be loaded. If you want to load multiple CNI plugin config files
    # set max_conf_num to the number desired. Setting max_config_num to 0 is
    # interpreted as no limit is desired and will result in all CNI plugin
    # config files being loaded from the CNI config directory.
    max_conf_num = 1
  3. implement the TODO and make it a real controller
    // Gets the Nodes information from the API
    // TODO: use a proper controller instead

I have started with 3, and I have to do some changes in kindnet for dual stack, so I will take 1 and 3.

If you want, you can take 2 and drop an empty CNI config if kind is going to use kindnetd.

@aojea aojea added this to the 1.0 milestone Jan 27, 2021
@aojea aojea added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jan 27, 2021
@llhuii
Author

llhuii commented Jan 27, 2021

OK, I am glad to take 2.

@llhuii
Author

llhuii commented Jan 27, 2021

I found two ways to implement the 2nd improvement:

  1. write the cni placeholder to /etc/cni/net.d/10-kindnet.conflist before this line
    // setup nodes reconcile function, closes over arguments
    reconcileNodes := makeNodesReconciler(cniConfigWriter, hostIP)
  2. add the placeholder by node.Command before this line
    // install the manifest
    if err := node.Command(
    "kubectl", "create", "--kubeconfig=/etc/kubernetes/admin.conf",
    "-f", "-",
    ).SetStdin(strings.NewReader(manifest)).Run(); err != nil {

I prefer way 1 because only kindnet knows what the filename and content of the CNI config should be,
though it would be slower to reach node-ready than way 2.

Which one do you prefer? Or any other better ways?

@aojea
Contributor

aojea commented Jan 27, 2021

If we wait for kindnet, we still have to wait for kubelet to create the kindnet pod.
If the goal is for kubelet to set its status to ready ASAP, it has to be 2.
What do you think @BenTheElder?

@llhuii
Author

llhuii commented Jan 31, 2021

If we wait for kindnet, we still have to wait for kubelet to create the kindnet pod.
If the goal is for kubelet to set its status to ready ASAP, it has to be 2.
What do you think @BenTheElder?

#2048 for this

@BenTheElder
Member

The CNI placeholder does not seem to perform well; however, bypassing the kube-apiserver VIP for the real endpoint is in #2043. Note that it is not in a shipping node image yet; it will only be present when building a new node image.

@BenTheElder BenTheElder added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Feb 21, 2021
@aojea
Contributor

aojea commented Mar 11, 2021

Should we close this with #2043, @BenTheElder?

@BenTheElder
Member

#2119 is what I was looking for to close this.

@llhuii
Author

llhuii commented Mar 12, 2021

Thank you, I will try it when I'm free.

@llhuii
Author

llhuii commented Mar 18, 2021

Tested with kind v0.11.0-alpha:

  1. T: node created
  2. T+27 seconds: node ready
  3. T+42 seconds: coredns ready

and with kind v0.10.0:

  1. T: node created
  2. T+51 seconds: node ready
  3. T+64 seconds: coredns ready

So node readiness was shortened by 24 seconds 🎉
