Initial BGP sync during kube-router startup extremely slow in kubernetes v1.29 #1668
I don't know if it's of any concern, but checking some kube-router pods and
Unfortunately I can't think of any reason why the Kubernetes version would affect BGP convergence. In the past, this type of thing has only ever come down to a couple of causes.

The fact that it took 5 minutes looks suspiciously like the default BGP Graceful Restart deferral time: https://github.com/cloudnativelabs/kube-router/blob/v1.4/pkg/options/options.go#L80 (granted, that one is technically 6 minutes).

As an aside, I would not run kube-router v1.4.0 against a 1.29 Kubernetes cluster. The client libraries it is built against are for Kubernetes v1.22, which is well outside the 2 minor version skew that the upstream project allows. The fact that it is able to run at all is a testament to Kubernetes' backwards compatibility. Beyond this, v1.4.0 is more than 2 years behind active development, and the project won't be able to help further on this issue.
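For reference, that deferral window is configurable; a minimal sketch of shortening it, assuming the flag names shown in the linked options.go (verify them against your kube-router version):

```sh
# Sketch only: shorten the BGP Graceful Restart deferral window (default 360s).
# Flag names assumed from pkg/options/options.go; confirm for your kube-router version.
kube-router --run-router=true \
  --bgp-graceful-restart \
  --bgp-graceful-restart-deferral-time=90s
```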
Thanks for your response @aauren! I think I found out that this issue actually is due to changes made in Kubernetes v1.29, and it happens when you run kubelet with an external cloud provider.

Anyway, digging further I noticed that it kind of looked like the new node was only added as a valid BGP neighbor once the network route controller ran a sync (default every 5 min). I assume the NRC re-reads all nodes from the API when a sync starts. I started looking into the changes made in kubelet for 1.29 and noticed kubernetes/kubernetes#121028. I have yet to confirm this, but it seems like you can get the old behavior back (not waiting for the external cloud controller to populate the address list) by passing `--node-ip`.

I'm going to try passing --node-ip to kubelet and see if it helps; just wanted to report back in case someone else runs into this in later versions, as the issue may persist there as well.

edit: some clarifications, etc.
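As a hedged illustration of that workaround (not part of the original comment), the node IP can be set explicitly so kubelet does not wait for the external cloud controller, for example via the kubeadm-style KUBELET_EXTRA_ARGS drop-in; `$IPADDR` is a placeholder resolved during node bootstrap:

```sh
# Sketch: set the node IP explicitly instead of waiting for the external
# cloud controller to populate the node addresses.
# Typically placed in /etc/default/kubelet or /etc/sysconfig/kubelet.
KUBELET_EXTRA_ARGS="--cloud-provider=external --node-ip=${IPADDR}"
```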
Thanks for reporting back about what you found! I think this will be helpful for future people running into this issue. Please let the thread know how the addition of the node-ip parameter changes things, if at all. The issue that we have open here #676 is specific to node annotations, but part of that work will be to watch nodes in general. Maybe we could watch for address updates as well whenever we work on that.
I've verified that passing `--node-ip` to kubelet, instead of letting the external cloud controller populate the node addresses, fixes the slow initial BGP sync.

In AWS it's possible to get the internal IP via the IMDS while bootstrapping the node and pass something like:

# -- 8< -- 8< --
nodeRegistration:
  name: "$LOCAL_HOSTNAME"
  kubeletExtraArgs:
    cloud-provider: "$CLOUD_PROVIDER"
    cloud-config: "$CLOUD_CONFIG"
    cluster-dns: 169.254.20.10
    node-ip: "$IPADDR"
# -- 8< -- 8< --

(the above variables obviously need to be expanded prior to use)

edit: clarifications, some example
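A rough sketch (not from the original comment) of how `$IPADDR` could be resolved from the instance metadata service during bootstrap, assuming IMDSv2 is enabled:

```sh
# Sketch: fetch the node's primary private IPv4 from AWS IMDSv2.
TOKEN=$(curl -sS -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
IPADDR=$(curl -sS -H "X-aws-ec2-metadata-token: $TOKEN" \
  "http://169.254.169.254/latest/meta-data/local-ipv4")
echo "resolved node-ip: $IPADDR"
```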
This appears to be related to an ongoing upstream conversation: kubernetes/kubernetes#125348
What happened?
After the cluster was upgraded from v1.28 to v1.29, we started observing increased startup times on workloads due to the network not being available for ~5 minutes. Checking the kube-router logs, we saw that the initial BGP sync takes several minutes instead of just a few seconds (at most). While the sync is running, no running pods can use the network until

time="2024-05-10T10:13:30Z" level=info msg="sync finished" Topic=Server

is reported.

What did you expect to happen?
Initial BGP sync should complete in the same amount of time as in Kubernetes v1.28
How can we reproduce the behavior you experienced?
Steps to reproduce the behavior:
**System Information (please complete the following information):**
- Kube-Router Version (`kube-router --version`): 1.4.0
- Kubernetes Version (`kubectl version`): v1.29.4
Additional context
Two kubeadm settings were changed during the upgrade:

- `node-cidr-mask-size: "25"` (from the default `/24`)
- `maxPods: 64`

We're aware that 1.4.0 is pretty old, but the issue seems to be somewhere in how BGP is handled or how peers are discovered. We've tried to pinpoint any other changes in our infrastructure, but haven't come up with anything other than the changed Kubernetes version.