DNS problem for k3s multicloud cluster #10900
This indicates that the wireguard mesh between nodes isn't functioning properly, and DNS traffic between the affected node and the node running the coredns pod is being dropped. Ensure that you've opened all the correct ports for wireguard, and that you have node external-IPs set correctly so that wireguard can establish the mesh between nodes.
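For reference, a minimal sketch of the firewall side, assuming flannel's `wireguard-native` backend on its default ports (UDP 51820, plus 51821 for dual-stack) and `ufw` as the firewall; adjust to your actual environment:

```bash
# Allow the k3s supervisor/API port and the wireguard-native ports between nodes.
# Ports assume k3s defaults; ufw is just an example firewall tool.
ufw allow 6443/tcp     # k3s supervisor / Kubernetes API (agents -> server)
ufw allow 51820/udp    # flannel wireguard-native (IPv4 mesh)
ufw allow 51821/udp    # flannel wireguard-native (IPv6 / dual-stack, if used)
```

Each node also needs its public address advertised (e.g. via `--node-external-ip`) so peers in other datacenters know where to reach it.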
Hi @brandond, thank you for the answer. Here is my cluster state:
For the main node, I'm checking connectivity according to this page: https://docs.k3s.io/installation/requirements#networking.
Here is my config for
TBH, I'm not sure which direction I should go in at this point, so any suggestions are welcome.
@manuelbuil do you have any tips on how to check wireguard connectivity between nodes?
@allnightlong could you run the following commands: 1 - Install wireguard-tools and then execute
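The exact command isn't preserved in this extract; a typical check with wireguard-tools, assuming flannel's wireguard-native backend (whose interface is usually named `flannel-wg`), would look something like:

```bash
# Install the userspace wireguard tools (Debian/Ubuntu shown; adapt for your distro).
apt-get install -y wireguard-tools

# Inspect the mesh interface created by flannel's wireguard-native backend.
# Each other node should appear as a peer with a recent "latest handshake".
wg show flannel-wg
```

A peer with a missing or stale handshake usually means wireguard traffic between those two nodes is being blocked, or the advertised endpoint address is wrong.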
Hi @manuelbuil, thank you for the answers. Here is my cluster's state:
146.185.xxx.xxx is a
If I run
Run those tests on all the nodes. You need full connectivity between all cluster members, since the coredns pod may run on any node.
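As a rough illustration of that full-mesh check (assuming the default k3s cluster-DNS service IP 10.43.0.10 and the `flannel-wg` interface name; adjust if your cluster differs), something like this can be run on every node:

```bash
# On each node: every other node should show up as a wireguard peer
# with a recent handshake.
wg show flannel-wg latest-handshakes

# Query the cluster DNS service directly from the node (default k3s service IP).
dig +short @10.43.0.10 kubernetes.default.svc.cluster.local
```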
You are right, @brandond,
I think I've figured out the problem. It was a combination of 2 factors:
My expectation was that connectivity only needed to be established between any agent node and the server node. Another expectation was that all
In this situation, my only request would be to make the documentation clearer about this, as I've spent quite some time trying to figure out the problem. And I didn't find any config option to move all
Great that you found the problem! Thanks for taking the effort
We can add more information in the docs, but right now it is stated that
Thank you for clearing things up for me!
As Manuel (and the docs) said, wireguard is a full mesh. What you're asking for is closer to what tailscale does. If you want something more like a star/hub-and-spoke, you should look into using tailscale. This is covered in the docs.
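For context, k3s ships an experimental Tailscale integration (described alongside the multicloud docs); a rough sketch of what that looks like, with placeholder values, is:

```bash
# Experimental Tailscale integration (see the k3s distributed/multicloud docs).
# <TAILSCALE_AUTH_KEY> is a placeholder for an auth key from the Tailscale admin console.
k3s server --vpn-auth="name=tailscale,joinKey=<TAILSCALE_AUTH_KEY>"

# Agents join the same tailnet the same way:
k3s agent --server=https://<SERVER_IP>:6443 --token=<TOKEN> \
  --vpn-auth="name=tailscale,joinKey=<TAILSCALE_AUTH_KEY>"
```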
I'm curious where this expectation came from. There is nothing special about pods in the kube-system namespace; they will run on any available node in the cluster, same as any other pod.
In my setup, I've already run into a problem when
That's why I don't want any of the system-important pods to run anywhere but core.
But I don't know how to force this.
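For illustration only (not from the original thread): the usual Kubernetes way to pin a workload such as CoreDNS to one node is a node label plus a nodeSelector. A rough sketch with made-up label and node names follows; note that k3s re-applies its packaged manifests, so an ad-hoc patch like this may be reverted and a durable change may need to go through the packaged manifest instead.

```bash
# Hypothetical label and node name; adjust to your cluster.
kubectl label node my-core-node workload/core="true"

# Pin the coredns deployment to nodes carrying that label.
kubectl -n kube-system patch deployment coredns --type merge \
  -p '{"spec":{"template":{"spec":{"nodeSelector":{"workload/core":"true"}}}}}'
```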
Discussed in #10897
Originally posted by allnightlong September 15, 2024
I'm building my cluster with nodes from different datacenters. The cluster has actually lived in one DC for some time with 5 nodes (1 server + 4 agents). Now I'm adding a new node in a different DC.
I'm using this tutorial as an example: https://docs.k3s.io/networking/distributed-multicloud#embedded-k3s-multicloud-solution
For server:
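(The original config block isn't preserved in this extract; per the linked tutorial, a server in the embedded multicloud setup is typically started roughly like this, with placeholder values:)

```bash
# Server: advertise the public IP and use wireguard-native with external IPs
# for the flannel mesh. <SERVER_EXTERNAL_IP> is a placeholder.
k3s server \
  --node-external-ip=<SERVER_EXTERNAL_IP> \
  --flannel-backend=wireguard-native \
  --flannel-external-ip
```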
For agent:
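(Likewise, the agent config isn't preserved; per the same tutorial, an agent would be joined roughly like this, with placeholder values:)

```bash
# Agent: join via the server's public address and advertise this node's public IP.
# <SERVER_EXTERNAL_IP>, <TOKEN> and <AGENT_EXTERNAL_IP> are placeholders.
k3s agent \
  --server=https://<SERVER_EXTERNAL_IP>:6443 \
  --token=<TOKEN> \
  --node-external-ip=<AGENT_EXTERNAL_IP>
```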
The problem is that none of the agent's pods can resolve any hostname. I'm following the official DNS debugging guide https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ and `nslookup` is failing for both internal and external requests.

Internal:
External:
External with Cloudflare's DNS:
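(The actual nslookup outputs aren't preserved in this extract; the checks follow the linked guide and would have looked roughly like this, using the guide's dnsutils pod:)

```bash
# Deploy the debugging pod from the Kubernetes DNS debugging guide.
kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml

# Internal name:
kubectl exec -i -t dnsutils -- nslookup kubernetes.default
# External name via the cluster DNS:
kubectl exec -i -t dnsutils -- nslookup google.com
# External name directly against Cloudflare's resolver, bypassing cluster DNS:
kubectl exec -i -t dnsutils -- nslookup google.com 1.1.1.1
```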
What could be the cause of this DNS issue, and how can I resolve it?
P.S. I'm using k3s version `v1.30.4+k3s1` (latest at the time of writing) for both the server and the agents.