
DNS resolution failed when there is app Pod and the coredns Pod on the same node #60

Open
tsubasaxZZZ opened this issue Jun 16, 2020 · 5 comments


@tsubasaxZZZ

I need your help. I finished all the steps of the hard way, and I'm now checking name resolution from Pods. I noticed that name resolution fails in one case.

The Pod deployment is as follows:

  • coredns-59845f77f8-w26gc Pod is on worker-1
  • util3 is on worker-1
  • util4 is on worker-0
kuberoot@controller-0:~/cilium-master$ k get po -o wide -A
NAMESPACE              NAME                                         READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
default                busybox                                      1/1     Running   10         99d     10.200.0.19   worker-0   <none>           <none>
default                busybox2                                     1/1     Running   1          114m    10.200.1.17   worker-1   <none>           <none>
default                nginx                                        1/1     Running   3          99d     10.200.0.17   worker-0   <none>           <none>
default                sample-2pod                                  2/2     Running   2          94d     10.200.0.18   worker-0   <none>           <none>
default                sample-pod                                   1/1     Running   2          95d     10.200.1.14   worker-1   <none>           <none>
default                ubuntu1                                      1/1     Running   1          112m    10.200.0.20   worker-0   <none>           <none>
default                ubuntu2                                      1/1     Running   1          112m    10.200.1.18   worker-1   <none>           <none>
default                util                                         1/1     Running   0          5h27m   10.200.1.16   worker-1   <none>           <none>
default                util2                                        1/1     Running   0          92m     10.200.0.21   worker-0   <none>           <none>
default                util3                                        1/1     Running   0          70m     10.200.1.19   worker-1   <none>           <none>
default                util4                                        1/1     Running   0          70m     10.200.0.22   worker-0   <none>           <none>
kube-system            coredns-59845f77f8-w26gc                     1/1     Running   0          22m     10.200.1.20   worker-1   <none>           <none>
kubernetes-dashboard   dashboard-metrics-scraper-7b8b58dc8b-m78x4   1/1     Running   3          99d     10.200.0.16   worker-0   <none>           <none>
kubernetes-dashboard   kubernetes-dashboard-866f987876-dxm4c        1/1     Running   9          99d     10.200.1.15   worker-1   <none>           <none>
kuberoot@controller-0:~/cilium-master$
kuberoot@controller-0:~/cilium-master$ k get svc -o wide -A
NAMESPACE              NAME                        TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)                  AGE    SELECTOR
default                busybox                     ClusterIP   10.32.0.33    <none>        80/TCP                   117m   run=busybox
default                kubernetes                  ClusterIP   10.32.0.1     <none>        443/TCP                  99d    <none>
default                nginx                       NodePort    10.32.0.41    <none>        80:30557/TCP             99d    run=nginx
default                util                        ClusterIP   10.32.0.202   <none>        80/TCP                   123m   run=util
kube-system            kube-dns                    ClusterIP   10.32.0.10    <none>        53/UDP,53/TCP,9153/TCP   99d    k8s-app=kube-dns
kubernetes-dashboard   dashboard-metrics-scraper   ClusterIP   10.32.0.102   <none>        8000/TCP                 99d    k8s-app=dashboard-metrics-scraper
kubernetes-dashboard   kubernetes-dashboard        ClusterIP   10.32.0.135   <none>        443/TCP                  99d    k8s-app=kubernetes-dashboard

Scenario 1 - name resolution from util3 (the coredns Pod is on the same node):

$ k exec -it util3 -- nslookup www.bing.com
;; reply from unexpected source: 10.200.1.20#53, expected 10.32.0.10#53
;; reply from unexpected source: 10.200.1.20#53, expected 10.32.0.10#53
;; reply from unexpected source: 10.200.1.20#53, expected 10.32.0.10#53
;; connection timed out; no servers could be reached

command terminated with exit code 1

Scenario 2 - name resolution from util4 (the coredns Pod is NOT on the same node):

$ k exec -it util4 -- nslookup www.bing.com
Server:         10.32.0.10
Address:        10.32.0.10#53

Non-authoritative answer:
www.bing.com    canonical name = a-0001.a-afdentry.net.trafficmanager.net.
a-0001.a-afdentry.net.trafficmanager.net        canonical name = dual-a-0001.a-msedge.net.
Name:   dual-a-0001.a-msedge.net
Address: 13.107.21.200
Name:   dual-a-0001.a-msedge.net
Address: 204.79.197.200
Name:   dual-a-0001.a-msedge.net
Address: 2620:1ec:c11::200

I tried deleting the coredns Pod so that it was rescheduled onto the other node, and the results were reversed, as I expected: resolution then failed from util4 and succeeded from util3. I would like name resolution to work in both scenarios.
Could you give me some ideas?
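One way to narrow this down (a diagnostic sketch; the addresses are the coredns Pod IP and the kube-dns Service IP taken from the listings above) is to query the coredns Pod directly, bypassing the Service VIP:

```shell
# Direct query to the coredns Pod IP (10.200.1.20 per the listing above)
k exec -it util3 -- nslookup www.bing.com 10.200.1.20

# Query via the kube-dns Service VIP -- this is the path that times out
k exec -it util3 -- nslookup www.bing.com 10.32.0.10
```

If the direct query succeeds while the VIP query fails, the problem is in the Service NAT path on the node, not in coredns itself.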

@xaocon

xaocon commented Jul 6, 2020

kubernetes/kubernetes#21613

modprobe br_netfilter on the workers
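A quick way to apply and verify this on each worker (a sketch; the sysctl shown is the one br_netfilter exposes, commonly required so that bridged same-node Pod traffic traverses the node's iptables rules):

```shell
# Load the bridge netfilter module now
sudo modprobe br_netfilter

# Confirm it is loaded
lsmod | grep br_netfilter

# With the module loaded, make bridged traffic hit iptables,
# so kube-proxy's DNAT/SNAT rules see same-node Service traffic
sudo sysctl net.bridge.bridge-nf-call-iptables=1
```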

@tsubasaxZZZ
Author

@xaocon Thank you!!! I tried it and it worked!

@tsubasaxZZZ
Author

I'm writing down how to enable this module permanently, just in case:

root@worker-1:/etc/modules-load.d# cat /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
br_netfilter
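On systemd-based distributions, the same can be done with drop-in files instead of editing /etc/modules (a sketch; the file names under /etc/modules-load.d and /etc/sysctl.d are my own choice):

```shell
# Load br_netfilter at every boot (read by systemd-modules-load)
echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf

# Keep the bridge-netfilter sysctl set across reboots
echo 'net.bridge.bridge-nf-call-iptables = 1' | sudo tee /etc/sysctl.d/99-kubernetes.conf

# Apply both without rebooting
sudo systemctl restart systemd-modules-load.service
sudo sysctl --system
```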

@xaocon

xaocon commented Jul 7, 2020

Hey @tsubasaxZZZ, glad that helped. I think you should reopen this until there's a PR to fix it for everyone.

@tsubasaxZZZ tsubasaxZZZ reopened this Jul 7, 2020
@ivanfioravanti
Owner

Please create a PR for this, it's great! @tsubasaxZZZ and @xaocon
