docker build dns resolution fails after a restart #66

Open
lestephane opened this issue Mar 16, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@lestephane

Reproduction

  • start machine
  • wait for vpn to be up
  • snap remove --purge docker
  • snap install docker
  • docker build works
  • restart machine
  • docker build fails with Temporary failure in name resolution
$ docker build ...
...
Get "https://ghcr.io/v2/": dial tcp: lookup ghcr.io: Temporary failure in name resolution

snap stop --disable followed by snap start --enable does not help.

$ tree /var/snap/docker/current/
/var/snap/docker/current/
├── config
│   └── daemon.json
└── etc
    ├── docker
    │   └── key.json
    └── gitconfig
$ cat /var/snap/docker/current/config/daemon.json
{
    "log-level":        "error",
    "storage-driver":   "overlay2"
}
$ sudo systemctl status snap.docker.dockerd.service
● snap.docker.dockerd.service - Service for snap application docker.dockerd
     Loaded: loaded (/etc/systemd/system/snap.docker.dockerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2022-03-16 19:52:28 EET; 10min ago
   Main PID: 117245 (dockerd)
      Tasks: 37 (limit: 18775)
     Memory: 76.0M
     CGroup: /system.slice/snap.docker.dockerd.service
             ├─117245 dockerd --group docker --exec-root=/run/snap.docker --data-root=/var/snap/docker/common/var-lib-docker ...
                ...(continued) --pidfile=/run/snap.docker/docker.pid --config-file=/var/snap/docker/1458/config/daemon.json
             └─117318 containerd --config /run/snap.docker/containerd/containerd.toml --log-level error
$ cat /var/snap/docker/1458/config/daemon.json
{
    "log-level":        "error",
    "storage-driver":   "overlay2"
}

Is there a way to tell dockerd to refresh its DNS configuration, or to inspect what its current configuration is?
There's no point in hardcoding a DNS server in daemon.json, since it changes every time I connect to
a different VPN. Restarting the docker daemon would be fine if I knew how to tell it to "use the current
DNS from resolv.conf".

I suspect this error happens because of a race condition between the VPN daemon and dockerd when my
laptop starts: the VPN comes up after dockerd has already decided which DNS server to use. The
moment the VPN comes up, the DNS server in resolv.conf changes, and dockerd is left using a stale value.
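
One way to check the stale-value hypothesis above is to compare the nameservers in the host's resolv.conf with the resolv.conf that the dockerd process itself can see. The sketch below is only a diagnostic aid under assumptions: the helper names `nameservers` and `dns_differs` are made up for illustration, and it assumes that `/proc/<pid>/root/etc/resolv.conf` reflects what dockerd's resolver reads, which nothing in this thread confirms. The pidfile path is taken from the `systemctl status` output above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of docker or snapd.

# Print the nameserver addresses from a resolv.conf-style file, one per line.
nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Succeed (exit 0) when the two files list different nameservers.
dns_differs() {
    [ "$(nameservers "$1")" != "$(nameservers "$2")" ]
}

# Possible usage (reading /proc/<pid>/root normally requires root):
#   pid=$(cat /run/snap.docker/docker.pid)
#   if dns_differs /etc/resolv.conf "/proc/$pid/root/etc/resolv.conf"; then
#       echo "dockerd sees different nameservers than the host"
#   fi
```

If the two lists differ after the VPN comes up, that would support the race-condition explanation; if they match, the cause is probably elsewhere.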

@tianon
Contributor

tianon commented Mar 17, 2022

Huh, very interesting. I think this error is actually coming from dockerd itself, which IIRC would respond to resolv.conf changes. (The daemon "DNS" configuration isn't used for looking up registry domains, IIRC; I believe that configuration is just a default for containers.) That makes me think that perhaps dockerd itself isn't getting the updated resolv.conf? Maybe you need to restart all of snapd as well? 😬

@neurer

neurer commented Mar 28, 2022

I think my issue might be related. Recently moved to Ubuntu 22.04 dev release. Followed the familiar steps and have Docker Snap (docker 20.10.12 1690 latest/stable) up and running. ufw is enabled; no additional config.

docker-compose up -d --build comes up with this:

W: Failed to fetch http://deb.debian.org/debian/dists/bullseye/InRelease Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://security.debian.org/debian-security/dists/bullseye-security/InRelease Temporary failure resolving 'security.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye-updates/InRelease Temporary failure resolving 'deb.debian.org'
W: Some index files failed to download. They have been ignored, or old ones used instead.

ufw disable is required for it to build/finish. ufw enable followed by docker exec -it whatever bash and apt update then gives me this:

Err:1 http://deb.debian.org/debian bullseye InRelease
Could not connect to deb.debian.org:80 (151.101.14.132), connection timed out
Err:2 http://security.debian.org/debian-security bullseye-security InRelease
Could not connect to security.debian.org:80 (151.101.66.132), connection timed out Could not connect to security.debian.org:80 (151.101.2.132), connection timed out Could not connect to security.debian.org:80 (151.101.130.132), connection timed out Could not connect to security.debian.org:80 (151.101.194.132), connection timed out
Err:3 http://deb.debian.org/debian bullseye-updates InRelease
Could not connect to deb.debian.org:80 (151.101.242.132), connection timed out
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
18 packages can be upgraded. Run 'apt list --upgradable' to see them.
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye/InRelease Could not connect to deb.debian.org:80 (151.101.14.132), connection timed out
W: Failed to fetch http://security.debian.org/debian-security/dists/bullseye-security/InRelease Could not connect to security.debian.org:80 (151.101.66.132), connection timed out Could not connect to security.debian.org:80 (151.101.2.132), connection timed out Could not connect to security.debian.org:80 (151.101.130.132), connection timed out Could not connect to security.debian.org:80 (151.101.194.132), connection timed out
W: Failed to fetch http://deb.debian.org/debian/dists/bullseye-updates/InRelease Could not connect to deb.debian.org:80 (151.101.242.132), connection timed out
W: Some index files failed to download. They have been ignored, or old ones used instead.

After ufw disable, apt works fine again. This is new behavior. Any suggestions would be much appreciated.

@shakeelansari63

shakeelansari63 commented Aug 28, 2023

I guess there is some sort of race condition between NetworkManager and the docker daemon, which eventually breaks the docker daemon.

As a workaround, I have permanently disabled the docker daemon's auto-start using sudo snap stop --disable docker.

Then restart the PC.

Whenever I need docker, I simply start it with sudo snap start docker.

Now there is no conflict, since docker does not auto-start and I only start it when I need it.

For easy access, I have aliased the docker start and stop commands.

~/.bashrc

alias start-docker='sudo snap start docker'
alias stop-docker='sudo snap stop docker'
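
A possible refinement of this workaround is to start docker only once name resolution actually works, instead of relying on timing. The sketch below is illustrative only: `retry_until` is a hypothetical helper, the retry count and delay are arbitrary, and it assumes that a successful `getent hosts` lookup of the registry from the original report means the VPN's DNS is ready, which may not hold on every setup.

```shell
#!/bin/sh
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds, giving up after
# MAX_TRIES failed attempts. Sleeps DELAY_SECONDS between attempts.
retry_until() {
    max_tries=$1
    delay=$2
    shift 2
    try=1
    while ! "$@"; do
        if [ "$try" -ge "$max_tries" ]; then
            return 1
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 0
}

# Possible usage in place of the plain alias (values are assumptions):
#   retry_until 30 2 getent hosts ghcr.io && sudo snap start docker
```

Passing the probe as a command keeps the helper generic, so a different readiness check can be swapped in without changing the loop.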

@locnnil added the bug label Oct 9, 2024