Contiv installer - Intermittent Install failures seen w/ latest 1.1.7 installer bits #340
Comments
Looking at the attached logs contiv_install_01-22-2018.09-34-14.UTC.log and contiv_install_01-25-2018.05-56-47.UTC.log, I see that the failures occurred while the Contiv Docker v2plugin was being installed. The following command failed on both master and worker nodes in the logs:
Can you send the logs in
Worker node install failures - Master node install failures - (as observed on 2nd cluster) - in this case the master nodes don't have Also
@rkharya: Have you reproduced this on CentOS or on another distribution?
@unclejack: Reproducible on RHEL 7.3 environments - bare metal and bare metal with VMs
Description
v2plugin installation failures were seen multiple times on 2 different setups.
The error messages differ between Contiv master nodes and Contiv worker nodes.
Expected Behavior
The Contiv install should succeed on all master and worker nodes without any errors.
Observed Behavior
The issue is intermittent, but one pattern is consistent: after a complete clean-up of Contiv bits from the Docker Swarm cluster, the first installation attempt fails, and a subsequent retry eventually succeeds in installing Contiv. This behaviour is seen only with the latest code changes made to the 1.1.7 release about 20 days ago. We did not see this issue during the CVD validation cycle, up to the CVD release on December 18, 2017.
## Master node install failures
TASK [contiv_network : install v2plugin on master nodes] ***********************
fatal: [node2]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.122.63 control_url=10.65.122.63:9999 vxlan_port=8472 iflist=eno6 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=master fwd_mode=bridge", "delta": "0:06:11.601524", "end": "2018-01-22 15:11:25.034534", "failed": true, "rc": 1, "start": "2018-01-22 15:05:13.433010", "stderr": "Error response from daemon: dial unix /run/docker/plugins/330e5e6cb7025e7c40805912541ff706fad4d35eb4bb34b877ea5004dfcf8511/netplugin.sock: connect: connection refused", "stderr_lines": ["Error response from daemon: dial unix /run/docker/plugins/330e5e6cb7025e7c40805912541ff706fad4d35eb4bb34b877ea5004dfcf8511/netplugin.sock: connect: connection refused"], "stdout": "1.1.7: Pulling from contiv/v2plugin\n1ba3fc0d8c93: Verifying Checksum\n1ba3fc0d8c93: Download complete\nDigest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30\nStatus: Downloaded newer image for contiv/v2plugin:1.1.7", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin", "1ba3fc0d8c93: Verifying Checksum", "1ba3fc0d8c93: Download complete", "Digest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30", "Status: Downloaded newer image for contiv/v2plugin:1.1.7"]}
fatal: [node1]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.122.61 control_url=10.65.122.61:9999 vxlan_port=8472 iflist=eno6 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=master fwd_mode=bridge", "delta": "0:06:12.083192", "end": "2018-01-22 15:11:25.836960", "failed": true, "rc": 1, "start": "2018-01-22 15:05:13.753768", "stderr": "Error response from daemon: dial unix /run/docker/plugins/6f11c1b2fea19a72d9aa2ef95c0e85c224891f982826f815ff8a556dc640e48c/netplugin.sock: connect: no such file or directory", "stderr_lines": ["Error response from daemon: dial unix /run/docker/plugins/6f11c1b2fea19a72d9aa2ef95c0e85c224891f982826f815ff8a556dc640e48c/netplugin.sock: connect: no such file or directory"], "stdout": "1.1.7: Pulling from contiv/v2plugin\n1ba3fc0d8c93: Verifying Checksum\n1ba3fc0d8c93: Download complete\nDigest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30\nStatus: Downloaded newer image for contiv/v2plugin:1.1.7", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin", "1ba3fc0d8c93: Verifying Checksum", "1ba3fc0d8c93: Download complete", "Digest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30", "Status: Downloaded newer image for contiv/v2plugin:1.1.7"]}
fatal: [node3]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.122.62 control_url=10.65.122.62:9999 vxlan_port=8472 iflist=eno6 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=master fwd_mode=bridge", "delta": "0:06:12.404043", "end": "2018-01-22 15:11:25.136644", "failed": true, "rc": 1, "start": "2018-01-22 15:05:12.732601", "stderr": "Error response from daemon: dial unix /run/docker/plugins/9c15133fdbe9ee55f4054b0f3af7fbd9be9ae8efc0bfd72d70b791f3ecfb27fd/netplugin.sock: connect: no such file or directory", "stderr_lines": ["Error response from daemon: dial unix /run/docker/plugins/9c15133fdbe9ee55f4054b0f3af7fbd9be9ae8efc0bfd72d70b791f3ecfb27fd/netplugin.sock: connect: no such file or directory"], "stdout": "1.1.7: Pulling from contiv/v2plugin\n1ba3fc0d8c93: Verifying Checksum\n1ba3fc0d8c93: Download complete\nDigest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30\nStatus: Downloaded newer image for contiv/v2plugin:1.1.7", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin", "1ba3fc0d8c93: Verifying Checksum", "1ba3fc0d8c93: Download complete", "Digest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30", "Status: Downloaded newer image for contiv/v2plugin:1.1.7"]}
to retry, use: --limit @/ansible/install_plays.retry
PLAY RECAP *********************************************************************
node1 : ok=17 changed=9 unreachable=0 failed=1
node2 : ok=17 changed=9 unreachable=0 failed=1
node3 : ok=17 changed=9 unreachable=0 failed=1
node4 : ok=9 changed=4 unreachable=0 failed=0
node5 : ok=9 changed=4 unreachable=0 failed=0
node6 : ok=9 changed=4 unreachable=0 failed=0
node7 : ok=9 changed=4 unreachable=0 failed=0
node8 : ok=9 changed=4 unreachable=0 failed=0
node9 : ok=9 changed=4 unreachable=0 failed=0
## Worker node install failures
TASK [contiv_network : install v2plugin on worker nodes] ***********************
fatal: [node6]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.140 control_url=10.65.121.140:9999 vxlan_port=8472 iflist=ens192 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:04:51.934836", "end": "2018-01-25 11:38:37.231374", "failed": true, "rc": 1, "start": "2018-01-25 11:33:45.296538", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
fatal: [node7]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.141 control_url=10.65.121.141:9999 vxlan_port=8472 iflist=ens192 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:04:52.343379", "end": "2018-01-25 11:38:44.770569", "failed": true, "rc": 1, "start": "2018-01-25 11:33:52.427190", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
fatal: [node4]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.142 control_url=10.65.121.142:9999 vxlan_port=8472 iflist=ens192 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:04:52.475222", "end": "2018-01-25 11:38:46.382501", "failed": true, "rc": 1, "start": "2018-01-25 11:33:53.907279", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
fatal: [node8]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.130 control_url=10.65.121.130:9999 vxlan_port=8472 iflist=ens192 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:04:54.685860", "end": "2018-01-25 11:38:48.099427", "failed": true, "rc": 1, "start": "2018-01-25 11:33:53.413567", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
fatal: [node5]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.143 control_url=10.65.121.143:9999 vxlan_port=8472 iflist=ens192 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:04:55.817107", "end": "2018-01-25 11:38:49.210135", "failed": true, "rc": 1, "start": "2018-01-25 11:33:53.393028", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
fatal: [node12]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.129 control_url=10.65.121.129:9999 vxlan_port=8472 iflist=ens192 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:01:54.202116", "end": "2018-01-25 11:40:35.330632", "failed": true, "rc": 1, "start": "2018-01-25 11:38:41.128516", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
fatal: [node11]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.128 control_url=10.65.121.128:9999 vxlan_port=8472 iflist=ens192 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:01:56.424311", "end": "2018-01-25 11:40:43.263658", "failed": true, "rc": 1, "start": "2018-01-25 11:38:46.839347", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
fatal: [node9]: FAILED! => {"changed": true, "cmd": "/usr/bin/docker plugin install --grant-all-permissions contiv/v2plugin:1.1.7 ctrl_ip=10.65.121.124 control_url=10.65.121.124:9999 vxlan_port=8472 iflist=eno6 plugin_name=contiv/v2plugin:1.1.7 cluster_store=etcd://localhost:2379 plugin_role=worker fwd_mode=bridge", "delta": "0:02:54.790835", "end": "2018-01-25 11:41:46.656811", "failed": true, "rc": 1, "start": "2018-01-25 11:38:51.865976", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
changed: [node10]
PLAY RECAP *********************************************************************
node1 : ok=38 changed=19 unreachable=0 failed=0
node10 : ok=23 changed=14 unreachable=0 failed=0
node11 : ok=16 changed=9 unreachable=0 failed=1
node12 : ok=16 changed=9 unreachable=0 failed=1
node2 : ok=37 changed=18 unreachable=0 failed=0
node3 : ok=37 changed=18 unreachable=0 failed=0
node4 : ok=16 changed=9 unreachable=0 failed=1
node5 : ok=16 changed=9 unreachable=0 failed=1
node6 : ok=16 changed=9 unreachable=0 failed=1
node7 : ok=16 changed=9 unreachable=0 failed=1
node8 : ok=16 changed=9 unreachable=0 failed=1
node9 : ok=16 changed=9 unreachable=0 failed=1
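For anyone triaging similar logs, the recap above can be reduced to a list of failed hosts. This is only a rough sketch, assuming the recap lines keep the `host : key=value ...` shape shown in this report; `failed_hosts` is a hypothetical helper name and is not part of the Contiv installer.

```python
import re

# Matches recap lines such as: "node4 : ok=16 changed=9 unreachable=0 failed=1"
RECAP_RE = re.compile(r'^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$')


def failed_hosts(recap_lines):
    """Return hosts whose recap line reports failed or unreachable tasks."""
    failed = []
    for line in recap_lines:
        match = RECAP_RE.match(line.strip())
        if not match:
            # Skips the "PLAY RECAP ****" banner and any unrelated lines.
            continue
        counters = dict(pair.split("=") for pair in match.group("counters").split())
        if int(counters.get("failed", 0)) > 0 or int(counters.get("unreachable", 0)) > 0:
            failed.append(match.group("host"))
    return failed
```

Feeding it the lines of a `PLAY RECAP` section returns the host names in the order they appear in the log.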
## Worker node failure key error message
"failed": true, "rc": 1, "start": "2018-01-25 11:33:45.296538", "stderr": "failed to download: unexpected EOF", "stderr_lines": ["failed to download: unexpected EOF"], "stdout": "1.1.7: Pulling from contiv/v2plugin", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin"]}
## Master node failure key error message
"stderr": "Error response from daemon: dial unix /run/docker/plugins/330e5e6cb7025e7c40805912541ff706fad4d35eb4bb34b877ea5004dfcf8511/netplugin.sock: connect: connection refused", "stderr_lines": ["Error response from daemon: dial unix /run/docker/plugins/330e5e6cb7025e7c40805912541ff706fad4d35eb4bb34b877ea5004dfcf8511/netplugin.sock: connect: connection refused"], "stdout": "1.1.7: Pulling from contiv/v2plugin\n1ba3fc0d8c93: Verifying Checksum\n1ba3fc0d8c93: Download complete\nDigest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30\nStatus: Downloaded newer image for contiv/v2plugin:1.1.7", "stdout_lines": ["1.1.7: Pulling from contiv/v2plugin", "1ba3fc0d8c93: Verifying Checksum", "1ba3fc0d8c93: Download complete", "Digest: sha256:2b610546b385bcc46ca6c76a9be7fd859a3abf4b37f529ba9df41a4dc3853c30", "Status: Downloaded newer image for contiv/v2plugin:1.1.7"]}
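To compare the two failure modes across nodes, the `fatal:` lines in the attached logs can be grouped by their `stderr` value. The following is a minimal sketch, assuming each failure is logged on a single line in the `fatal: [host]: FAILED! => {json}` form shown above; `summarize_failures` is a hypothetical helper, not something shipped with the installer. Plugin IDs in the socket path are masked so that the same error on different nodes groups together.

```python
import json
import re
from collections import defaultdict

# Matches lines such as: fatal: [node6]: FAILED! => {"changed": true, ...}
FATAL_RE = re.compile(r'^fatal: \[(?P<host>[^\]]+)\]: FAILED! => (?P<payload>\{.*\})$')

# 64-character hex plugin IDs, as seen in /run/docker/plugins/<id>/netplugin.sock
PLUGIN_ID_RE = re.compile(r'[0-9a-f]{64}')


def summarize_failures(log_lines):
    """Map each distinct stderr message to the list of hosts that reported it."""
    by_error = defaultdict(list)
    for line in log_lines:
        match = FATAL_RE.match(line.strip())
        if not match:
            continue
        try:
            payload = json.loads(match.group("payload"))
        except json.JSONDecodeError:
            # Leave lines alone if the payload is not valid JSON.
            continue
        stderr = PLUGIN_ID_RE.sub("<id>", payload.get("stderr", ""))
        by_error[stderr].append(match.group("host"))
    return dict(by_error)
```

Run against the two logs in this report, this should separate the download failures on the workers from the `netplugin.sock` connection errors on the masters.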
Steps to Reproduce (for bugs)
./install/ansible/install_swarm.sh -f install/ansible/cfg.yml -u root -e ~/.ssh/id_rsa -p
Your Environment
## Attached installation logs
contiv_install_01-22-2018.09-34-14.UTC.log
contiv_install_01-25-2018.05-56-47.UTC.log