Skip to content
This repository has been archived by the owner on Jan 21, 2022. It is now read-only.

BOSH cck fails when attempting to recreate containers #353

Open
ryantang opened this issue Mar 23, 2016 · 3 comments
Open

BOSH cck fails when attempting to recreate containers #353

ryantang opened this issue Mar 23, 2016 · 3 comments

Comments

@ryantang
Copy link

https://github.com/cloudfoundry/bosh-lite/blob/master/docs/bosh-cck.md

The above instructions no longer works to recover bosh-lite containers upon reboot of the host (AWS EC2 instance in this case). The strange thing for my scenario is that bosh cck fails but one of the containers is fixed upon completion of that command. A subsequent bosh deploy adds the two additional missing containers.

ubuntu@agent-id-bosh-0:~$ bosh -d etcd-manifest.yml cck

<skipped some output> 

Scan is complete, checking if any problems found...

Found 3 problems

Problem 1 of 3: etcd_z1/2 (ed696055-75b8-4449-8807-fb1d0d091538) (c6396aee-ce0b-4dc0-44b6-d96f67506dff) is not responding.
  1. Skip for now
  2. Reboot VM
  3. Recreate VM for 'etcd_z1/2 (ed696055-75b8-4449-8807-fb1d0d091538)'
  4. Delete VM reference (forceful; may need to manually delete VM from the Cloud to avoid IP conflicts)
Please choose a resolution [1 - 4]: 3

<skipped more output; chose "3" or "recreate" for all VMs>

Director task 24
  Started applying problem resolutions
  Started applying problem resolutions > unresponsive_agent 3: Recreate VM for 'etcd_z1/2 (ed696055-75b8-4449-8807-fb1d0d091538)'. Failed: Creating VM with agent ID '5eda1ec1-3f7c-4e9b-875e-73a9d75209d0': Creating container: container: start: exit status 2 (00:00:11)
  Started applying problem resolutions > unresponsive_agent 1: Recreate VM for 'etcd_z1/0 (783a020c-f5d5-449e-b0f4-e48f8c74dfa9)'. Failed: Creating VM with agent ID '4a51a656-07e3-4c42-89e3-69faad4b205c': Creating container: container: start: exit status 2 (00:00:11)
  Started applying problem resolutions > unresponsive_agent 2: Recreate VM for 'etcd_z1/1 (e4ecbcb5-4ecd-488a-98a7-ad51c1ed6bf8)'. Done (00:00:52)
   Failed applying problem resolutions (00:01:14)

Error 100: Error resolving problem `7': Creating VM with agent ID '5eda1ec1-3f7c-4e9b-875e-73a9d75209d0': Creating container: container: start: exit status 2
Error resolving problem `8': Creating VM with agent ID '4a51a656-07e3-4c42-89e3-69faad4b205c': Creating container: container: start: exit status 2

Task 24 error

A snippet from the debug log:

I, [2016-03-23 16:29:57 #2526] [task:24]  INFO -- DirectorJobRunner: Creating VM
D, [2016-03-23 16:29:57 #2526] [task:24] DEBUG -- DirectorJobRunner: External CPI sending request: {"method":"create_vm","arguments":["4a51a656-07e3-4c42-89e3-69faad4b205c","9ff9ea15-73e5-467b-7fef-67d079850c1e",{},{"etcd1":{"ip":"10.244.4.4","netmask":"255.255.255.0","cloud_properties":{"name":"random"},"default":["dns","gateway"],"gateway":"10.244.4.1"}},["dac4223b-97c2-4031-6bb5-8718c908485f"],{}],"context":{"director_uuid":"60e5fa30-1059-4a03-8d85-351fde555719"}} with command: /var/vcap/jobs/warden_cpi/bin/cpi
D, [2016-03-23 16:29:58 #2526] [task:24] DEBUG -- DirectorJobRunner: External CPI got response: {"result":null,"error":{"type":"Bosh::Clouds::CloudError","message":"Creating VM with agent ID '4a51a656-07e3-4c42-89e3-69faad4b205c': Creating container: container: start: exit status 2","ok_to_retry":false},"log":""}, err: , exit_status: pid 3077 exit 0
E, [2016-03-23 16:29:58 #2526] [task:24] ERROR -- DirectorJobRunner: error creating vm: Creating VM with agent ID '4a51a656-07e3-4c42-89e3-69faad4b205c': Creating container: container: start: exit status 2
E, [2016-03-23 16:29:58 #2526] [task:24] ERROR -- DirectorJobRunner: Error resolving problem `8': Creating VM with agent ID '4a51a656-07e3-4c42-89e3-69faad4b205c': Creating container: container: start: exit status 2
E, [2016-03-23 16:29:58 #2526] [task:24] ERROR -- DirectorJobRunner: /var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_cpi-1.3215.0/lib/cloud/external_cpi.rb:119:in `handle_error'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_cpi-1.3215.0/lib/cloud/external_cpi.rb:88:in `invoke_cpi_method'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh_cpi-1.3215.0/lib/cloud/external_cpi.rb:51:in `create_vm'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3215.0/lib/bosh/director/vm_creator.rb:118:in `create'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3215.0/lib/bosh/director/vm_creator.rb:53:in `create_for_instance_plan'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3215.0/lib/bosh/director/cloudcheck_helper.rb:62:in `recreate_vm'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3215.0/lib/bosh/director/problem_handlers/unresponsive_agent.rb:37:in `block (2 levels) in <class:UnresponsiveAgent>'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3215.0/lib/bosh/director/problem_handlers/base.rb:83:in `instance_eval'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3215.0/lib/bosh/director/problem_handlers/base.rb:83:in `apply_resolution'
/var/vcap/packages/director/gem_home/ruby/2.1.0/gems/bosh-director-1.3215.0/lib/bosh/director/problem_resolver.rb:65:in `block in apply_resolution'
@ryantang
Copy link
Author

I am using bosh-stemcell-3147-warden-boshlite-ubuntu-trusty-go_agent.

@bosh-cpi-robot
Copy link

Can you add garden logs from this failure if you still have it? I suspect
garden-linux release is doing something different now.

On Wed, Mar 23, 2016 at 10:12 AM, Ryan Tang [email protected]
wrote:

I am using bosh-stemcell-3147-warden-boshlite-ubuntu-trusty-go_agent.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#353 (comment)

@ryantang
Copy link
Author

I re-ran the bosh cck command. Here's the info form garden.stdout.log. GUIDs will likely be a bit different.

This the part of the logs from the first recreate failure. Please let me know if you need more of the log file.

# tail -f ./vcap/data/sys/log/garden/garden.std*.log ./vcap/data/sys/log/monit/garden.log

==> ./vcap/data/sys/log/garden/garden.stdout.log <==
{"timestamp":"1458754867.144699812","source":"garden-linux","message":"garden-linux.container.cleanup.done","log_level":1,"data":{"handle":"0ddb9a4b-c740-4acf-4b68-bf643174e5e4","session":"18.2"}}
{"timestamp":"1458754867.144761324","source":"garden-linux","message":"garden-linux.pool.release.releasing","log_level":1,"data":{"id":"fk5rhq2q9r5","session":"9.4"}}
{"timestamp":"1458754867.277828693","source":"garden-linux","message":"garden-linux.pool.release.subnet-pool.release.changing-allocated-ips","log_level":1,"data":{"allocated-ips":["10.244.4.5","10.244.4.6"],"id":"fk5rhq2q9r5","session":"9.4.2.1"}}
{"timestamp":"1458754867.277896404","source":"garden-linux","message":"garden-linux.pool.release.released","log_level":1,"data":{"id":"fk5rhq2q9r5","session":"9.4"}}
{"timestamp":"1458754867.277940035","source":"garden-linux","message":"garden-linux.garden-server.destroy.destroyed","log_level":1,"data":{"handle":"0ddb9a4b-c740-4acf-4b68-bf643174e5e4","session":"11.5"}}
{"timestamp":"1458754867.351434708","source":"garden-linux","message":"garden-linux.pool.acquire.creating","log_level":1,"data":{"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5"}}
{"timestamp":"1458754867.351520538","source":"garden-linux","message":"garden-linux.pool.acquire.subnet-pool.acquire.subnet-selecting","log_level":1,"data":{"allocated-subnets":["10.244.4.0/24"],"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5.1.1"}}
{"timestamp":"1458754867.351566553","source":"garden-linux","message":"garden-linux.pool.acquire.subnet-pool.acquire.subnet-selected","log_level":1,"data":{"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5.1.1","subnet":"10.244.4.0/24"}}
{"timestamp":"1458754867.351622343","source":"garden-linux","message":"garden-linux.pool.acquire.subnet-pool.acquire.ip-selecting","log_level":1,"data":{"allocated-ips":["10.244.4.5","10.244.4.6"],"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5.1.1"}}
{"timestamp":"1458754867.351675749","source":"garden-linux","message":"garden-linux.pool.acquire.subnet-pool.acquire.ip-selected","log_level":1,"data":{"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","ip":"10.244.4.4","session":"9.5.1.1"}}
{"timestamp":"1458754867.351738691","source":"garden-linux","message":"garden-linux.pool.acquire.subnet-pool.acquire.new-allocated","log_level":1,"data":{"allocated-ips":["10.244.4.5","10.244.4.6","10.244.4.4"],"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5.1.1"}}
{"timestamp":"1458754867.351773262","source":"garden-linux","message":"garden-linux.pool.acquire.acquired-pool-resources","log_level":1,"data":{"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5"}}
{"timestamp":"1458754867.352807522","source":"garden-linux","message":"garden-linux.pool.acquire.running-graph-cleanup","log_level":1,"data":{"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5"}}
{"timestamp":"1458754867.353132248","source":"garden-linux","message":"garden-linux.pool.acquire.gc.start","log_level":1,"data":{"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5.2"}}
{"timestamp":"1458754867.643604040","source":"garden-linux","message":"garden-linux.pool.acquire.created","log_level":1,"data":{"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"9.5"}}
{"timestamp":"1458754867.827059984","source":"garden-linux","message":"garden-linux.container.start.command.failed","log_level":2,"data":{"argv":["/var/vcap/data/garden/depot/fk9jojisg5i/start.sh"],"error":"exit status 2","exit-status":2,"handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"23.1.1","stderr":"Failed to create container: containerizer: run `parent-after-clone`: exit status 2. stderr: panic: network: failed to find bridge with name 'wb-fk5rhq2pe5m0', IP '10.244.4.1', subnet '10.244.4.0/24': look up existing bridge\n\ngoroutine 1 [running]:\ngithub.com/cloudfoundry-incubator/garden-linux/linux_backend.must(0x7fbdfea00368, 0xc2080325c0)\n\t/var/vcap/packages/garden-linux/src/github.com/cloudfoundry-incubator/garden-linux/linux_backend/hooks.go:95 +0x50\ngithub.com/cloudfoundry-incubator/garden-linux/linux_backend.func·002()\n\t/var/vcap/packages/garden-linux/src/github.com/cloudfoundry-incubator/garden-linux/linux_backend/hooks.go:28 +0xf5\ngithub.com/cloudfoundry-incubator/garden-linux/hook.HookSet.Main(0xc20803abd0, 0x7ffc5e42bd18, 0x12)\n\t/var/vcap/packages/garden-linux/src/github.com/cloudfoundry-incubator/garden-linux/hook/hooks.go:28 +0x81\ngithub.com/cloudfoundry-incubator/garden-linux/hook.Main(0xc20800a010, 0x1, 0x1)\n\t/var/vcap/packages/garden-linux/src/github.com/cloudfoundry-incubator/garden-linux/hook/hooks.go:19 +0x46\nmain.main()\n\t/var/vcap/packages/garden-linux/src/github.com/cloudfoundry-incubator/garden-linux/hook/hook/main.go:39 +0x4c2\n","stdout":"","took":"136.461385ms"}}
{"timestamp":"1458754867.827219963","source":"garden-linux","message":"garden-linux.container.start.wshd-start-failed","log_level":2,"data":{"error":"exit status 2","handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","session":"23.1"}}
{"timestamp":"1458754867.827310324","source":"garden-linux","message":"garden-linux.pool.release.releasing","log_level":1,"data":{"id":"fk9jojisg5i","session":"9.6"}}
{"timestamp":"1458754867.954038620","source":"garden-linux","message":"garden-linux.pool.release.subnet-pool.release.changing-allocated-ips","log_level":1,"data":{"allocated-ips":["10.244.4.5","10.244.4.6"],"id":"fk9jojisg5i","session":"9.6.2.1"}}
{"timestamp":"1458754867.954099178","source":"garden-linux","message":"garden-linux.pool.release.released","log_level":1,"data":{"id":"fk9jojisg5i","session":"9.6"}}
{"timestamp":"1458754867.954126835","source":"garden-linux","message":"garden-linux.garden-server.create.failed","log_level":2,"data":{"error":"container: start: exit status 2","request":{"Handle":"8c25a9d3-4a5c-480a-4539-24ce4d379235","GraceTime":0,"RootFSPath":"/var/vcap/store/warden_cpi/stemcells/9ff9ea15-73e5-467b-7fef-67d079850c1e","BindMounts":[{"src_path":"/var/vcap/store/warden_cpi/ephemeral_bind_mounts_dir/8c25a9d3-4a5c-480a-4539-24ce4d379235","dst_path":"/var/vcap/data","mode":1},{"src_path":"/var/vcap/store/warden_cpi/persistent_bind_mounts_dir/8c25a9d3-4a5c-480a-4539-24ce4d379235","dst_path":"/warden-cpi-dev","mode":1}],"Network":"10.244.4.4/24","Privileged":true,"Limits":{"bandwidth_limits":{},"cpu_limits":{},"disk_limits":{},"memory_limits":{}}},"session":"11.6"}}

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants