Steps to Reproduce:
rke up --ssh-agent-auth --config ./cluster.yml
Results:
INFO[0000] Building Kubernetes cluster
INFO[0000] [dialer] Setup tunnel for host [192.168.20.20]
WARN[0000] Unsupported Docker version found [18.03.1-ce], supported versions are [1.11.x 1.12.x 1.13.x 17.03.x]
INFO[0000] [dialer] Setup tunnel for host [192.168.20.21]
WARN[0001] Unsupported Docker version found [18.03.1-ce], supported versions are [1.11.x 1.12.x 1.13.x 17.03.x]
INFO[0001] [dialer] Setup tunnel for host [192.168.20.22]
WARN[0002] Unsupported Docker version found [18.03.1-ce], supported versions are [1.11.x 1.12.x 1.13.x 17.03.x]
INFO[0002] [state] Found local kube config file, trying to get state from cluster
INFO[0002] [state] Fetching cluster state from Kubernetes
INFO[0002] [state] Successfully Fetched cluster state to Kubernetes ConfigMap: cluster-state
INFO[0002] [certificates] Getting Cluster certificates from Kubernetes
INFO[0002] [certificates] Successfully fetched Cluster certificates from Kubernetes
INFO[0002] [network] No hosts added existing cluster, skipping port check
INFO[0002] [reconcile] Reconciling cluster state
INFO[0002] [reconcile] Check etcd hosts to be deleted
INFO[0002] [reconcile] Check etcd hosts to be added
INFO[0002] [reconcile] Rebuilding and updating local kube config
INFO[0002] Successfully Deployed local admin kubeconfig at [./kube_config_rancher-cluster.yml]
INFO[0002] [reconcile] host [192.168.20.20] is active master on the cluster
INFO[0002] [reconcile] Reconciled cluster state successfully
INFO[0002] [certificates] Deploying kubernetes certificates to Cluster nodes
INFO[0016] Successfully Deployed local admin kubeconfig at [./kube_config_rancher-cluster.yml]
INFO[0016] [certificates] Successfully deployed kubernetes certificates to Cluster nodes
INFO[0016] Pre-pulling kubernetes images
INFO[0016] Kubernetes images pulled successfully
INFO[0016] [etcd] Building up etcd plane..
INFO[0016] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [192.168.20.20]
INFO[0027] [certificates] Successfully started [rke-bundle-cert] container on host [192.168.20.20]
INFO[0028] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [192.168.20.20]
INFO[0036] [etcd] Successfully started [rke-log-linker] container on host [192.168.20.20]
INFO[0038] [remove/rke-log-linker] Successfully removed container on host [192.168.20.20]
INFO[0038] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [192.168.20.21]
INFO[0050] [certificates] Successfully started [rke-bundle-cert] container on host [192.168.20.21]
INFO[0050] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [192.168.20.21]
INFO[0058] [etcd] Successfully started [rke-log-linker] container on host [192.168.20.21]
INFO[0059] [remove/rke-log-linker] Successfully removed container on host [192.168.20.21]
INFO[0059] [etcd] Saving snapshot [etcd-rolling-snapshots] on host [192.168.20.22]
INFO[0071] [certificates] Successfully started [rke-bundle-cert] container on host [192.168.20.22]
INFO[0072] [certificates] successfully saved certificate bundle [/opt/rke/etcd-snapshots//pki.bundle.tar.gz] on host [192.168.20.22]
INFO[0080] [etcd] Successfully started [rke-log-linker] container on host [192.168.20.22]
INFO[0081] [remove/rke-log-linker] Successfully removed container on host [192.168.20.22]
INFO[0081] [etcd] Successfully started etcd plane..
INFO[0081] [controlplane] Building up Controller Plane..
INFO[0083] [remove/service-sidekick] Successfully removed container on host [192.168.20.21]
INFO[0083] [remove/service-sidekick] Successfully removed container on host [192.168.20.20]
INFO[0083] [remove/service-sidekick] Successfully removed container on host [192.168.20.22]
INFO[0088] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [192.168.20.21]
INFO[0089] [healthcheck] service [kube-apiserver] on host [192.168.20.21] is healthy
INFO[0089] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [192.168.20.20]
INFO[0089] [healthcheck] Start Healthcheck on service [kube-apiserver] on host [192.168.20.22]
INFO[0089] [healthcheck] service [kube-apiserver] on host [192.168.20.20] is healthy
INFO[0089] [healthcheck] service [kube-apiserver] on host [192.168.20.22] is healthy
INFO[0095] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.21]
INFO[0096] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.22]
INFO[0096] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.20]
INFO[0097] [remove/rke-log-linker] Successfully removed container on host [192.168.20.21]
INFO[0097] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [192.168.20.21]
INFO[0097] [healthcheck] service [kube-controller-manager] on host [192.168.20.21] is healthy
INFO[0097] [remove/rke-log-linker] Successfully removed container on host [192.168.20.22]
INFO[0097] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [192.168.20.22]
INFO[0097] [healthcheck] service [kube-controller-manager] on host [192.168.20.22] is healthy
INFO[0097] [remove/rke-log-linker] Successfully removed container on host [192.168.20.20]
INFO[0098] [healthcheck] Start Healthcheck on service [kube-controller-manager] on host [192.168.20.20]
INFO[0098] [healthcheck] service [kube-controller-manager] on host [192.168.20.20] is healthy
INFO[0103] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.21]
INFO[0104] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.22]
INFO[0104] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.20]
INFO[0105] [remove/rke-log-linker] Successfully removed container on host [192.168.20.21]
INFO[0105] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [192.168.20.21]
INFO[0105] [healthcheck] service [kube-scheduler] on host [192.168.20.21] is healthy
INFO[0106] [remove/rke-log-linker] Successfully removed container on host [192.168.20.22]
INFO[0106] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [192.168.20.22]
INFO[0106] [healthcheck] service [kube-scheduler] on host [192.168.20.22] is healthy
INFO[0106] [remove/rke-log-linker] Successfully removed container on host [192.168.20.20]
INFO[0106] [healthcheck] Start Healthcheck on service [kube-scheduler] on host [192.168.20.20]
INFO[0106] [healthcheck] service [kube-scheduler] on host [192.168.20.20] is healthy
INFO[0111] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.21]
INFO[0113] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.22]
INFO[0113] [remove/rke-log-linker] Successfully removed container on host [192.168.20.21]
INFO[0113] [controlplane] Successfully started [rke-log-linker] container on host [192.168.20.20]
INFO[0114] [remove/rke-log-linker] Successfully removed container on host [192.168.20.22]
INFO[0115] [remove/rke-log-linker] Successfully removed container on host [192.168.20.20]
INFO[0115] [controlplane] Successfully started Controller Plane..
INFO[0115] [authz] Creating rke-job-deployer ServiceAccount
INFO[0115] [authz] rke-job-deployer ServiceAccount created successfully
INFO[0115] [authz] Creating system:node ClusterRoleBinding
INFO[0115] [authz] system:node ClusterRoleBinding created successfully
INFO[0115] [certificates] Save kubernetes certificates as secrets
INFO[0118] [certificates] Successfully saved certificates as kubernetes secret [k8s-certs]
INFO[0118] [state] Saving cluster state to Kubernetes
INFO[0118] [state] Successfully Saved cluster state to Kubernetes ConfigMap: cluster-state
INFO[0118] [state] Saving cluster state to cluster nodes
INFO[0125] [state] Successfully started [cluster-state-deployer] container on host [192.168.20.20]
INFO[0127] [remove/cluster-state-deployer] Successfully removed container on host [192.168.20.20]
INFO[0133] [state] Successfully started [cluster-state-deployer] container on host [192.168.20.21]
INFO[0135] [remove/cluster-state-deployer] Successfully removed container on host [192.168.20.21]
INFO[0141] [state] Successfully started [cluster-state-deployer] container on host [192.168.20.22]
INFO[0143] [remove/cluster-state-deployer] Successfully removed container on host [192.168.20.22]
INFO[0143] [worker] Building up Worker Plane..
INFO[0144] [remove/service-sidekick] Successfully removed container on host [192.168.20.21]
INFO[0144] [remove/service-sidekick] Successfully removed container on host [192.168.20.20]
INFO[0144] [remove/service-sidekick] Successfully removed container on host [192.168.20.22]
FATA[0157] [workerPlane] Failed to bring up Worker Plane: Failed to start [kubelet] container on host [192.168.20.21]: Failed to start [kubelet] container on host [192.168.20.21]: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"rootfs_linux.go:109: jailing process inside rootfs caused \\\"pivot_root invalid argument\\\"\"": unknown
The specific issue is the last FATA error above: the [kubelet] container on 192.168.20.21 fails to start because the OCI runtime's pivot_root call returns "invalid argument".
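For context (my note, not the author's): per pivot_root(2), the call fails with EINVAL when the new root's parent mount has shared propagation, which is exactly the situation a /mnt:/mnt:rshared bind creates when the Docker graph directory itself lives under /mnt. Assuming findmnt is available on the node, the overlap can be checked like this:
$ findmnt -o TARGET,SOURCE,FSTYPE,PROPAGATION /mnt
$ findmnt -o TARGET,SOURCE,FSTYPE,PROPAGATION /mnt/docker-zpool/docker-containers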
I can get the cluster to come up without error if I delete kubelet on each node:
docker rm kubelet
and then remove the bind mount
kubelet:
  extra_binds:
    - /mnt:/mnt:rshared
from cluster.yml; after re-running the rke up command, everything works as expected.
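A sketch of that recovery sequence (my addition; it assumes SSH agent auth to each node, with the hosts taken from the log above):
# remove the failed kubelet container on every node
for host in 192.168.20.20 192.168.20.21 192.168.20.22; do
  ssh "$host" docker rm kubelet
done
# then, with the /mnt:/mnt:rshared extra_bind deleted from cluster.yml:
rke up --ssh-agent-auth --config ./cluster.yml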
I need to be able to use local volumes, and as such the kubelet needs to bind-mount the host location as per #500. When I do what #500 suggests I get the error found in #316, but I don't have an unattended AMI update (at least I have not done anything to create one).
Further digging with docker inspect kubelet shows the following Binds on a working configuration:
I suspect what is happening is that the /mnt:/mnt:rshared bind is not working because /mnt/docker-zpool/docker-containers has already been processed.
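(The Binds listing itself was not captured above. For anyone reproducing this, a way to pull just that list on a node:)
$ docker inspect --format '{{json .HostConfig.Binds}}' kubelet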
I can confirm that the problem is trying to use a ZFS-backed Docker that mounts the Docker directory under /mnt, following the instructions for setting this up at https://rancher.com/docs/os/v1.2/en/storage/using-zfs/
Specifically:
$ sudo ros config set rancher.docker.storage_driver 'zfs'
$ sudo ros config set rancher.docker.graph /mnt/zpool1/docker
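Two sanity checks of what the daemon actually picked up after those settings (my addition; the expected values reflect the author's setup, not captured output):
$ sudo ros config get rancher.docker.graph
$ docker info | grep -E 'Storage Driver|Docker Root Dir'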
Rancher must ensure that kubelet gets /mnt/zpool1/docker (in my case /mnt/docker-zpool/docker-containers). This conflicts with adding
kubelet:
  extra_binds:
    - /mnt:/mnt:rshared
The system works as expected if I manually specify the subdirectories in /mnt that I want to use (for example, see the sketch below).
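A sketch of that shape in cluster.yml (the subdirectory name is hypothetical; the point is that the bind no longer covers the Docker graph directory under /mnt):
kubelet:
  extra_binds:
    # hypothetical subdirectory; binding it avoids covering /mnt/docker-zpool
    - /mnt/local-volumes:/mnt/local-volumes:rshared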
RKE version:
rke version v0.1.15
Docker version: (docker version, docker info preferred)
18.03.1-ce (per the warnings in the log above)
Operating system and kernel: (cat /etc/os-release, uname -r preferred)
4.14.73-rancher
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)
Bare-metal
cluster.yml file: