Pods stuck in unknown state after reboot #1520
Experiencing similar issues with the dashboard here.
@zar3bski could you attach the inspection report?
@davigar15 the attached inspection report seems corrupted.
Here it is. Since then, I also tried to remove the pods manually, but they remained stuck.
inspection-report-20200917_165856.tar.gz
Tried both fixes, but it did not change much.
Facing the same issue after a restart of the VM: the pods are in an Unknown state. Tried restarting the service, and stopping and starting the microk8s service as well.
Thank you for your patience, and apologies for the inconvenience this issue may have caused. When the node starts it needs to invalidate the old IPs and update the pods with new IPs. In the kubelet logs you can see this call failing:
On the API server side we see the failed call to the "admission.juju.is" webhook. This webhook is supposed to intercept the REST API call and authorize it. However, the webhook's pod is hosted in the cluster itself, so its recorded IP is no longer correct and the webhook cannot be reached.
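One way to confirm this, not shown in the thread, is to list the registered admission webhook configurations and look for the admission.juju.is entry:

microk8s kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations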
We have a bug opened for this at https://bugs.launchpad.net/juju/+bug/1898718. As a temporary workaround you could use Juju 2.7 until this gets addressed.
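Assuming Juju was installed from the snap store, pinning it to the 2.7 track would look something like this (the exact channel name is an assumption):

sudo snap refresh juju --channel=2.7/stable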
@ktsakalozos Currently working on a fix for Juju to resolve this. Have updated the LP bug.
Fix committed in Juju. It will be available in 2.8.6.
@ktsakalozos still facing the same issue; what might be the cause?
The issue still exists in 1.22.2. Any ideas on how to fix this besides resetting/reinstalling?
@arnitkun could you please attach an inspection report?
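For reference, the report being requested here is the tarball produced by MicroK8s' bundled inspection script:

microk8s.inspect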
@ktsakalozos I shall do it the next time it happens; apparently after another restart everything was good again.
Hi all, I also got the same issue, but while checking the inspection report I found
Not sure whether related: #3293
I don't have juju installed on my machine. After a reboot, all the pods' status is Unknown.
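A quick way to see those states after a reboot (standard kubectl usage, not taken from this comment):

microk8s kubectl get pods --all-namespaces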
My pods also went to an unknown state after rebooting. This happened on both of my servers, which run microk8s on CentOS 7. I also posted the issue here: #3545
Is it possible that the reboot caused the system to start from another kernel? |
Thanks for your reply. I ran the following (output elided). The environment is the same, so if a container gets created with docker on the same machine, with a 3.xx kernel and the docker-ce and containerd versions below, then we can eliminate the proposed solution that says "the problem will get solved with an upgrade of the kernel version or a downgrade of the docker & containerd versions."

# docker run -itd busybox:latest
# uname -sr
# rpm -qa kernel
# rpm -qa | grep -i kernel

Server: Docker Engine - Community
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Hi, we are using microk8s 1.19 with a single-node cluster. When the time is automatically adjusted backward by an hour or more by the network time protocol and we then reboot this single-node cluster PC, all cluster pods and application pods go into an Unknown state; the cluster pods hang and are corrupted/unusable. The same happens if we move the time back one hour manually. Once the time moves forward again, either by waiting or by moving it manually, everything works. The production servers are unusable due to this Unknown-state issue; we hit it whenever the network time protocol is installed at a customer location. Could you please investigate and resolve this? Attached are the microk8s inspect logs; please let me know if you need any further info. NOTE: FYI, the certificates are valid and not expired.

ERROR:
inspection-report/snap.microk8s.daemon-kubelet/systemctl.log:Sep 18 09:21:39 host-pc microk8s.daemon-kubelet[8291]: E0918 09:21:39.493660 8291 pod_workers.go:191] Error syncing pod 1785b49a-4dc2-4b99-b75d-6a181d22322e ("service-0_default(1785b49a-4dc2-4b99-b75d-6a181d22322e)"), skipping: failed to "CreatePodSandbox" for "service-0_default(1785b49a-4dc2-4b99-b75d-6a181d22322e)" with CreatePodSandboxError: "CreatePodSandbox for pod "service-0_default(1785b49a-4dc2-4b99-b75d-6a181d22322e)" failed: rpc error: code = Unknown desc = failed to reserve sandbox name "service-0_default_1785b49a-4dc2-4b99-b75d-6a181d22322e_5": name "service-0_default_1785b49a-4dc2-4b99-b75d-6a181d22322e_5" is reserved for "95beab4145623a3cc0d86c814da1e5cca4593997990245f8c626f0fc87c6c788""
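The "name ... is reserved for ..." error suggests containerd kept a stale sandbox reservation across the backward time jump. A hedged workaround, not confirmed in this thread, is to restart the MicroK8s containerd service so kubelet and containerd re-sync their sandbox state:

sudo snap restart microk8s.daemon-containerd   # assumed snap service name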
I'm using latest/edge (with calico cni) and after rebooting the machine I'm getting all pods in Unknown state.
Logs of the calico node:
An interesting thing is that calico is detecting the network used for LXD.
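The calico-node log output itself is not reproduced above; a typical way to pull it, assuming the standard k8s-app=calico-node label used by the bundled manifest, is:

microk8s kubectl logs -n kube-system -l k8s-app=calico-node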
Following @ktsakalozos's suggestions, I added this to /var/snap/microk8s/current/args/cni-network/cni.yaml and applied that spec.
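The exact snippet added to cni.yaml is not quoted above. A plausible guess, given that calico was picking up the LXD bridge, is pinning Calico's IP autodetection method in the calico-node container's environment; the interface name below is purely illustrative:

            - name: IP_AUTODETECTION_METHOD   # assumed addition; use the real host interface
              value: "interface=eth0"

Calico documents IP_AUTODETECTION_METHOD as the knob for choosing which host interface the node IP is taken from.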
The calico node did not restart, so I killed it to force a restart. But it did not come back up even with
microk8s.stop && microk8s.start
This is the tarball generated by microk8s.inspect
inspection-report.zip