Pods elasticsearch failed with "Back-off restarting failed container" #205
Comments
@maryxu Could you run a kubectl logs po/efk3-elasticsearch-2 --namespace=efk --insecure-skip-tls-verify=true and check the pod's logs to see if there are issues with the process within the container itself? It would also be helpful to always paste the output in code blocks, for easier reading.
Thanks for your response! Do you mean this? I pasted these into code blocks already.
[root@####### ~]# kubectl logs po/efk3-elasticsearch-2 --namespace=efk --insecure-skip-tls-verify=true
[2018-07-10T07:04:24,598][INFO ][o.e.n.Node ] [] initializing ...
[2018-07-10T07:04:24,687][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/efk3-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:125) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.2.4.jar:6.2.4]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:92) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:85) ~[elasticsearch-6.2.4.jar:6.2.4]
Caused by: java.lang.IllegalStateException: failed to obtain node locks, tried [[/usr/share/elasticsearch/data/efk3-cluster]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:244) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.node.Node.<init>(Node.java:264) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.node.Node.<init>(Node.java:246) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:323) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-6.2.4.jar:6.2.4]
... 6 more
Caused by: java.io.IOException: failed to obtain lock on /usr/share/elasticsearch/data/nodes/0
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:223) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.node.Node.<init>(Node.java:264) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.node.Node.<init>(Node.java:246) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:323) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-6.2.4.jar:6.2.4]
... 6 more
Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes/0/node.lock
at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) ~[?:?]
at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107) ~[?:?]
at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177) ~[?:?]
at java.nio.channels.FileChannel.open(FileChannel.java:287) ~[?:1.8.0_161]
at java.nio.channels.FileChannel.open(FileChannel.java:335) ~[?:1.8.0_161]
at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:125) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) ~[lucene-core-7.2.1.jar:7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:48:43]
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:209) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.node.Node.<init>(Node.java:264) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.node.Node.<init>(Node.java:246) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:323) ~[elasticsearch-6.2.4.jar:6.2.4]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) ~[elasticsearch-6.2.4.jar:6.2.4]
... 6 more
It looks like a permission issue:
Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes/0/node.lock
Elasticsearch seems to not be able to write into /usr/share/elasticsearch/data/.
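For reference, a common workaround for this class of NFS permission failure (not taken from this thread) is an init container that chowns the data directory to the uid/gid the official images run under, 1000:1000 (the elasticsearch user). A minimal sketch, assuming the chart's StatefulSet mounts the PVC under the volume name data, as in the pod description further down:

initContainers:
  - name: fix-data-permissions   # hypothetical name
    image: busybox
    # chown needs root; the NFS export must allow root writes
    # (i.e. not squash root) for this to succeed
    securityContext:
      runAsUser: 0
    command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
    volumeMounts:
      - name: data
        mountPath: /usr/share/elasticsearch/data

Note that fsGroup in the pod securityContext does not help here: Kubernetes does not change ownership on NFS volumes.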
I don't quite understand. I'm using NFS storage for the PV and PVC. For the other applications, the containers can write to the NFS data without any problem.
[root@k8s-test11 data]# kubectl get pv,pvc --namespace=efk --insecure-skip-tls-verify=true
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv/data-efk1-elasticsearch-0 5Gi RWO Retain Bound efk/data-efk3-elasticsearch-0 14d
pv/data-efk1-elasticsearch-1 5Gi RWO Retain Bound efk/data-efk3-elasticsearch-1 14d
pv/data-efk1-elasticsearch-2 5Gi RWO Retain Bound efk/data-efk3-elasticsearch-2 14d
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc/data-efk3-elasticsearch-0 Bound data-efk1-elasticsearch-0 5Gi RWO 14d
pvc/data-efk3-elasticsearch-1 Bound data-efk1-elasticsearch-1 5Gi RWO 14d
pvc/data-efk3-elasticsearch-2 Bound data-efk1-elasticsearch-2 5Gi RWO 14d
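One way to see why other applications can write while Elasticsearch cannot is to inspect the ownership of the export from a throwaway pod that mounts the same claim (the crash-looping Elasticsearch pod exits too quickly to exec into; scale the StatefulSet down first if the RWO claim is still attached). A hypothetical example, with a made-up pod name and the claim name taken from the PVC list above:

apiVersion: v1
kind: Pod
metadata:
  name: pvc-inspect            # hypothetical helper pod
  namespace: efk
spec:
  containers:
    - name: shell
      image: busybox           # runs as root by default
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-efk3-elasticsearch-2

Then kubectl exec -n efk pvc-inspect -- ls -ln /data shows the numeric owner of the data directory. If it is owned by root (uid 0), a process running as uid 1000 gets exactly the AccessDeniedException seen in the logs, while "other applications" that run as root can write without issue.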
What are the other applications? Do you have multiple apps running in the same container? Please post your Kubernetes StatefulSet, including the NFS volumes that are attached and mounted to the pods. The Elasticsearch container will most likely not be started as root, which would explain why it cannot write to the NFS volume.
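For context, the official docker.elastic.co images run the Elasticsearch process as the non-root elasticsearch user (uid/gid 1000), and charts targeting them typically carry a stanza along these lines in the pod spec (a sketch of the general pattern, not this chart's actual values):

securityContext:
  runAsUser: 1000   # the elasticsearch user in the official images
  fsGroup: 1000     # applied to most volume types, but ignored for NFS

Since fsGroup is ignored for NFS, the directory on the export itself has to be made writable for uid 1000, for example via the init container sketched earlier or a chown on the NFS server.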
Can you help us understand why the Elasticsearch pods fail with "Back-off restarting failed container"? Thanks a lot!
[root@####### ~]# kubectl describe pod efk3-elasticsearch-2 --namespace=efk --insecure-skip-tls-verify=true
Name: efk3-elasticsearch-2
Namespace: efk
Node: lvdevk8sw23/10.219.161.3
Start Time: Tue, 03 Jul 2018 14:58:28 +0800
Labels: app=elasticsearch
component=master
controller-revision-hash=efk3-elasticsearch-569cf776f
release=efk3
statefulset.kubernetes.io/pod-name=efk3-elasticsearch-2
Annotations:
Status: Running
IP: 10.42.6.19
Controlled By: StatefulSet/efk3-elasticsearch
Init Containers:
sysctl:
Container ID: docker://bed338fd0e395678abbbd1c49be7d14e1636faacdadba334c0e5607c4eb07251
Image: busybox
Image ID: docker-pullable://busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47
Port:
Command:
sysctl
-w
vm.max_map_count=262144
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 03 Jul 2018 14:58:32 +0800
Finished: Tue, 03 Jul 2018 14:58:32 +0800
Ready: True
Restart Count: 0
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from efk3-elasticsearch-token-5vqzr (ro)
Containers:
elasticsearch:
Container ID: docker://336bfce82124136d78de952552de2f2688c5f4249c9c440b1259fd7b3e230046
Image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.2.4
Image ID: docker-pullable://docker.elastic.co/elasticsearch/elasticsearch-oss@sha256:2d9c774c536bd1f64abc4993ebc96a2344404d780cbeb81a8b3b4c3807550e57
Ports: 9300/TCP, 9200/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 10 Jul 2018 13:20:21 +0800
Finished: Tue, 10 Jul 2018 13:20:26 +0800
Ready: False
Restart Count: 1929
Limits:
cpu: 1
Requests:
cpu: 25m
memory: 512Mi
Readiness: http-get http://:9200/_cluster/health%3Flocal=true delay=5s timeout=1s period=10s #success=1 #failure=3
Environment:
cluster.name: efk3-cluster
discovery.zen.ping.unicast.hosts: efk3-elasticsearch
discovery.zen.minimum_master_nodes: 2
KUBERNETES_NAMESPACE: efk (v1:metadata.namespace)
discovery.zen.ping.unicast.hosts: efk3-elasticsearch
PROCESSORS: 1 (limits.cpu)
ES_JAVA_OPTS: -Djava.net.preferIPv4Stack=true -Xms512m -Xmx512m
Mounts:
/usr/share/elasticsearch/data from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from efk3-elasticsearch-token-5vqzr (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-efk3-elasticsearch-2
ReadOnly: false
efk3-elasticsearch-token-5vqzr:
Type: Secret (a volume populated by a Secret)
SecretName: efk3-elasticsearch-token-5vqzr
Optional: false
QoS Class: Burstable
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
Normal Pulled 48m (x1921 over 6d) kubelet, lvdevk8sw23 Container image "docker.elastic.co/elasticsearch/elasticsearch-oss:6.2.4" already present on machine
Warning BackOff 3m (x44220 over 6d) kubelet, lvdevk8sw23 Back-off restarting failed container