Expected Behavior
There are situations in which ES refuses to drain a given node, usually because of allocation constraints such as the maximum number of shards per index and node. ES Operator then waits indefinitely for the draining to finish. At some point the scale-down event is superseded by a scale-up event.
This should result in the previously "to-be-drained" node being used again.
Actual Behavior
What happens instead is that the IP stays in the cluster.routing.allocation.exclude._ip setting, and the scale-up event only causes the statefulset to be updated, spawning new nodes. This leaves the node in a commissioned but unused state.
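To make the expected behavior concrete, here is a minimal sketch of the cleanup step that seems to be missing: when a scale-up supersedes a pending drain, the node's IP would be taken out of the exclude list again. This is an illustration in Python, not es-operator's actual code; the helper names, the sample IPs, and the choice of the transient settings scope are all assumptions.

```python
EXCLUDE_KEY = "cluster.routing.allocation.exclude._ip"


def remove_ip_from_exclude(exclude_value: str, ip: str) -> str:
    """Return the comma-separated exclude value with `ip` removed (hypothetical helper)."""
    # The setting holds a comma-separated list; tolerate spaces and empty entries.
    entries = [e.strip() for e in exclude_value.split(",") if e.strip()]
    return ",".join(e for e in entries if e != ip)


def build_settings_update(exclude_value: str, scope: str = "transient") -> dict:
    """Build a request body for PUT /_cluster/settings.

    `scope` is an assumption: use whichever scope the exclusion was written to.
    An empty list is sent as None (JSON null), which resets the setting.
    """
    return {scope: {EXCLUDE_KEY: exclude_value or None}}


# Sample values are made up for illustration.
remaining = remove_ip_from_exclude("10.2.0.5, 10.2.0.7", "10.2.0.7")
print(build_settings_update(remaining))
```

Sending null when the list becomes empty avoids leaving an empty-string exclusion behind in the cluster settings.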
Steps to Reproduce the Problem
Create a cluster with two nodes (minReplicas=1, maxReplicas=2, minIndexReplicas=0), then add one index with two shards, no replicas, and "routing.allocation.total_shards_per_node: 1".
Wait for es-operator to start draining the second node. The drain will fail because ES refuses to place more than one shard of that index on the same node.
Trigger a scale-out event by putting some CPU load onto ES.
Check :9200/_cluster/settings to confirm that the IP is still listed there.
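The first and last steps can be sketched roughly as follows. The index body mirrors the settings named in step one, and the helper reads a _cluster/settings response to list any IPs still excluded. This assumes the response was requested with flat_settings=true so that the key appears in dotted form; the function name and the sample response are hypothetical, and no request is actually sent.

```python
import json

EXCLUDE_KEY = "cluster.routing.allocation.exclude._ip"

# Index creation body matching step one: two shards, no replicas,
# and at most one shard of this index per node.
INDEX_BODY = {
    "settings": {
        "index": {
            "number_of_shards": 2,
            "number_of_replicas": 0,
            "routing.allocation.total_shards_per_node": 1,
        }
    }
}


def excluded_ips(settings: dict) -> dict:
    """Map each settings scope to the IPs listed in the exclude setting."""
    result = {}
    for scope in ("persistent", "transient"):
        value = settings.get(scope, {}).get(EXCLUDE_KEY) or ""
        result[scope] = [ip.strip() for ip in value.split(",") if ip.strip()]
    return result


# Made-up response body, shaped like GET /_cluster/settings?flat_settings=true
sample = json.loads('{"persistent": {}, "transient": {"%s": "10.2.0.7"}}' % EXCLUDE_KEY)
print(excluded_ips(sample))
```

If the drained node's IP still shows up in either scope after the scale-up, the problem described above has been reproduced.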
Specifications
Version: latest
Platform: any
Subsystem: any