This works great as long as all nodes are using ephemeral storage for --data-dir. However, in the case where this storage is persistent but the IP may not be, the node with persistent storage will be evicted from the cluster when its IP changes. This is because etcd2 will not use the discovery service if it finds a valid WAL file.
To show this I set up 3 VMs. Two of these VMs were PXE booted with a RAM disk and the third booted from local disk. This third VM would persist /var/lib/etcd2. Once the cluster was set up, I started rebooting the PXE nodes and confirmed that elastic-etcd was properly managing the nodes within the discovery service. The non-PXE node was rebooted and would rejoin the cluster with no issue.
Next I reset the MAC address on the NIC in the non-PXE VM to force a new IP. Upon boot it would join the cluster with the new IP address, but elastic-etcd did not update the discovery service. After this I set up a 4th PXE VM and booted it into the cluster. elastic-etcd detected only 2 of the 3 nodes from the discovery service and set up etcd2 to join the cluster as if there was no third node.
Upon joining the cluster the VM with the persistent storage was voted off of the island.
I feel that elastic-etcd should query the live cluster members it finds through the discovery service, look for nodes that are up but may not be listed in the discovery service, and update the discovery service accordingly. I'm currently investigating how to query the up nodes listed in the discovery service to determine how/if this is possible.
The current member list can be enumerated from the members API documented here.
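For illustration, here is a rough sketch of reading that list from one live node. It assumes the etcd v2 members endpoint (`GET /v2/members` on a node's client URL) and only looks at the `peerURLs` field. The function names and the sample addresses are made up for this example and are not part of elastic-etcd.

```python
import json
from urllib.request import urlopen


def parse_peer_urls(body):
    """Return the set of peer URLs found in a /v2/members JSON response body."""
    members = json.loads(body).get("members", [])
    return {url for member in members for url in member.get("peerURLs", [])}


def fetch_peer_urls(client_url, timeout=5):
    """Ask one live etcd2 node (by client URL) for the current member list."""
    endpoint = client_url.rstrip("/") + "/v2/members"
    with urlopen(endpoint, timeout=timeout) as resp:
        return parse_peer_urls(resp.read().decode("utf-8"))


# Hypothetical response body, shaped like the v2 members API output.
SAMPLE_BODY = """
{"members": [
  {"id": "a1", "name": "pxe1",
   "peerURLs": ["http://10.0.0.11:2380"],
   "clientURLs": ["http://10.0.0.11:2379"]},
  {"id": "b2", "name": "disk1",
   "peerURLs": ["http://10.0.0.42:2380"],
   "clientURLs": ["http://10.0.0.42:2379"]}
]}
"""

if __name__ == "__main__":
    print(sorted(parse_peer_urls(SAMPLE_BODY)))
```

Against a real cluster, something like `fetch_peer_urls("http://10.0.0.11:2379")` would be called once per node that the discovery service reports as up.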
I'm thinking that each node listed in the discovery API should be queried to ensure that:
1. All nodes that are 'up' agree on the members list; if not, throw an error, otherwise continue normally.
2. The member list from the discovery API matches the authoritative list from above; if not, update the discovery API with the authoritative member list from the cluster members and continue execution normally.
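The two checks above could be sketched roughly as follows. This is only the comparison logic under simplifying assumptions: member lists are treated as plain sets of peer URLs, and the actual calls to the cluster and to the discovery service are left out. All names and addresses here are hypothetical and are not taken from elastic-etcd.

```python
class MemberListMismatch(Exception):
    """Raised when the 'up' nodes report different member lists."""


def authoritative_members(views):
    """Check 1: every up node must report the same member list.

    `views` maps a node name to the set of peer URLs that node reported.
    Returns the agreed-upon set, or raises if the nodes disagree.
    """
    if not views:
        raise ValueError("no live nodes answered")
    distinct = {frozenset(members) for members in views.values()}
    if len(distinct) != 1:
        raise MemberListMismatch("live nodes disagree on the member list")
    return set(next(iter(distinct)))


def discovery_changes(discovery, authoritative):
    """Check 2: diff the discovery list against the authoritative list.

    Returns (to_add, to_remove); both are empty when the lists already match.
    """
    return authoritative - discovery, discovery - authoritative


if __name__ == "__main__":
    # Scenario from this issue: the persistent node came back on a new IP,
    # so the live cluster knows about it but the discovery service does not.
    cluster = {
        "http://10.0.0.11:2380",
        "http://10.0.0.12:2380",
        "http://10.0.0.42:2380",
    }
    views = {"pxe1": set(cluster), "pxe2": set(cluster), "disk1": set(cluster)}
    discovery = {"http://10.0.0.11:2380", "http://10.0.0.12:2380"}

    to_add, to_remove = discovery_changes(discovery, authoritative_members(views))
    print("add to discovery:", sorted(to_add))
    print("remove from discovery:", sorted(to_remove))
```

Raising on disagreement rather than picking a majority keeps the sketch in line with check 1 as written; whether a quorum-based rule would be better is a separate question.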
For completeness here is my cloud-init: