fix(node decommission): stop a decommissioned node by default #600

Draft
wants to merge 1 commit into
base: next

@yarongilor yarongilor commented Aug 27, 2024

Currently, a decommissioned node is left running (it still reports is_running()) after its
status changes to "decommissioned". As a result, it unexpectedly remains
part of the cluster nodes, for example in cluster.nodelist().
This should not be the default. Instead, the node should be stopped with stop()
once decommission completes successfully (as is done in SCT).
This fix requires a corresponding dtest PR that adjusts all calls to
node.decommission(): either remove the 14 now-unneeded calls to node.stop(),
or pass the new flag stop_node=False, which is needed for raft testing.
This fix is a follow-up to:
https://github.com/scylladb/scylla-dtest/pull/4767#discussion_r1731340260

refs: https://github.com/scylladb/scylla-dtest/pull/4767#discussion_r1731340260
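
To illustrate the intended behavior, here is a minimal sketch, not the actual ccm code. `SketchNode` is a hypothetical stand-in for the real node class, its `nodetool()` is a placeholder that runs nothing, and the status string is illustrative. Only the `stop_node` flag name, `decommission()`, `stop()` and `is_running()` come from this PR description.

```python
class SketchNode:
    """Hypothetical stand-in for a ccm node; not the real ccm class."""

    def __init__(self, name):
        self.name = name
        self.status = "up"
        self._running = True

    def is_running(self):
        return self._running

    def stop(self):
        self._running = False

    def nodetool(self, cmd):
        # Placeholder: the real method would invoke the nodetool CLI.
        return "", ""

    def decommission(self, stop_node=True):
        """Decommission the node and, by default, stop it afterwards."""
        self.nodetool("decommission")
        self.status = "decommissioned"
        if stop_node:
            # Proposed default: do not leave a decommissioned node running.
            self.stop()


# Default: the node is stopped once decommission completes.
node = SketchNode("node1")
node.decommission()

# Opt-out for tests (e.g. raft testing) that need the process kept alive.
raft_node = SketchNode("node2")
raft_node.decommission(stop_node=False)
```

With this default, a dtest that calls `node.decommission()` and then `node.stop()` could drop the explicit `stop()` call, while tests that need the process alive would pass `stop_node=False`.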

There are also 11 occurrences of nodetool("decommission") to be adjusted/optimized:

$ grep -Eri '\.nodetool\(.*decommission' --include \*.py .
./manager_backup_tests.py:        node3.nodetool("decommission")
./compaction_additional_test.py:                node2.nodetool("decommission")
./manager_restore_tests.py:        node3.nodetool("decommission")
./manager_restore_tests.py:        node3.nodetool("decommission")
./manager_restore_tests.py:            node.nodetool("decommission")
./update_cluster_layout_tests.py:        node1.nodetool("decommission")
./update_cluster_layout_tests.py:        node3.nodetool("decommission", capture_output=False, wait=False)
./update_cluster_layout_tests.py:        node3.nodetool("decommission", capture_output=False, wait=False)
./update_cluster_layout_tests.py:        node3.nodetool("decommission", capture_output=False, wait=False)
./nodetool_additional_test.py:        node2.nodetool("decommission")
./topology_test.py:            out, err = node.nodetool("decommission")
