You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
2020-01-31 15:08:22.142366 going to run tests sequentially
2020-01-31 15:08:22.142422 using the SingleClusterIdAllocator
snapshot_test (concurrent_schema_changes_test.TestConcurrentSchemaChanges) ... 2020-01-31 15:08:22.143590 cluster ccm directory: /home/kbraun/.dtest/dtest-2k9Sq9
2020-01-31 15:08:22.143644 Starting Scylla cluster from directory /home/kbraun/dev/scylla-dtest/../scylla/build/dev
2020-01-31 15:08:22.189404 snapshot_test()
ERROR
2020-01-31 15:08:28.350926 Test failed with exception: <class 'cassandra.cluster.NoHostAvailable'>: ('Unable to connect to any servers', {'127.0.0.1:9042': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
2020-01-31 15:08:28.354925 stopping ccm cluster test at: /home/kbraun/.dtest/dtest-2k9Sq9
======================================================================
ERROR: snapshot_test (concurrent_schema_changes_test.TestConcurrentSchemaChanges)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/kbraun/dev/scylla-dtest/concurrent_schema_changes_test.py", line 499, in snapshot_test
session = self.cql_connection(node1)
File "/home/kbraun/dev/scylla-dtest/dtest.py", line 704, in cql_connection
protocol_version, port=port, ssl_opts=ssl_opts, **kwargs)
File "/home/kbraun/dev/scylla-dtest/dtest.py", line 742, in _create_session
session = cluster.connect()
File "cassandra/cluster.py", line 1335, in cassandra.cluster.Cluster.connect
with self._lock:
File "cassandra/cluster.py", line 1371, in cassandra.cluster.Cluster.connect
raise
File "cassandra/cluster.py", line 1358, in cassandra.cluster.Cluster.connect
self.control_connection.connect()
File "cassandra/cluster.py", line 2896, in cassandra.cluster.ControlConnection.connect
self._set_new_connection(self._reconnect_internal())
File "cassandra/cluster.py", line 2939, in cassandra.cluster.ControlConnection._reconnect_internal
raise NoHostAvailable("Unable to connect to any servers", errors)
NoHostAvailable: ('Unable to connect to any servers', {'127.0.0.1:9042': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
-------------------- >> begin captured logging << --------------------
dtest: DEBUG: - going to run tests sequentially
dtest: DEBUG: - using the SingleClusterIdAllocator
dtest: DEBUG: concurrent_schema_changes_test.TestConcurrentSchemaChanges.snapshot_testsnapshot_test - cluster ccm directory: /home/kbraun/.dtest/dtest-2k9Sq9
dtest: DEBUG: concurrent_schema_changes_test.TestConcurrentSchemaChanges.snapshot_testsnapshot_test - Starting Scylla cluster from directory /home/kbraun/dev/scylla-dtest/../scylla/build/dev
dtest: DEBUG: concurrent_schema_changes_test.TestConcurrentSchemaChanges.snapshot_testsnapshot_test - snapshot_test()
cassandra.cluster: WARNING: Cluster.__init__ called with contact_points specified, but load-balancing policies are not specified in some ExecutionProfiles. In the next major version, this will raise an error; please specify a load-balancing policy. (contact_points = ['127.0.0.1'], EPs without explicit LBPs = ('EXEC_P
ROFILE_DEFAULT',))
cassandra.cluster: WARNING: [control connection] Error connecting to 127.0.0.1:9042:
Traceback (most recent call last):
File "cassandra/cluster.py", line 2928, in cassandra.cluster.ControlConnection._reconnect_internal
return self._try_connect(host)
File "cassandra/cluster.py", line 2950, in cassandra.cluster.ControlConnection._try_connect
connection = self._cluster.connection_factory(host.endpoint, is_control_connection=True)
File "cassandra/cluster.py", line 1292, in cassandra.cluster.Cluster.connection_factory
return self.connection_class.factory(endpoint, self.connect_timeout, *args, **kwargs)
File "cassandra/connection.py", line 471, in cassandra.connection.Connection.factory
conn = cls(endpoint, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/cassandra/io/libevreactor.py", line 267, in __init__
self._connect_socket()
File "cassandra/connection.py", line 514, in cassandra.connection.Connection._connect_socket
raise socket.error(sockerr.errno, "Tried connecting to %s. Last error: %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
error: [Errno 111] Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused
cassandra.cluster: ERROR: Control connection failed to connect, shutting down Cluster:
Traceback (most recent call last):
File "cassandra/cluster.py", line 1358, in cassandra.cluster.Cluster.connect
self.control_connection.connect()
File "cassandra/cluster.py", line 2896, in cassandra.cluster.ControlConnection.connect
self._set_new_connection(self._reconnect_internal())
File "cassandra/cluster.py", line 2939, in cassandra.cluster.ControlConnection._reconnect_internal
raise NoHostAvailable("Unable to connect to any servers", errors)
NoHostAvailable: ('Unable to connect to any servers', {'127.0.0.1:9042': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
--------------------- >> end captured logging << ---------------------
----------------------------------------------------------------------
Ran 1 test in 7.293s
FAILED (errors=1)
By reading the code and debugging, I arrived at the conclusion that the reason is 83bdc2d, specifically these lines:
- if no_wait and not verbose:
- # waiting 2 seconds to check for early errors and for the
- # pid to be set
- time.sleep(2)
- else:
- for node, p, mark in started:
- start_message = "Starting listening for CQL clients"
- try:
- # updated code, scylla starts CQL only by default
- # process should not be checked for scylla as the
- # process is a boot script (that ends after boot)
- node.watch_log_for(start_message, timeout=600,
- verbose=verbose, from_mark=mark)
- except RuntimeError:
- raise Exception("Not able to find start "
- "message '%s' in Node '%s'" %
- (start_message, node.name))
the else branch was causing the wait in older ccm versions (which caused start() to take ~25 seconds longer to finish than it currently does). I guess this is now responsible for the wait:
if wait_for_binary_proto:
for node, _, mark in started:
node.watch_log_for("Starting listening for CQL clients",
verbose=verbose, from_mark=mark)
but
waiting for binary proto should probably be different/something more than just waiting for a certain message appearing in the node's logs (see wait_for_binary_interface in ccmlib/node.py)
lots of dtests do cluster.populate(n).start() (so wait_for_binary_proto=False) and probably assume that the cluster is ready after this:
and in effect some dtests are failing.
e.g. with the following setup: scylla-ccm d1e62ba (next and master at the moment of writing this issue) + scylla-dtest https://github.com/scylladb/scylla-dtest/commit/cd83fd8f8956befd1af256b91dee6456093413c0 (next and master at the moment of writing this issue)
running:
gives:
Here's how the test starts:
By reading the code and debugging, I arrived at the conclusion that the reason is 83bdc2d, specifically these lines:
the
else
branch was causing the wait in older ccm versions (which causedstart()
to take ~25 seconds longer to finish than it currently does). I guess this is now responsible for the wait:but
wait_for_binary_interface
in ccmlib/node.py)cluster.populate(n).start()
(sowait_for_binary_proto=False
) and probably assume that the cluster is ready after this:The text was updated successfully, but these errors were encountered: