Manager package download stuck endless #9101

Closed
2 tasks
juliayakovlev opened this issue Oct 31, 2024 · 2 comments
@juliayakovlev
Contributor

Packages

Scylla version: 6.3.0~dev-20241025.e7d6ab576bc7 with build-id d6538cb390183d94e738de97f6d148fa1f6a0b98

Kernel Version: 6.8.0-1017-aws

Issue description

  • This issue is a regression.
  • It is unknown if this issue is a regression.

A problem occurred during manager installation.
Four nodes were installed and configured OK, but the fifth failed during the manager package download, at the step that fetches the repository GPG key:

< t:2024-10-30 03:43:21,615 f:remote_base.py  l:521  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.6.79>: Running command "sudo gpg --homedir /tmp --no-default-keyring --keyring /etc/apt/keyrings/scylladb.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 491C93B9DE7496A7"...
< t:2024-10-30 03:50:50,949 f:base.py         l:231  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.6.79>: gpg: keyserver receive failed: End of file
< t:2024-10-30 03:50:50,949 f:base.py         l:147  c:RemoteLibSSH2CmdRunner p:ERROR > <10.4.6.79>: Error executing command: "sudo gpg --homedir /tmp --no-default-keyring --keyring /etc/apt/keyrings/scylladb.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 491C93B9DE7496A7"; Exit status: 2
< t:2024-10-30 03:50:50,949 f:decorators.py   l:74   c:sdcm.utils.decorators p:DEBUG > '_run': failed with '<UnexpectedExit: cmd='sudo gpg --homedir /tmp --no-default-keyring --keyring /etc/apt/keyrings/scylladb.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 491C93B9DE7496A7' exited=2>', retrying [#0]

On retry, the same command gets stuck and never advances:

< t:2024-10-30 03:50:55,955 f:remote_base.py  l:521  c:RemoteLibSSH2CmdRunner p:DEBUG > <10.4.6.79>: Running command "sudo gpg --homedir /tmp --no-default-keyring --keyring /etc/apt/keyrings/scylladb.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 491C93B9DE7496A7"...
< t:2024-10-30 03:50:56,437 f:db_log_reader.py l:125  c:sdcm.db_log_reader   p:DEBUG > 2024-10-30T03:50:56.425+00:00 perf-latency-nemesis-ubuntu-db-node-3cb19e38-5   !NOTICE | sudo[5719]: scyllaadm : PWD=/home/scyllaadm ; USER=root ; COMMAND=/usr/bin/gpg --homedir /tmp --no-default-keyring --keyring /etc/apt/keyrings/scylladb.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 491C93B9DE7496A7

Maybe we need to set a timeout for this call:

self.remoter.sudo(f"gpg --homedir /tmp --no-default-keyring --keyring /etc/apt/keyrings/scylladb.gpg "
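
A minimal sketch of what a bounded call could look like, using only the standard library's `subprocess` module rather than the SCT remoter (whose signature is not shown in this issue). The helper name `run_with_timeout`, the retry count, and the timeout values are illustrative assumptions, not the project's actual fix.

```python
import subprocess
import time

# The command from the log above, split into an argument list.
GPG_CMD = [
    "sudo", "gpg", "--homedir", "/tmp", "--no-default-keyring",
    "--keyring", "/etc/apt/keyrings/scylladb.gpg",
    "--keyserver", "hkp://keyserver.ubuntu.com:80",
    "--recv-keys", "491C93B9DE7496A7",
]


def run_with_timeout(cmd, timeout, retries=3, delay=5.0):
    """Run cmd, killing it after `timeout` seconds; retry on failure or hang.

    Hypothetical helper: returns the CompletedProcess on success and raises
    RuntimeError once all attempts have failed or timed out.
    """
    last_error = None
    for attempt in range(retries):
        try:
            return subprocess.run(
                cmd, check=True, capture_output=True, text=True, timeout=timeout
            )
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError) as exc:
            # A hang surfaces as TimeoutExpired, a non-zero exit as
            # CalledProcessError; both are treated as retryable here.
            last_error = exc
            if attempt < retries - 1:
                time.sleep(delay)
    raise RuntimeError(f"command failed after {retries} attempts: {last_error!r}")

# Example use (not executed here): run_with_timeout(GPG_CMD, timeout=120)
```

With a bound like this, a keyserver that stops responding ends the attempt after `timeout` seconds instead of blocking the whole installation, and the caller gets an explicit error once the retries are exhausted.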

Impact

Describe the impact this issue causes to the user.

How frequently does it reproduce?

Describe the frequency with how this issue can be reproduced.

Installation details

Cluster size: 3 nodes (i3en.2xlarge)

Scylla Nodes used in this run:

  • perf-latency-nemesis-ubuntu-db-node-3cb19e38-5 (34.245.141.107 | 10.4.6.79) (shards: -1)
  • perf-latency-nemesis-ubuntu-db-node-3cb19e38-4 (3.253.111.118 | 10.4.6.173) (shards: 7)
  • perf-latency-nemesis-ubuntu-db-node-3cb19e38-3 (34.247.52.34 | 10.4.4.57) (shards: 7)
  • perf-latency-nemesis-ubuntu-db-node-3cb19e38-2 (3.249.87.58 | 10.4.4.254) (shards: 7)
  • perf-latency-nemesis-ubuntu-db-node-3cb19e38-1 (54.216.91.17 | 10.4.4.119) (shards: 7)

OS / Image: ami-0ae43759689e5e8e8 (aws: undefined_region)

Test: scylla-master-perf-regression-latency-650gb-with-nemesis
Test id: 3cb19e38-4a56-4c8d-8b25-e18ce6fc1783
Test name: scylla-master/perf-regression/scylla-master-perf-regression-latency-650gb-with-nemesis
Test method: performance_regression_test.PerformanceRegressionTest.test_latency_write_with_nemesis
Test config file(s):

Logs and commands
  • Restore Monitor Stack command: $ hydra investigate show-monitor 3cb19e38-4a56-4c8d-8b25-e18ce6fc1783
  • Restore monitor on AWS instance using Jenkins job
  • Show all stored logs command: $ hydra investigate show-logs 3cb19e38-4a56-4c8d-8b25-e18ce6fc1783

Logs:

Jenkins job URL
Argus

@soyacz
Contributor

soyacz commented Nov 12, 2024

Possibly a duplicate of #9017.

@fruch
Contributor

fruch commented Nov 13, 2024

@soyacz would add a timeout as part of #9017.

Closing as a duplicate.

fruch closed this as completed Nov 13, 2024