Scylla Manager Agent overwhelming Scylla Manager #3920
Since increasing the CPU limit from 1000m to 4000m, pod monitoring shows usage sitting around 1200m, so "overwhelming" is slightly misleading. Regardless, I'm still seeing errors in the Scylla Manager Agent, and the high bandwidth usage remains.
Rclone is the tool sm-agent uses to transfer data to/from supported backup locations, so it will always appear in the logs.
That's true, the location should be checked only when adding or running a backup task. But in order to verify that, it would be good to collect logs with must-gather.
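If the operator's must-gather tooling isn't readily available in your environment, a minimal manual substitute using plain `kubectl` might look like the sketch below. All namespace, deployment, and pod names are assumptions based on this thread; adjust them to your deployment.

```shell
# Hedged sketch: manually collecting the logs that must-gather would gather.
# Resource names below are assumptions, not the documented procedure.
kubectl -n scylla-manager logs deploy/scylla-manager --since=1h > scylla-manager.log
kubectl -n scylla logs scylla-cluster-dc1-rack1-0 \
  -c scylla-manager-agent --since=1h > scylla-manager-agent.log
```

The agent runs as a sidecar container in each Scylla pod, which is why the second command selects it with `-c scylla-manager-agent`.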
Leaving a cross-reference to scylladb/scylla-operator#1827 here for the record.
+1. @Michal-Leszczynski feel free to transfer this to the operator repo when you have the artifacts and verify it's not a manager issue.
@Michal-Leszczynski @rzetelskik Thanks for the reply. Is there somewhere I can privately send you the output of must-gather?
You can send it via email to [email protected]
@rwnd-bradley I looked at the logs and it seems like this is indeed an example of scylladb/scylla-operator#1827. Let's close this issue as a duplicate of scylladb/scylla-operator#1827 and watch its progress there.
Hi, I'm using the Scylla Helm stack and have observed 10 requests per second per Scylla node to my nginx ingress, which sits in front of a Ceph S3 cluster. Based on the logs, it appears that the Scylla Manager Agent is constantly checking the availability of the S3 location, to the extent that it's reaching 25MB/s. Connections to `192.168.138.32` (the `scylla-manager-7b74b68c49-778ts` pod) also frequently drop, likely due to the Scylla Manager pod's CPU maxing out. I also notice "rclone" in the logs, while I have only configured S3 in `scylla-manager-agent.yaml`, defined in the `scylla-manager-agent-config` Scylla Manager secret.

Scylla scylla-manager-agent pod:

Nginx Ingress:
Is this intended behavior? I find it hard to believe that the location needs to be checked so frequently when my backup task is scheduled once every 4 hours.
scylla version: 6.0.1
scylla agent version: 3.3.0
scylla manager version: 3.3.0
scylla operator version: 1.13
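For context, the S3 section of `scylla-manager-agent.yaml` the report refers to typically looks like the sketch below; the provider, endpoint, and credentials here are placeholder assumptions for a Ceph RGW behind an nginx ingress, not values from this deployment.

```yaml
# Sketch of the agent's S3 backup-location configuration (placeholder values).
s3:
  provider: Ceph                          # assumption: Ceph object gateway
  endpoint: https://s3.example.internal   # placeholder endpoint behind the ingress
  access_key_id: <ACCESS_KEY>
  secret_access_key: <SECRET_KEY>
```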