
Resources required to delete a pool affect subsequent test run #117

Open

ASBishop opened this issue Sep 27, 2016 · 1 comment

Comments

@ASBishop

Many benchmarks delete and recreate the test pool between runs. However, the Ceph command to delete a pool returns immediately, and the work of deleting the objects in the pool takes place in the background. Unfortunately, experience has shown that the disk and CPU resources consumed while deleting the objects are significant enough to influence the test results for the subsequent run.

One way to avoid the problem is to have the cluster.rmpool() function wait until the disk and CPU utilization on the OSD nodes drops to a reasonable level before returning to the caller. I will be issuing a pull request with this change.
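
For illustration, here is a minimal sketch of such a wait loop, assuming psutil is available and the code runs directly on an OSD node. CBT itself samples remote OSD nodes rather than running locally, and wait_for_quiesce() and its parameters are hypothetical names, not CBT's API:

    # A minimal local sketch of the proposed wait; all names here are
    # assumptions, not CBT's actual implementation.
    import time

    import psutil  # third-party; assumed available on the node


    def wait_for_quiesce(disk_util_max=3.0, osd_cpu_max=3.0,
                         window=30.0, interval=1.0):
        """Block until every disk and every ceph-osd process stays below
        the utilization thresholds for `window` consecutive seconds."""
        quiet_for = 0.0
        prev = psutil.disk_io_counters(perdisk=True)
        # Prime per-process CPU accounting: the first cpu_percent() call
        # for a process always returns 0.0.
        osds = [p for p in psutil.process_iter(['name'])
                if p.info['name'] == 'ceph-osd']
        for p in osds:
            try:
                p.cpu_percent(None)
            except psutil.NoSuchProcess:
                pass
        while quiet_for < window:
            time.sleep(interval)
            cur = psutil.disk_io_counters(perdisk=True)
            # On Linux, busy_time is in milliseconds, so the delta over
            # the sampling interval converts to a utilization percentage.
            disk_util = max(
                (((cur[d].busy_time - prev[d].busy_time)
                  / (interval * 1000.0)) * 100.0
                 for d in cur if d in prev), default=0.0)
            prev = cur
            cpu_util = 0.0
            for p in osds:
                try:
                    cpu_util = max(cpu_util, p.cpu_percent(None))
                except psutil.NoSuchProcess:
                    pass  # OSD exited or restarted; ignore it
            if disk_util < disk_util_max and cpu_util < osd_cpu_max:
                quiet_for += interval  # idle sample; extend the quiet streak
            else:
                quiet_for = 0.0        # activity seen; restart the window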

ASBishop pushed a commit to ASBishop/cbt that referenced this issue Sep 27, 2016
When deleting a pool, it may take a while for the OSD nodes to delete the
objects in the pool. This change makes CBT wait until the OSD nodes quiesce
in order to ensure they are idle before starting the next test run.

Quiescing is done by waiting until the maximum disk utilization for any
disk falls below 3% across a 30-second window, and waiting until the maximum
CPU utilization for any ceph-osd process falls below 3%.

Closes ceph#117
@bengland2
Contributor

+1

ASBishop pushed a commit to ASBishop/cbt that referenced this issue Oct 25, 2016
When deleting a pool, it may take a while for the OSD nodes to delete the
objects in the pool. This change makes CBT wait until the OSD nodes quiesce
in order to ensure they are idle before starting the next test run.

Quiescing is done by waiting until the maximum disk utilization for any
disk falls below a threshold, and waiting until the maximum CPU utilization
for any ceph-osd process falls below a threshold. The thresholds can be
tuned using the following cluster configuration parameters (the default values
are listed):

cluster:
  quiesce_disk_util_max: 3
  quiesce_disk_window_size: 30
  quiesce_osd_cpu_max: 3

If quiesce_disk_util_max or quiesce_osd_cpu_max is zero, then the corresponding
disk/CPU quiescing operation is skipped.

Closes ceph#117
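
To connect the two, here is how the earlier sketch might consume those settings, with a zero threshold mapped to "skip the check" as the commit message describes. The wiring is hypothetical; only the parameter names come from the commit message above:

    # Hypothetical wiring of the cluster config into the earlier sketch.
    cfg = {'quiesce_disk_util_max': 3,
           'quiesce_disk_window_size': 30,
           'quiesce_osd_cpu_max': 3}

    # A zero threshold disables its check; substituting an infinite
    # threshold makes the comparison always pass, which has the same
    # effect inside the wait loop.
    wait_for_quiesce(
        disk_util_max=cfg['quiesce_disk_util_max'] or float('inf'),
        osd_cpu_max=cfg['quiesce_osd_cpu_max'] or float('inf'),
        window=cfg['quiesce_disk_window_size'])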
ASBishop pushed a commit to ASBishop/cbt that referenced this issue Nov 3, 2016

(cherry picked from commit 3d442c7)