-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rbd: support QoS based on capacity for rbd volume #5016
base: devel
Are you sure you want to change the base?
Conversation
8a47010
to
c07a41a
Compare
63ea87d
to
f60f7c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! This is really a nice feature to have. A couple of comments that need to be addressed to improve the maintainability and usability.
- move all qos related functions, structs, constants to a new
qos.go
file - all new functions that receive a *rbdVolume as argument should become a function of the rbdVolume struct (see example for the
SetQOS
function) - new parameters for the StorageClass need to be added to the documentation
- documentation should explain that the QoS settings are currently only used in combination with the NBD mounter
- ideally add unit tests for functions that do not interact with a Ceph cluster
- ideally add some form or e2e testing, maybe set QoS parameters in the StorageClass, and verify that those are set on an RBD-image?
Thanks again, this is a great start!
2b895ad
to
31ce867
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
31ce867
to
2b56d76
Compare
Looks quite complete, many thanks for adding the unit and e2e tests! |
/test ci/centos/mini-e2e/k8s-1.31 |
not sure why that did not work, but the CI is running now (takes ~2 hours to complete): https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/job/mini-e2e_k8s-1.31/283/display/redirect |
Write it briefly to verify that e2e testing is available. More scenes need to be added. :-) |
4d90640
to
0c09729
Compare
47bd9ab
to
795ea78
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
bd4d17e
to
4774a82
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
1 similar comment
/test ci/centos/mini-e2e/k8s-1.31 |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions. |
docs/rbd/deploy.md
Outdated
| `BaseWriteIops` | no | the base limit of write operations per second | | ||
| `BaseReadBytesPerSecond` | no | the base limit of read bytes per second | | ||
| `BaseWriteBytesPerSecond` | no | the base limit of write bytes per second | | ||
| `ReadIopsPerGB` | no | the limit of read operations per GiB | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the key GB
is used and in the comments its mentioned GiB
is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, there is confuse, we should use unit of bytes(IEC).
docs/rbd/deploy.md
Outdated
| `WriteIopsPerGB` | no | the limit of write operations per GiB | | ||
| `ReadBpsPerGB` | no | the limit of read bytes per GiB | | ||
| `WriteBpsPerGB` | no | the limit of write bytes per GiB | | ||
| `BaseVolSizeBytes` | no | the min size of volume what use to calc qos beased on capacity | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
calc
to calculate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -73,6 +73,15 @@ make image-cephcsi | |||
| `stripeUnit` | no | stripe unit in bytes | | |||
| `stripeCount` | no | objects to stripe over before looping | | |||
| `objectSize` | no | object size in bytes | | |||
| `BaseReadIops` | no | the base limit of read operations per second | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can all these operations can be supported for krbd
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these operations can be supported for krbd
, but we need to set QoS by cgroup after get QoS limit result by calcQosBasedOnCapacity
. if we support burst QoS, krbd
can not be support.
e2e/rbd.go
Outdated
qosParameters := map[string]string{ | ||
"BaseReadIops": "2000", | ||
"BaseWriteIops": "1000", | ||
"BaseReadBytesPerSecond": "209715200", | ||
"BaseWriteBytesPerSecond": "104857600", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please define a var for numeric values as its used in two places to have single source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
|
||
pvc.Namespace = f.UniqueName | ||
pvc.Spec.Resources.Requests[v1.ResourceStorage] = resource.MustParse("100Gi") | ||
wants = map[string]string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what will happen for cloned PVC or PVC created from snapshot, are they going to get the same values or different values?
Can you please add a test case of that as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cloned PVC or PVC created from snapshot get the same values after my testing on the ceph.
internal/rbd/qos.go
Outdated
if errors.Is(err, librbd.ErrNotFound) { | ||
// rbdQosPerGBType does not exist, set it empty. | ||
perGBLimit = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need explict setting it to empty string here , AFAIK it returns empty string in case of an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
internal/rbd/qos.go
Outdated
} else if err != nil { | ||
log.ErrorLog(ctx, "failed to get metadata: %s. %v", param.rbdQosPerGBType, err) | ||
|
||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we dont need if and else if check here. we can have single check err !=nil && !errors.Is(err, librbd.ErrNotFound)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
internal/rbd/qos.go
Outdated
if errors.Is(err, librbd.ErrNotFound) { | ||
// rbdBaseQosVolSize does not exist, set it empty. | ||
baseVolSize = "" | ||
} else if err != nil { | ||
log.ErrorLog(ctx, "failed to get metadata: %s. %v", baseQosVolSize, err) | ||
|
||
return nil, err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here as well we dont need two checks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
internal/rbd/qos.go
Outdated
) error { | ||
rbdQosParameters, err := getRbdImageQOS(ctx, rv) | ||
if err != nil { | ||
log.ErrorLog(ctx, "get rbd image qos failed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log the error and also why we are only logging here not in other places in this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this log deleted
return rbdQosParameters | ||
} | ||
|
||
func (rv *rbdVolume) SetQOS( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can use rbImage
instead of rbdVolume
for most of the methods getting added in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any particular opinion on this point, I respect your suggestion. :)
@@ -162,6 +162,38 @@ parameters: | |||
# stripeCount: <> | |||
# (optional) The object size in bytes. | |||
# objectSize: <> | |||
|
|||
# rbd volume QoS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use https://kubernetes.io/docs/concepts/storage/volume-attributes-classes/#the-volumeattributesclass-api as these values are not immutable as anything mentioned in the SC is mutable?
4774a82
to
f42036b
Compare
Pull request has been modified.
/test ci/centos/mini-e2e/k8s-1.31 |
f42036b
to
348ef23
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
348ef23
to
25a4c0d
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
1. QoS provides settings for rbd volume read/write iops and read/write bandwidth. 2. All QoS parameters are placed in the SC, send QoS parameters from SC to Cephcsi through PVC create request. 3. We need provide QoS parameters in the SC as below: - BaseReadIops - BaseWriteIops - BaseReadBytesPerSecond - BaseWriteBytesPerSecond - ReadIopsPerGB - WriteIopsPerGB - ReadBpsPerGB - WriteBpsPerGB - BaseVolSizeBytes There are 4 base qos parameters among them, when users apply for a volume capacity equal to or less than BaseVolSizebytes, use base qos limit. For the portion of capacity exceeding BaseVolSizebytes, QoS will be increased in steps set per GB. If the step size parameter per GB is not provided, only base QoS limit will be used and not associated with capacity size. 4. If PVC has resize request, adjust the QoS limit according to the QoS parameters after resizing. Signed-off-by: Yite Gu <[email protected]>
25a4c0d
to
0c1ef98
Compare
/test ci/centos/mini-e2e/k8s-1.31 |
There are 4 base qos parameters among them, when users apply for a volume capacity equal to or less than BaseVolSizebytes, use base qos limit. For the portion of capacity exceeding BaseVolSizebytes, QoS will be increased in steps set per GB. If the step size parameter per GB is not provided, only base QoS limit will be used and not associated with capacity size.
Describe what this PR does
Provide some context for the reviewer
Is there anything that requires special attention
Do you have any questions?
Is the change backward compatible?
Are there concerns around backward compatibility?
Provide any external context for the change, if any.
For example:
Related issues
Mention any github issues relevant to this PR. Adding below line
will help to auto close the issue once the PR is merged.
Fixes: #issue_number
Future concerns
List items that are not part of the PR and do not impact it's
functionality, but are work items that can be taken up subsequently.
Checklist:
guidelines in the developer
guide.
Request
notes
updated with breaking and/or notable changes for the next major release.
Show available bot commands
These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:
/retest ci/centos/<job-name>
: retest the<job-name>
after unrelatedfailure (please report the failure too!)