
scx: Exit a scheduler for unhandled hotplug events #177

Merged: 4 commits merged into sched_ext from hotplug_restart on Apr 10, 2024

Conversation

@Byte-Lab (Collaborator) commented Apr 9, 2024

A scheduler may implement ops.cpu_online() and ops.cpu_offline() to be
notified of CPU onlining and offlining events respectively. If a
scheduler does not implement these callbacks, it's a strong indication
that it does not support hotplug. Given that a scheduler which doesn't
support hotplug is essentially guaranteed to behave incorrectly when a
hotplug event occurs, let's update ext.c to do the sane thing and exit
the scheduler automatically.
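
For illustration, here is a minimal sketch of what opting into hotplug
support looks like on the BPF side. The scheduler name and callback
bodies are hypothetical; only the sched_ext_ops field names come from
the API:

```c
#include <scx/common.bpf.h>

char _license[] SEC("license") = "GPL";

void BPF_STRUCT_OPS(myscx_cpu_online, s32 cpu)
{
	/* Fold the newly onlined CPU into any cached topology state. */
}

void BPF_STRUCT_OPS(myscx_cpu_offline, s32 cpu)
{
	/* Drop any per-CPU state kept for the outgoing CPU. */
}

SEC(".struct_ops.link")
struct sched_ext_ops myscx_ops = {
	.cpu_online	= (void *)myscx_cpu_online,
	.cpu_offline	= (void *)myscx_cpu_offline,
	.name		= "myscx",
};
```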

Given that there may be other events in the future that could cause the
scheduler to initiate an exit, we create a new enum scx_exit_code type
that reserves the top bit of the exit_code field in struct
scx_exit_info, and defines SCX_ECODE_RESTART. As an alternative, we
could instead just return something like -EAGAIN to signal that user
space may try restarting the scheduler.
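
A sketch of the encoding described above, assuming the restart hint
lives in the top bit as stated; the exact definition in the patch may
differ:

```c
enum scx_exit_code {
	/*
	 * Top bit of scx_exit_info.exit_code is reserved by the kernel.
	 * When set, user space may try restarting the scheduler.
	 */
	SCX_ECODE_RESTART = 1LLU << 63,
};
```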

Note that this isn't a 100% foolproof, race-free hotplug detection
mechanism. For some schedulers, if a CPU is hotplugged after the host
topology is inspected but before the scheduler is attached, we could
run into problems. We'll need something like a hotplug generation
counter to accommodate this; we can take care of it in a separate
follow-up patch set.

We currently provide scx_ops_error() as a way for ext.c to cause a
scheduler to be evicted due to erroneous behavior (for example,
returning an invalid CPU from ops.select_cpu()). Now that we have a
method for exiting gracefully with an exit code from BPF programs, we
can similarly provide an scx_ops_exit() macro that allows ext.c to exit
and pipe an exit code up to user space.

This patch adds that macro. A subsequent patch will use it to exit on
hotplug events and plumb them up to user space.
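
As a sketch of how the new macro could sit alongside the existing one;
the shared scx_ops_exit_kind() helper and the exit-kind constants here
are assumptions based on the description above:

```c
/* Eviction due to erroneous scheduler behavior; no exit code. */
#define scx_ops_error(fmt, args...)					\
	scx_ops_exit_kind(SCX_EXIT_ERROR, 0, fmt, ##args)

/* Graceful kernel-initiated exit that pipes @code up to user space. */
#define scx_ops_exit(code, fmt, args...)				\
	scx_ops_exit_kind(SCX_EXIT_UNREG_KERN, (code), fmt, ##args)
```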

Signed-off-by: David Vernet <[email protected]>
Now that we have bits reserved for system exit code reasons and actions,
as well as bits available for use by user space, let's add some
ease-of-use macros to user_exit_info.h. A subsequent patch will add
selftests that use these macros.
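
For example, a helper along these lines; the macro name is an
assumption, and the actual additions to user_exit_info.h may differ:

```c
/* True if the kernel set the restart hint in the exit code. */
#define UEI_ECODE_RESTART(ecode) (!!((ecode) & SCX_ECODE_RESTART))
```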

Signed-off-by: David Vernet <[email protected]>
We've recently added some logic related to hotplug:

- If a hotplug event occurs and a scheduler hasn't implemented a
  callback for it, the kernel automatically exits the scheduler with
  specific, built-in exit codes

- With scx_bpf_exit(), a scheduler can instead choose to exit manually
  in response to a hotplug event, or do something else entirely. In
  either case, the scheduler should _not_ be automatically exited by
  the kernel

Let's add selftests to validate these conditions; a sketch of the
second case follows below.
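
A minimal sketch of the second case: a test scheduler that claims the
hotplug event itself by exiting manually with a user-defined code. The
constant and callback name are hypothetical:

```c
#include <scx/common.bpf.h>

char _license[] SEC("license") = "GPL";

/* Arbitrary user-defined exit code for the selftest to check for. */
#define HOTPLUG_EXIT_ONLINE 42

void BPF_STRUCT_OPS(hotplug_cpu_online, s32 cpu)
{
	/* We handled the event ourselves, so the kernel must not auto-exit. */
	scx_bpf_exit(HOTPLUG_EXIT_ONLINE, "CPU %d came online", cpu);
}
```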

Signed-off-by: David Vernet <[email protected]>
@arighi (Collaborator) commented Apr 10, 2024

I'm not sure if relying purely on the implementation of ops.cpu_online() / ops.cpu_offline() is ideal. For instance, scx_simple works completely fine with CPU hotplugging, but it doesn't implement these methods. I'm wondering if having a dedicated flag would give more flexibility, something like `SCX_OPS_CPU_HOTPLUG_EXIT` or similar? That way it's up to the sched developer to determine whether the scheduler supports CPU hotplugging correctly or not, wdyt?

@Byte-Lab (Collaborator, Author) replied:

> I'm not sure if relying purely on the implementation of ops.cpu_online() / ops.cpu_offline() is ideal. For instance, scx_simple works completely fine with CPU hotplugging, but it doesn't implement these methods. I'm wondering if having a dedicated flag would give more flexibility, something like `SCX_OPS_CPU_HOTPLUG_EXIT` or similar? That way it's up to the sched developer to determine whether the scheduler supports CPU hotplugging correctly or not, wdyt?

That would indeed be more convenient for scx_simple. On the other hand, scx_simple does have the flexibility of supporting hotplug by implementing empty versions of ops.cpu_online() and ops.cpu_offline(), albeit with a bit more boilerplate than just specifying an ops flag (see the sketch below). It's arguably a bit more confusing / leaky (in terms of the API) to have behavior depend on the implementation of callbacks, but there is precedent for that; see e.g. ops.update_idle() and ops->flags & SCX_OPS_KEEP_BUILTIN_IDLE.

We'd also have to decide what the behavior should be for a scheduler that implements ops.cpu_online() / ops.cpu_offline() but also specifies that flag. I'm guessing we would want to reject loading the scheduler? At that point, the API is arguably no more leaky than just implementing the restart behavior when the callbacks aren't specified.
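
For concreteness, the boilerplate in question is just a pair of no-op
stubs wired into the ops struct; the callback names here are
hypothetical:

```c
/* Empty handlers that would keep a hotplug-agnostic scheduler loaded. */
void BPF_STRUCT_OPS(simple_cpu_online, s32 cpu) {}
void BPF_STRUCT_OPS(simple_cpu_offline, s32 cpu) {}
```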

I'd expect that the cases where a scheduler cares about hotplug are the ones where it has some view of the host topology, and generally speaking I'd expect that to be the norm for most production schedulers. I think the question is whether we want to do what we think is "the right thing" for scheduler correctness in the common case (meaning: assume a scheduler that doesn't implement the callbacks is probably not going to behave correctly when a hotplug event occurs), or to instead have a standalone flag that completely controls the behavior. Given that I think it's probably atypical for a scheduler to be completely agnostic to hotplug changes, my inclination is to err on the side of always exiting the scheduler if the callbacks aren't defined.

Wdyt?

@arighi (Collaborator) commented Apr 10, 2024

Hm... good point about implementing empty cpu_online / cpu_offline callbacks to keep the scheduler active. Despite the additional boilerplate code, it's probably the right thing to do, and it doesn't add extra complexity. And, as you pointed out, there's also the ops.update_idle() precedent, which makes this approach more compelling. I also agree that the default behavior should be "exit on a CPU hotplug event if the scheduler ignores CPU hotplug events".

All that said, I agree with this approach, thanks for clarifying it.

@htejun merged commit 37b3f83 into sched_ext on Apr 10, 2024
1 check passed
@htejun deleted the hotplug_restart branch on April 10, 2024 at 17:55