-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added WSL2 fix for kubelet #2390
Conversation
Welcome @networkop! |
Hi @networkop. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
signed the CLA |
@@ -204,6 +204,10 @@ fix_cgroup() { | |||
# See: https://man7.org/linux/man-pages/man7/cgroups.7.html | |||
local current_cgroup | |||
current_cgroup=$(grep -E '^[^:]*:([^:]*,)?cpu(,[^,:]*)?:.*' /proc/self/cgroup | cut -d: -f3) | |||
# on WSL2 systems /sys/fs/cgroup/systemd dir is missing which eventually leads to kubelet failing to start. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it WSL2 or just distros without systemd?
Are there WSL2 distros with systemd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this stage I'm only certain about WSL2. I can reword this to say "and potentially other distros without systemd"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
# on WSL2 systems /sys/fs/cgroup/systemd dir is missing which eventually leads to kubelet failing to start. | ||
# see: https://github.com/kubernetes-sigs/kind/issues/2323 | ||
# the following mkdir command is a common workaround, see: https://github.com/microsoft/WSL/issues/4189#issuecomment-758439957 | ||
mkdir -p /sys/fs/cgroup/systemd/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically I don't think this (cgroup controller heirarchy) has to be mounted at /sys/fs/cgroup
though I've not ever encountered it mounted elsewhere.
see: https://man7.org/linux/man-pages/man7/cgroups.7.html
This directory would also normally be configured like mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually already depend on it being under /sys/fs/cgroup
below though, regarding the first point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... And it's not just kind assuming this in the ecosystem https://kubernetes.slack.com/archives/C0BP8PW9G/p1627412176214900
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome, that's good to know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as for the mounting, I believe this will be done by this line
The current placement of this mkdir command is deliberate so that it gets picked up by the cgroup_subsystems
on the next line and then passed to the mount_kubelet_cgroup_root "/kubelet" "${subsystem}"
At least, this is how it works for me. I've added just this one line and this allowed for the cluster to start up and results in the following mount being created by the entrypoint
root@k8s-guide-control-plane:/# mount | grep kubelet | head -n 1
cgroup on /sys/fs/cgroup/cpuset/kubelet type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
For details see kubernetes-sigs/kind#2323 and kubernetes-sigs/kind#2390
something's not right, investigating |
I think I've finally found a stable workaround which works across reboots and docker restarts. The conclusion is that any changes done by the entrypoint to any path under So the alternative is to add an extra |
/ok-to-test |
I was able to reproduce the failed log message on WSL2. docker exec -it kind-control-plane /bin/sh -euc "if [ ! -f /sys/fs/cgroup/cgroup.controllers ] && [ ! -d /sys/fs/cgroup/systemd/kubelet ]; then mkdir -p /sys/fs/cgroup/systemd/kubelet; fi" /lgtm |
I'm taking a break from some other things I've been on and doing some catchup today |
75cc065
to
de1f245
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: networkop The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
reabsed, squashed, pushed base image. pushing node image. |
/lgtm |
/retest |
New changes are detected. LGTM label has been removed. |
/retest |
the docker rootless CI actions flake seems somewhat concerning, running it again and the prow / 1.19 CI |
|
Can someone test it using environment from #2440, to make sure it actually fixes it? Or tell me how to test it myself… |
/retest |
I'll incorporate this in the base bump #2465 for 0.12 |
rebased version merged in #2465, thank you! |
This PR fixes #2323.