[BUG] Massive drain on CPU and disk I/O on container launch #116
Comments
Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.
It's done to prevent a different issue where the following chown operation can take longer than 20 minutes due to an overlayfs bug.
On my machine, this often causes mv to take over 10 minutes.
Doing a
Does
Sure, my concern is about
-if [[ -d "${PY_LOCAL_PATH}.bak" ]]; then
+if [[ -d "${PY_LOCAL_PATH}" ]]; then
     echo "**** New container detected, fixing python package permissions. This may take a while. ****"
-    mv "${PY_LOCAL_PATH}.bak" "${PY_LOCAL_PATH}"
     chown -R abc:abc "${PY_LOCAL_PATH}"
 fi
?
It doesn't. Not for our CI test either. https://ci-tests.linuxserver.io/linuxserver/homeassistant/latest/index.html (it takes 4 seconds for the move, and a split second for the chown). The chown without the move triggers the overlayfs COW bug, which results in the chown taking 20 minutes or longer. The move is an ugly hack to bypass the overlayfs bug. It's the best solution we have to ensure HA can run as a non-root user and still install Python packages at runtime.
For another data point it takes just under 2 minutes on my Pi to complete the move + chown:
The problem is it's impossible to detect the overlayfs bug ahead of time, so either we give everyone a slight delay on first run or we give a subset of users a 20+ minute startup delay. Really this is all because HA decided to randomly switch from using pip to using uv for installing packages and the latter can't do user installs properly.
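A rough way to gather a comparable data point is to time the move and the chown separately. The following is a minimal sketch, not the container's init script: the `timed` helper, the temporary directory, and the `example.py` file are hypothetical, and the ownership target is the current user so the sketch runs without root. It relies on bash's built-in `SECONDS` counter, which only has whole-second resolution.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run a command and store the elapsed whole seconds in ELAPSED.
timed() {
    local start=${SECONDS}
    "$@"
    ELAPSED=$(( SECONDS - start ))
}

# Hypothetical stand-in for the package directory; a real test would point at
# the directory inside a freshly created container.
DEMO_DIR="$(mktemp -d)"
TARGET="${DEMO_DIR}/python-packages"
mkdir -p "${TARGET}.bak"
touch "${TARGET}.bak/example.py"

# Time the move back from .bak.
timed mv "${TARGET}.bak" "${TARGET}"
MV_SECONDS=${ELAPSED}

# Time the recursive chown (current user here, an assumption for the demo).
timed chown -R "$(id -u):$(id -g)" "${TARGET}"
CHOWN_SECONDS=${ELAPSED}

echo "mv: ${MV_SECONDS}s, chown: ${CHOWN_SECONDS}s"
```

On a tiny temporary directory both numbers will be close to zero; the figures quoted in this thread come from the full package directory on a container filesystem.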
Ok, no worries, and thanks for the replies! I will keep this patch for myself then, as I don't mind the
@aptalca Out of curiosity, by "without the move", do you mean that you tried to do it at buildtime instead of runtime, and it triggered the bug? I'm wondering if me not having a bug (despite not doing the move at runtime) is worrisome or if it makes sense :P
You're focusing on the wrong thing. We intentionally move/rename the folder to .bak during build time so we can move/rename it back at runtime to avoid the overlayfs bug triggered by chown. Not everyone is affected by the bug, but many are (including some of our CI builders). We haven't been able to identify a common denominator for the devices affected by the bug. We just know that it's very common and we know that a move operation prior to chown prevents it.
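The two-stage pattern described here can be sketched as a self-contained script. It is an illustration under stated assumptions, not the image's actual build or init code: the temporary directory, the `example.py` file, and the `fix_permissions` function name are hypothetical, and the ownership target is the current user rather than `abc:abc` so the sketch runs without root. The runtime branch mirrors the `.bak` check, `mv`, and `chown -R` lines quoted earlier in this thread.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for the image's Python package directory.
WORKDIR="$(mktemp -d)"
PY_LOCAL_PATH="${WORKDIR}/python-packages"

# "Build time": populate the directory, then rename it to .bak.
mkdir -p "${PY_LOCAL_PATH}/site-packages"
touch "${PY_LOCAL_PATH}/site-packages/example.py"
mv "${PY_LOCAL_PATH}" "${PY_LOCAL_PATH}.bak"

# "Run time": move the directory back first, then chown it.
# The .bak check makes this a one-time step per new container.
fix_permissions() {
    local owner="$1"
    if [[ -d "${PY_LOCAL_PATH}.bak" ]]; then
        echo "**** New container detected, fixing python package permissions. This may take a while. ****"
        mv "${PY_LOCAL_PATH}.bak" "${PY_LOCAL_PATH}"
        chown -R "${owner}" "${PY_LOCAL_PATH}"
    fi
}

# Current user and group are an assumption so the demo needs no root.
fix_permissions "$(id -u):$(id -g)"

# A second call does nothing because the .bak directory is gone.
fix_permissions "$(id -u):$(id -g)"
```

Because the guard tests for the `.bak` directory, restarting an existing container skips both the move and the chown; only a newly created container pays the cost.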
@aptalca I guess I could have been clearer, my bad. I understand that you started with not doing any move operations and it was bad, but I am wondering if you tried to do the 2 move operations at buildtime, like what I'm basically doing now. As you mentioned, there would still be a need to chown at runtime though.
The overlayfs bug is a copy-on-write issue. Making changes to the container filesystem permissions at runtime causes a COW operation to write the changed file metadata to the host overlayfs storage and that's what causes the bug. For whatever reason a move operation does not trigger the same issue. It doesn't matter what you do at build time, if you change permissions on the container filesystem at runtime you can run into the bug.
You're not following what I'm trying to tell you.
You're trying to find a way to avoid a runtime move. What I'm trying to tell you is that we WANT a move operation at runtime to prevent the chown bug.
Looks like I'm misunderstanding something, so I won't push further. My patch works for me, I'm happy :P
Is there an existing issue for this?
Current Behavior
On boot, the container moves around a ton of python libraries
docker-homeassistant/root/etc/s6-overlay/s6-rc.d/init-config-homeassistant/run
Lines 16 to 20 in 336b6de
Instead, this can be done during the build https://github.com/schklom/Mirror-workflows/blob/850796baa622217b8c65a1d729cb83c98303111e/Dockerfile#L7-L18
although it makes the image larger in size
your latest on amd64 is 508.78 MB https://hub.docker.com/layers/linuxserver/homeassistant/latest/images/sha256-153fd08a9645b2c96334b86764d7b89d557f49b8845814e982f9f8e15a4cb9ed?context=explore
whereas mine is 927.39 MB https://hub.docker.com/layers/schklom/home-assistant/latest/images/sha256-efd225b5432868268f4f7cac0514f94c1798150b0c7161f68c1412be56b3b82a?context=repo
That mv step is skipped on container launch on my image, and everything runs smoothly. I can submit a PR if you like?
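For scale, the size trade-off quoted above can be worked out with a few lines of shell. This is only arithmetic on the two figures given in this issue; the variable names are hypothetical, and `awk` is used because bash has no floating-point arithmetic.

```shell
#!/usr/bin/env bash
set -euo pipefail
export LC_ALL=C  # keep "." as the decimal separator regardless of locale

# Image sizes in MB, as quoted in the issue text.
UPSTREAM_MB=508.78
PREBUILT_MB=927.39

# Absolute and relative size increase of the build-time variant.
DIFF_MB="$(awk -v a="${PREBUILT_MB}" -v b="${UPSTREAM_MB}" 'BEGIN { printf "%.2f", a - b }')"
RATIO="$(awk -v a="${PREBUILT_MB}" -v b="${UPSTREAM_MB}" 'BEGIN { printf "%.2f", a / b }')"

echo "extra size: ${DIFF_MB} MB (${RATIO}x the upstream image)"
```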
Expected Behavior
NA
Steps To Reproduce
NA
Environment
CPU architecture
x86-64
Docker creation
Container logs