daemon fails to start when inotify initialization fails (out of watches?) #944
Comments
Thanks for the report. I suspect that something bad is going on on your nodes. Specifically, I think that the rpm-ostree daemon is now unable to set up an inotify watcher because the service (or the system as a whole) is hitting a resource limit. The Another quick smoke test would be to temporarily stop/drain all the workloads on the node (especially all the ones running as root) and see whether the service starts reacting to a simple Self-note: this failure mode is quite opaque, but I did some quick code walking in gio and it seems to be compatible with a failure when calling
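To check whether a node is actually close to its inotify limits, something like the following sketch may help. It is not part of rpm-ostree or Zincati; the function names `read_inotify_limits` and `inotify_usage` and the `proc_root` parameter are made up for illustration. It assumes a Linux `/proc` layout in which inotify descriptors show up as `anon_inode:inotify` symlinks under `/proc/<pid>/fd` and each watch appears as an `inotify ...` line in the matching `fdinfo` file. Note that the kernel enforces the limits per user, so the per-process numbers are only a hint at which workload is consuming them.

```python
import os
from pathlib import Path


def read_inotify_limits(proc_root="/proc"):
    """Return the fs.inotify.* sysctl limits as a dict (None if unreadable)."""
    base = Path(proc_root) / "sys" / "fs" / "inotify"
    limits = {}
    for name in ("max_user_instances", "max_user_watches", "max_queued_events"):
        try:
            limits[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


def inotify_usage(proc_root="/proc"):
    """Map PID -> (inotify instances, inotify watches) for visible processes."""
    usage = {}
    try:
        entries = list(Path(proc_root).iterdir())
    except OSError:
        return usage  # no procfs available (e.g. not running on Linux)
    for entry in entries:
        if not entry.name.isdigit():
            continue  # skip non-process entries such as /proc/sys
        try:
            fds = list((entry / "fd").iterdir())
        except OSError:
            continue  # process exited, or we lack permission to inspect it
        instances = watches = 0
        for fd in fds:
            try:
                if os.readlink(fd) != "anon_inode:inotify":
                    continue
            except OSError:
                continue
            instances += 1
            try:
                info = (entry / "fdinfo" / fd.name).read_text()
            except OSError:
                continue  # count the instance even if its watches are unknown
            watches += sum(
                1 for line in info.splitlines() if line.startswith("inotify ")
            )
        if instances:
            usage[int(entry.name)] = (instances, watches)
    return usage


if __name__ == "__main__":
    print("limits:", read_inotify_limits())
    # Show the ten processes holding the most watches.
    top = sorted(inotify_usage().items(), key=lambda kv: kv[1][1], reverse=True)
    for pid, (instances, watches) in top[:10]:
        print(f"pid={pid} instances={instances} watches={watches}")
```

Run it as root to see all processes; without root, processes owned by other users are silently skipped, so the totals will be an undercount.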
Thank you so much for the great input; I think that was the reason. I stopped docker (since the node only runs container workloads) and rpm-ostreed worked again. After that I increased the inotify limits in sysctl.conf, and it also worked with docker running:
However, a new problem appeared: updating is still not working. Every time the host is started, Zincati installs the newest update correctly and I can see it in This is the log starting at a fresh boot until Zincati tries to update again after a reboot.
@aendi123 it would be helpful to see the logs from However, as you said "when I reboot", I suspect you are forcing a manual reboot instead of letting Zincati reboot the machine to finalize the update.
I think this expected "workflow" plays back into coreos/zincati#498
Oops, you are absolutely right. It works correctly when Zincati reboots the host.
Describe the bug
The rpm-ostreed.service fails: sometimes directly on boot, sometimes after running for a few minutes. If it does run for a few minutes after booting, rollbacks work normally. Zincati installs updates correctly, but the new version is never activated on the next reboot, and after the reboot it no longer shows up in rpm-ostree status.
The following logs are produced when I try to run systemctl start rpm-ostreed after a crash:
Reproduction steps
I have done nothing special. The machine was set up about a year ago with Fedora CoreOS 33 and now runs 34. Updates were always installed automatically. It is used as a Rancher RKE node. Two other machines with the exact same config/function don't have this problem.
Expected behavior
Rpm-ostree should work.
Actual behavior
Rpm-ostree crashes on boot or after a few minutes. I am unable to install updates, but rollbacks work.
System details