Slow IO cause stateroot not bind mounted #3109
Comments
Thanks for the report. I assume your storage must be really slow to get this... is this really the only thing that fails? It looks to me like in current git main systemd basically hardcodes the default (compiled in) timeout when running generators: https://github.com/systemd/systemd/blob/a3680a45d0356ff3ee40dcf1d697326497a3949c/src/core/manager.c#L4123 So I'm not sure there's much we can do here; you'll need to file this with systemd; so closing here. (But without prejudice, feel free to reopen if you disagree.) Probably in cases like this they may need to tune down parallelism, or support longer timeouts. It's an IMO open question whether it's better to have a system with a broken generator just fail to boot, or boot but in a potentially undefined state.
@cgwalters Thanks for your reply. In our case, the I/O latency of the underlying volume reaches the level of seconds. Maybe we should fix the disk issue first 🤣. But in my opinion, OSTree should abort the startup process when the stateroot is not mounted correctly. So we wrote a hack systemd service that aborts the startup process and restarts if the stateroot is not mounted.
OK hmmm...we can try to investigate this at some point. But a question: do you just not have many other generators? Or maybe the real thing we should do is change things so that
Yes! The main issue is the stateroot mounting problem, which causes services (like containerd) to write to the wrong place and lose their data on the next reboot. The reboot is just a mitigation for the stateroot not being mounted.
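For readers who want to detect the condition discussed above, here is a minimal sketch of a mount check. It is not the service used by the reporter, which is not shown here. It assumes that a correctly booted system has /var as its own mount point (as produced by var.mount) and that the mount table is available in the /proc/self/mountinfo format. The function names mount_points and stateroot_mounted are hypothetical.

```python
def mount_points(mountinfo_text):
    """Collect mount points from text in /proc/self/mountinfo format."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # In mountinfo, the fifth whitespace-separated field is the mount point.
        if len(fields) >= 5:
            points.add(fields[4])
    return points


def stateroot_mounted(mountinfo_text, target="/var"):
    """Return True if `target` appears as a mount point.

    Assumption: a missing /var mount means var.mount was never generated.
    """
    return target in mount_points(mountinfo_text)


def check(path="/proc/self/mountinfo"):
    """Read the live mount table (Linux only) and report whether /var is mounted."""
    with open(path) as f:
        return stateroot_mounted(f.read())
```

A boot-time unit could call check() and fail or reboot when it returns False; whether failing the boot is preferable to booting in an undefined state is the open question raised earlier in this thread.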
v2022.2
v239
What happened?
We used ostree and rpm-ostree to build a CoreOS-like OS on top of Rocky. If disk I/O is very slow during boot, the systemd generator times out and is terminated by an internal ALRM signal. As a result, the var.mount unit is not generated, so the stateroot is not mounted.
How to reproduce?
Inject latency into the disk just before Switch root.