Remove infinite loop from Local Watchdog (LWD) #86
Comments
Quoting myself from #76 (comment)
Regarding point 1,
Has this happened before? The purpose of
So, ideally, In either case, we need better understanding of
I am aware of what 'WatchFile' is doing: it waits for completion of the sysrq-trigger requests.
I had proposed 5 minutes as a starting point, pending other opinions. I felt 5 minutes is an improvement
The fix for Bugzilla 1813935 now interrupts the infinite loop. Because the directory
did not exist, the 20_sysinfo plugin got stuck in an infinite loop. In my opinion, the
plugin should time out after a period of time.
My reasons for this:
sending output to the log file, and this file monitoring for more messages will
never quit until the External Watchdog (EWD) fires.
long a period of time.
failure has occurred. With the LWD plugin, it is especially difficult since restraint
doesn't have another timer to interrupt this, so it's not until half an hour
later that the EWD kicks in to kill the job.
Solution:
I propose we quit the loop after about 5 minutes. If the timer expires, we will generate
a message reflecting this and report the journalctl/message log with whatever has been
captured up to that point.
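
To make the proposal concrete, here is a rough sketch of a bounded file watch in Python. It is not the plugin's actual code: the function name `watch_log`, the completion marker, and the polling interval are all hypothetical, and the real 20_sysinfo plugin may be structured quite differently. The idea is only that the loop checks a deadline on every pass, so a missing directory or a silent log ends with a timeout instead of spinning until the EWD fires.

```python
import os
import time


def watch_log(path, done_marker, timeout_s=300.0, poll_s=0.5):
    """Collect complete lines appended to `path` until a line contains
    `done_marker` or `timeout_s` seconds pass.

    Returns (captured_lines, timed_out). All names here are illustrative.
    """
    deadline = time.monotonic() + timeout_s
    captured = []
    offset = 0  # byte offset of the first unread byte in the file

    while time.monotonic() < deadline:
        # A missing file or directory is not fatal: keep polling until the
        # deadline instead of looping forever.
        if os.path.exists(path):
            with open(path, "rb") as fh:
                fh.seek(offset)
                data = fh.read()
            # Only consume up to the last newline so a half-written line
            # is picked up whole on a later pass.
            end = data.rfind(b"\n") + 1
            offset += end
            for raw in data[:end].splitlines():
                line = raw.decode("utf-8", "replace")
                captured.append(line)
                if done_marker in line:
                    return captured, False
        time.sleep(poll_s)

    # Deadline reached: hand back whatever was captured so far.
    return captured, True
```

A caller would check the second return value and, when it is true, log a message saying the watch timed out before reporting the captured lines. The default of 300 seconds mirrors the 5 minutes suggested above and is meant as a starting point only.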