produce log-msgs for high-level errors, so diagnostic tool can evaluate #160

Open
jti-lanl opened this issue Sep 8, 2016 · 1 comment

jti-lanl commented Sep 8, 2016

Chris Hoffman has an idea for a high-level diagnostic tool that would watch for certain high-level log messages and abstract them into system-health evaluations. Our log is currently too verbose, or at least not yet categorized well enough, for this purpose. We should maybe produce top-level log messages (e.g. priority LOG_CRIT) in various circumstances (TBD). For example, maybe we note timeouts to the object-store at this level. Then syslogd could be configured to see only messages at this level. Chris's tool could observe many timeouts to a specific server in a short time and conclude that there may be a problem with that server, while periodic timeouts to all servers might suggest a networking problem. If he wanted, he could dynamically reload the syslog configuration to see more-verbose messages. Chris can add comments to clarify ...
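To make the idea concrete, here is a rough sketch of what the evaluation side might look like. It is not Chris's tool: the message format (`<epoch-seconds> TIMEOUT server=<host>`), the names `parse_timeout` and `TimeoutMonitor`, and all thresholds are assumptions made up for illustration. It keeps a sliding window of timeouts per server, flags a single server that times out often, and flags a possible network problem when most servers show timeouts.

```python
from collections import defaultdict, deque

# All values below are illustrative assumptions, not MarFS settings.
WINDOW_SECS = 60      # how far back to look
SERVER_LIMIT = 5      # timeouts in the window that make one server suspect
WIDE_FRACTION = 0.8   # fraction of servers with timeouts that suggests a network issue


def parse_timeout(line):
    """Parse a hypothetical line '<epoch-seconds> TIMEOUT server=<host>'.

    Returns (timestamp, server), or None for any other line.
    """
    parts = line.split()
    if len(parts) != 3 or parts[1] != "TIMEOUT" or not parts[2].startswith("server="):
        return None
    try:
        ts = float(parts[0])
    except ValueError:
        return None
    return ts, parts[2][len("server="):]


class TimeoutMonitor:
    """Tracks recent timeouts per server and gives a coarse health verdict."""

    def __init__(self, servers, window=WINDOW_SECS, limit=SERVER_LIMIT, wide=WIDE_FRACTION):
        self.servers = set(servers)
        self.window = window
        self.limit = limit
        self.wide = wide
        self.events = defaultdict(deque)  # server -> timestamps, oldest first

    def observe(self, ts, server):
        self.events[server].append(ts)

    def evaluate(self, now):
        # Drop timestamps that have fallen out of the window.
        for q in self.events.values():
            while q and q[0] <= now - self.window:
                q.popleft()
        affected = sorted(s for s in self.servers if self.events.get(s))
        if self.servers and len(affected) >= self.wide * len(self.servers):
            return ("network", affected)
        suspects = sorted(s for s, q in self.events.items() if len(q) >= self.limit)
        if suspects:
            return ("server", suspects)
        return ("ok", [])
```

A tool like this would only need the high-priority stream that syslogd forwards, feeding each parsed line to `observe()` and calling `evaluate()` periodically.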

@brettkettering

I'm pretty sure we have such a logging package written by a CMU employee for PLFS. We should be able to pull that into service rather easily.

Brett


From: Jeff Inman
Sent: Thursday, September 8, 2016 10:31:36 AM
To: mar-file-system/marfs
Subject: [mar-file-system/marfs] produce log-msgs for high-level errors, so diagnostic tool can evaluate (#160)

