produce log-msgs for high-level errors, so diagnostic tool can evaluate #160

jti-lanl · 2016-09-08T16:31:35Z

Chris Hoffman has an idea for a high-level diagnostic tool that would watch various kinds of high-level log messages and try to abstract them into system-health evaluations. Our log is currently too verbose, or at least not yet categorized well enough, for this purpose. We should maybe produce top-level log messages (e.g. priority LOG_CRIT) in various circumstances (TBD). For example, maybe we note timeouts to the object-store at this level. Then syslogd could be configured to pass only messages at this level. Chris's tool could observe many timeouts to a specific server in a short time and conclude that there may be a system problem there. Or periodic timeouts to all servers might suggest a networking problem. If he wanted, he could dynamically reload the syslog configuration to see more-verbose messages. Chris can add comments to clarify ...
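
To make the timeout heuristic above concrete, here is a rough sketch of how such a tool might classify events. Nothing here is settled: the `TimeoutMonitor` name, the window length, the per-server threshold, and the server names are all hypothetical, and parsing the actual syslog lines is left out because the message format is still TBD. It only assumes the tool can extract a (server, timestamp) pair from each high-priority timeout message.

```python
from collections import defaultdict, deque


class TimeoutMonitor:
    """Hypothetical sliding-window counter for object-store timeout events."""

    def __init__(self, window_secs=60.0, per_server_limit=5):
        # Both thresholds are illustrative guesses, not tuned values.
        self.window_secs = window_secs
        self.per_server_limit = per_server_limit
        self.events = defaultdict(deque)  # server name -> recent timestamps

    def _expire(self, queue, now):
        # Drop timestamps that have fallen out of the window.
        while queue and now - queue[0] > self.window_secs:
            queue.popleft()

    def record(self, server, timestamp):
        """Note one timeout reported against `server` at `timestamp` (seconds)."""
        queue = self.events[server]
        queue.append(timestamp)
        self._expire(queue, timestamp)

    def evaluate(self, now, known_servers):
        """Return (verdict, servers) for the window ending at `now`.

        "network": every known server timed out at least once in the window.
        "server":  some servers reached the per-server limit.
        "ok":      neither condition holds.
        """
        counts = {}
        for server in known_servers:
            queue = self.events[server]
            self._expire(queue, now)
            counts[server] = len(queue)

        if counts and all(c > 0 for c in counts.values()):
            return ("network", sorted(counts))
        suspects = sorted(s for s, c in counts.items()
                          if c >= self.per_server_limit)
        if suspects:
            return ("server", suspects)
        return ("ok", [])
```

The tool would call `record()` for each timeout message it reads and `evaluate()` periodically. Checking "all servers" before "one server" is a deliberate but arguable choice: if everything is timing out, blaming individual servers is probably misleading.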

brettkettering · 2016-09-12T22:45:23Z

I'm pretty sure we have such a logging package written by a CMU employee for PLFS. We should be able to pull that into service rather easily.

Brett


jti-lanl added the enhancement label Sep 8, 2016
