[CheckMK] monitoring should not alert on PB-size mounted disk space #5544

Open
acozine opened this issue Nov 19, 2024 · 0 comments

acozine commented Nov 19, 2024

Expected behavior

A mounted disk with hundreds of GB free should not alert just because total usage is above 80%. For example, /mnt/diglibdata1/pas is alerting because 1.30 PiB of 1.40 PiB are used. True, that is 93% usage, but the remaining 0.10 PiB is roughly 100 TiB, or more than 100,000 GiB. We have plenty of space.
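To make the arithmetic concrete, here is a minimal Python sketch of the numbers above, along with one possible alerting policy that considers absolute free space as well as the usage percentage. The function names and the 500 GiB floor are hypothetical, chosen only for illustration. This is not how CheckMK evaluates its rules.

```python
# Binary size units, matching the PiB figures CheckMK reports.
GIB = 2**30
TIB = 2**40
PIB = 2**50


def usage_summary(used_bytes, total_bytes):
    """Return the usage percentage and the free space in TiB and GiB."""
    free_bytes = total_bytes - used_bytes
    return {
        "used_percent": 100 * used_bytes / total_bytes,
        "free_tib": free_bytes / TIB,
        "free_gib": free_bytes / GIB,
    }


def should_alert(used_bytes, total_bytes,
                 max_used_percent=80.0, min_free_gib=500.0):
    """Hypothetical policy: alert only when the disk is both proportionally
    full AND short on absolute free space. Both thresholds are assumptions."""
    summary = usage_summary(used_bytes, total_bytes)
    return (summary["used_percent"] > max_used_percent
            and summary["free_gib"] < min_free_gib)


# The mount from this issue: 1.30 PiB used of 1.40 PiB.
pas = usage_summary(1.30 * PIB, 1.40 * PIB)
print(f"{pas['used_percent']:.1f}% used, {pas['free_tib']:.1f} TiB free")
print("alert:", should_alert(1.30 * PIB, 1.40 * PIB))
```

Under this policy a 100 GiB disk at 95% usage would still alert, because only 5 GiB remain, while the PB-size mount would not.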

Actual behavior

CheckMK returns a critical error.

Steps to replicate

Go to staging CheckMK and look at the slavery-staging1 host entry.

Impact of this bug

We get pure-noise alerts and may ignore alerts we should be paying attention to.

Implementation notes, if any

We solved this for basic disk space by creating a rule called Adjustable filesystem warnings. However, that rule is either configured incorrectly or does not apply to CIFS-mounted disks. We are seeing the same critical alerts on the bibdata-staging machines, which also mount drives from diglibdata1.
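One way to narrow down whether the rule skips CIFS mounts is to list which mount points on an affected host are actually CIFS and compare them with the services the rule matches. The sketch below parses text in the `/proc/mounts` format (device, mount point, filesystem type, options, dump, pass). The function name and the share name in the sample line are hypothetical, and the script says nothing about how CheckMK itself matches rules.

```python
def cifs_mount_points(mounts_text, fs_types=("cifs",)):
    """Return mount points whose filesystem type is in fs_types.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip blank or malformed lines instead of failing on them.
        if len(fields) >= 3 and fields[2] in fs_types:
            points.append(fields[1])
    return points


# Sample input; the share name //diglibdata1/pas is an assumption.
sample = """\
/dev/sda1 / ext4 rw,relatime 0 0
//diglibdata1/pas /mnt/diglibdata1/pas cifs rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev 0 0
"""
print(cifs_mount_points(sample))
```

On a real Linux host, the input could come from `open("/proc/mounts").read()`.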

@acozine acozine added the bug label Nov 19, 2024
@acozine acozine added the Operations pulls issues into the Operations ZenHub board label Jan 6, 2025
@aruiz1789 aruiz1789 self-assigned this Jan 8, 2025