v2.5.10 => First 'official' GitHub release
Cleanup script documentation and merged pull request from Waipeng
Showing 8 changed files with 34 additions and 807 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,129 +1,30 @@ | ||
Description: | ||
A health monitoring script for NetApp Data ONTAP Cluster-Mode filers. | ||
# Nagios plugin to check health of a NetApp Ontap cluster | ||
|
||
Download: | ||
https://github.com/willemdh/check_netapp_ontapi | ||
### Idea | ||
|
||
Known Issues: | ||
- Quota_health not working like it should. Needs more testing.. | ||
- Exclude and include issues with quotas. Needs more testing.. | ||
This Perl script is able to monitor most components of a NetApp Ontap cluster, such as volume, aggregate, | ||
snapshot, quota, snapmirror, filer hardware, port, interface, cluster and disk health. | ||
|
||
Project Status: | ||
Working - Beta | ||
### Status | ||
|
||
Patch Notes: | ||
0.8.6.11 | ||
- Set max records to 200 and removed space_to_bytes sub from $intUsedToBytes (no magnitude error) in calc_quota_health sub | ||
- Updated script header and documentation | ||
- Added if(defined..) to sub get_volume_space, because volumes in transferring mode for a syncing mirror, were causing errors | ||
- Added if(defined..) to sub calc_space_health, because offline aggregates were reported to cause errors | ||
- Updated code for quota_health, so correct sub is called atm | ||
Production ready. Please visit http://outsideit.net/check-netapp-ontap for more information. | ||
|
||
v0.6 | ||
- No longer stands in defiance of the laws of mathematics by attempting to divide by 0 when calculating disk health. (Thanks HW) | ||
- No longer attempts to monitor a volume that is being moved or provisioned. (Thanks HW) | ||
- Resolved a number of minor packaging and informational problems present in the 0.5 not-quite-release (Thanks WD) | ||
### How To | ||
|
||
v0.5 | ||
- *NEW* Quota monitoring (Warning: Completely untested and experimental. ) | ||
- *NEW* The new -n parameter allows you to filter the queries to only get information from a specific vhost or cluster node, depending on the check. Use the --help parameter to get a list of which checks are filtered to which objects. | ||
- check_netapp_ontapi.pl is now compatible with SDK 5.2 and hopefully OnTap 8.2. | ||
- significant changes have been made to snapmirror monitoring so that it now works as intended. | ||
- Physical port monitoring and vhost interface monitoring are now separate checks and can be accessed with check_port and check_interface respectively. | ||
Please visit http://outsideit.net/check-netapp-ontap for more information on how to use this plugin. | ||
|
||
v0.4 | ||
- Additional sanitization on disk_health check to prevent errors when a disk has no assigned home (Thanks WD). | ||
### Help | ||
|
||
v0.3: | ||
- The package has been changed to include the required version of the Netapp SDK (Thanks WD). | ||
In case you find a bug or have a feature request, please make an issue on GitHub. | ||
|
||
v0.2: | ||
- All pre-existing checks (volume, snapshot and aggregate) updated for better scalability. | ||
- Added six new check categories, see usage below for full list. | ||
### On Nagios Exchange | ||
|
||
v0.1: | ||
- First release | ||
http://exchange.nagios.org/directory/Plugins/Hardware/Storage-Systems/SAN-and-NAS/NetApp/Check-Netapp-Ontap/details | ||
|
||
Usage: | ||
1. Extract the contents of check_netapp_ontapi.zip to a temp directory and then navigate to it. | ||
2. Copy the contents of NetApp/* to your /usr/lib/perl5 directory to install the required version of the NetApp Perl SDK. | ||
3. Copy check_netapp_ontapi.pl script to your nagios libexec folder and set the correct permissions | ||
### Copyright | ||
|
||
--hostname, -H | ||
Hostname or address of the cluster administrative interface. | ||
--node, -n | ||
Name of a vhost or cluster-node to restrict this query to. | ||
--user, -u | ||
Username of a Netapp Ontapi enabled user. | ||
--password, -p | ||
Password for the netapp Ontapi enabled user. | ||
--option, -o | ||
The name of the option you want to check. See the option and threshold list at the bottom of this help text. | ||
--warning, -w | ||
A custom warning threshold value. See the option and threshold list at the bottom of this help text. | ||
--critical, -c | ||
A custom critical threshold value. See the option and threshold list at the bottom of this help text. | ||
--modifier, -m | ||
This modifier is used to set an inclusive or exclusive filter on what you want to monitor. | ||
--help, -h | ||
Display this help text. | ||
|
||
========================================= | ||
Option List | ||
========================================= | ||
volume_health | ||
desc: Check the space and inode health of a vServer volume. If space % and space in *B are both defined, the smaller value of the two will be used when deciding if the volume is in a warning or critical state. This allows you to better accommodate large volume monitoring. | ||
thresh: space % used, space in *B (e.g. MB) remaining, inode count remaining, inode % used (Usage example: 80%i), "offline" keyword | ||
node: The node option restricts this check by vserver name. | ||
|
||
aggregate_health | ||
desc: Check the space and inode health of a cluster aggregate. If space % and space in *B are both defined, the smaller value of the two will be used when deciding if the aggregate is in a warning or critical state. This allows you to better accommodate large aggregate monitoring. | ||
thresh: space % used, space in *B (e.g. MB) remaining, inode count remaining, inode % used (Usage example: 80%i), "offline" keyword, "is-home" keyword | ||
node: The node option restricts this check by cluster-node name. | ||
|
||
snapshot_health | ||
desc: Check the space and inode health of a vServer snapshot. If space % and space in *B are both defined, the smaller value of the two will be used when deciding if the volume is in a warning or critical state. This allows you to better accommodate large snapshot monitoring. | ||
thresh: space % used, space in *B (e.g. MB) remaining, inode count remaining, inode % used (Usage example: 80%i), "offline" keyword | ||
node: The node option restricts this check by vserver name. | ||
|
||
quota_health | ||
desc: Check that the space and file thresholds have not been crossed on a quota. | ||
thresh: N/A (defined on the storage system). | ||
node: The node option restricts this check by vserver name. | ||
|
||
snapmirror_health | ||
desc: Check the lag time and health flag of the snapmirror relationships. | ||
thresh: snapmirror lag time (valid intervals are s, m, h, d). | ||
node: The node option restricts this check by snapmirror destination cluster-node name. | ||
|
||
filer_hardware_health | ||
desc: Check the environment hardware health of the filers (fan, psu, temperature, battery). | ||
thresh: component name (fan, psu, temperature, battery). There is no default alert level; they MUST be defined. | ||
node: The node option restricts this check by cluster-node name. | ||
|
||
port_health | ||
desc: Checks the state of a physical network port. | ||
thresh: N/A not customizable. | ||
node: The node option restricts this check by cluster-node name. | ||
|
||
interface_health | ||
desc: Check that a LIF is in the correctly configured state and that it is on its home node and port. Additionally checks the state of a physical port. | ||
thresh: N/A not customizable. | ||
node: The node option restricts this check by vserver name. | ||
|
||
netapp_alarms | ||
desc: Check for Netapp console alarms. | ||
thresh: N/A not customizable. | ||
node: The node option restricts this check by cluster-node name. | ||
|
||
cluster_health | ||
desc: Check the cluster disks for failure or other potentially undesirable states. | ||
thresh: N/A not customizable. | ||
node: The node option restricts this check by cluster-node name. | ||
|
||
disk_health | ||
desc: Check the health of the disks in the cluster. | ||
thresh: Not customizable yet. | ||
node: The node option restricts this check by cluster-node name. | ||
|
||
For keyword thresholds, if you want to ignore alerts for that particular keyword you set it at the same threshold that the alert defaults to. | ||
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public | ||
License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later | ||
version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the | ||
implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more | ||
details at <http://www.gnu.org/licenses/>. |
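
As an illustration of the `snapmirror_health` threshold format listed in the option list above (a lag time with an `s`, `m`, `h`, or `d` interval suffix), here is a minimal Python sketch of how such a threshold could be converted to seconds and mapped onto the standard Nagios exit codes. It is not taken from the plugin's Perl source: the function names `parse_lag` and `lag_status` are hypothetical, and the use of `>=` for the comparison is an assumption.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Seconds per interval suffix (s, m, h, d, as listed for snapmirror_health).
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_lag(threshold: str) -> int:
    """Convert a lag threshold such as '30m' into seconds (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([smhd])", threshold.strip())
    if match is None:
        raise ValueError(f"invalid lag threshold: {threshold!r}")
    value, unit = match.groups()
    return int(value) * _UNIT_SECONDS[unit]


def lag_status(lag_seconds: int, warning: str, critical: str) -> int:
    """Map an observed lag onto a Nagios exit code.

    Assumption: a lag equal to the threshold already triggers the alert.
    """
    if lag_seconds >= parse_lag(critical):
        return CRITICAL
    if lag_seconds >= parse_lag(warning):
        return WARNING
    return OK
```

Under these assumptions, `lag_status(7200, "1h", "1d")` evaluates to `WARNING`, because a two-hour lag exceeds the one-hour warning threshold but not the one-day critical threshold.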