v2.5.10 => First 'official' GitHub release
Cleanup script documentation and merged pull request from Waipeng
willemdh committed May 10, 2015
1 parent 5f04e43 commit cfec16f
Showing 8 changed files with 34 additions and 807 deletions.
339 changes: 0 additions & 339 deletions LICENSE

This file was deleted.

339 changes: 0 additions & 339 deletions LICENSE.md

This file was deleted.

Binary file added NetApp/Thumbs.db
Binary file not shown.
135 changes: 18 additions & 117 deletions README.md
@@ -1,129 +1,30 @@
Description:
A health monitoring script for NetApp Data ONTAP Cluster-Mode filers.
# Nagios plugin to check health of a NetApp Ontap cluster

Download:
https://github.com/willemdh/check_netapp_ontapi
### Idea

Known Issues:
- Quota_health not working like it should. Needs more testing..
- Exclude and include issues with quotas. Needs more testing..
This Perl script is able to monitor most components of a NetApp Ontap cluster, such as volume, aggregate,
snapshot, quota, snapmirror, filer hardware, port, interface, cluster and disk health.

Project Status:
Working - Beta
### Status

Patch Notes:
0.8.6.11
- Set max records to 200 and removed space_to_bytes sub from $intUsedToBytes (no magnitude error) in calc_quota_health sub
- Updated script header and documentation
- Added if(defined..) to sub get_volume_space, because volumes in transferring mode for a syncing mirror, were causing errors
- Added if(defined..) to sub calc_space_health, because offline aggregates were reported to cause errors
- Updated code for quota_health, so correct sub is called atm
Production ready. Please visit http://outsideit.net/check-netapp-ontap for more information.

v0.6
- No longer stands in defiance of the laws of mathematics by attempting to divide by 0 when calculating disk health. (Thanks HW)
- No longer attempts to monitor a volume that is being moved or provisioned. (Thanks HW)
- Resolved a number of minor packaging and informational problems present in the 0.5 not-quite-release (Thanks WD)
### How To

v0.5
- *NEW* Quota monitoring (Warning: Completely untested and experimental. )
- *NEW* The new -n parameter allows you to filter the queries to only get information from a specific vhost or cluster node depending on the check, use the --help parameter to get a list of which checks are filtered to which objects.
- check_netapp_ontapi.pl is now compatible with SDK 5.2 and hopefully OnTap 8.2.
- significant changes have been made to snapmirror monitoring so that it now works as intended.
- Physical port monitoring and vhost interface monitoring are now separate checks and can be accessed with check_port and check_interface respectively.
Please visit http://outsideit.net/check-netapp-ontap for more information on how to use this plugin.

v0.4
- Additional sanitization on disk_health check to prevent errors when a disk has no assigned home (Thanks WD).
### Help

v0.3:
- The package has been changed to include the required version of the Netapp SDK (Thanks WD).
In case you find a bug or have a feature request, please make an issue on GitHub.

v0.2:
- All pre-existing checks (volume, snapshot and aggregate) updated for better scalability.
- Added six new check categories, see usage below for full list.
### On Nagios Exchange

v0.1:
- First release
http://exchange.nagios.org/directory/Plugins/Hardware/Storage-Systems/SAN-and-NAS/NetApp/Check-Netapp-Ontap/details

Usage:
1. Extract the contents of check_netapp_ontapi.zip to a temp directory and then navigate to it.
2. Copy the contents of NetApp/* to your /usr/lib/perl5 directory to install the required version of the NetApp Perl SDK.
3. Copy check_netapp_ontapi.pl script to your nagios libexec folder and set the correct permissions
### Copyright

--hostname, -H
Hostname or address of the cluster administrative interface.
--node, -n
Name of a vhost or cluster-node to restrict this query to.
--user, -u
Username of a Netapp Ontapi enabled user.
--password, -p
Password for the netapp Ontapi enabled user.
--option, -o
The name of the option you want to check. See the option and threshold list at the bottom of this help text.
--warning, -w
A custom warning threshold value. See the option and threshold list at the bottom of this help text.
--critical, -c
A custom critical threshold value. See the option and threshold list at the bottom of this help text.
--modifier, -m
This modifier is used to set an inclusive or exclusive filter on what you want to monitor.
--help, -h
Display this help text.
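
The switches listed above could be parsed roughly as follows. This is only a minimal sketch using the core Getopt::Long module, not the plugin's actual code: the helper name parse_args and the sample values (cluster01, monitor, secret) are hypothetical, and only the long and short switch names are taken from the help text. Case-sensitive matching is turned on so that -H (hostname) and -h (help) stay distinct.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Keep -H and -h distinct; Getopt::Long ignores case by default.
Getopt::Long::Configure('no_ignore_case');

# Hypothetical helper: parse the documented switches into a hash reference.
# Keys are the long option names from the help text.
sub parse_args {
    my @argv = @_;
    my %opt;
    GetOptionsFromArray(
        \@argv, \%opt,
        'hostname|H=s',    # cluster administrative interface
        'node|n=s',        # vhost or cluster-node filter
        'user|u=s',
        'password|p=s',
        'option|o=s',      # name of the check, e.g. volume_health
        'warning|w=s',
        'critical|c=s',
        'modifier|m=s',
        'help|h',
    ) or die "Invalid arguments\n";
    return \%opt;
}

# Example invocation with placeholder values.
my $opt = parse_args(qw(-H cluster01 -u monitor -p secret -o volume_health -w 80% -c 90%));
printf "%s on %s\n", $opt->{option}, $opt->{hostname};
```

Thresholds are kept as plain strings here because their meaning depends on the selected option.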

=========================================
Option List
=========================================
volume_health
desc: Check the space and inode health of a vServer volume. If space % and space in *B are both defined the smaller value of the two will be used when deciding if the volume is in a warning or critical state. This allows you to better accommodate large volume monitoring.
thresh: space % used, space in *B (e.g. MB) remaining, inode count remaining, inode % used (Usage example: 80%i), "offline" keyword
node: The node option restricts this check by vserver name.

aggregate_health
desc: Check the space and inode health of a cluster aggregate. If space % and space in *B are both defined the smaller value of the two will be used when deciding if the aggregate is in a warning or critical state. This allows you to better accommodate large aggregate monitoring.
thresh: space % used, space in *B (e.g. MB) remaining, inode count remaining, inode % used (Usage example: 80%i), "offline" keyword, "is-home" keyword
node: The node option restricts this check by cluster-node name.

snapshot_health
desc: Check the space and inode health of a vServer snapshot. If space % and space in *B are both defined the smaller value of the two will be used when deciding if the snapshot is in a warning or critical state. This allows you to better accommodate large snapshot monitoring.
thresh: space % used, space in *B (e.g. MB) remaining, inode count remaining, inode % used (Usage example: 80%i), "offline" keyword
node: The node option restricts this check by vserver name.

quota_health
desc: Check that the space and file thresholds have not been crossed on a quota.
thresh: N/A storage defined.
node: The node option restricts this check by vserver name.

snapmirror_health
desc: Check the lag time and health flag of the snapmirror relationships.
thresh: snapmirror lag time (valid intervals are s, m, h, d).
node: The node option restricts this check by snapmirror destination cluster-node name.

filer_hardware_health
desc: Check the environment hardware health of the filers (fan, psu, temperature, battery).
thresh: component name (fan, psu, temperature, battery). There is no default alert level; thresholds MUST be defined.
node: The node option restricts this check by cluster-node name.

port_health
desc: Checks the state of a physical network port.
thresh: N/A not customizable.
node: The node option restricts this check by cluster-node name.

interface_health
desc: Check that a LIF is in the correctly configured state and that it is on its home node and port. Additionally checks the state of a physical port.
thresh: N/A not customizable.
node: The node option restricts this check by vserver name.

netapp_alarms
desc: Check for Netapp console alarms.
thresh: N/A not customizable.
node: The node option restricts this check by cluster-node name.

cluster_health
desc: Check the cluster disks for failure or other potentially undesirable states.
thresh: N/A not customizable.
node: The node option restricts this check by cluster-node name.

disk_health
desc: Check the health of the disks in the cluster.
thresh: Not customizable yet.
node: The node option restricts this check by cluster-node name.

For keyword thresholds, if you want to ignore alerts for that particular keyword you set it at the same threshold that the alert defaults to.
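
The snapmirror_health entry above accepts lag thresholds with the intervals s, m, h and d. A minimal sketch of how such a threshold might be converted to seconds and compared against a measured lag is shown here. The helper names lag_to_seconds and lag_status are hypothetical, and treating a lag equal to the threshold as an alert is an assumption rather than something the help text states. The return values follow the usual Nagios convention (0 OK, 1 WARNING, 2 CRITICAL).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Seconds per interval suffix, as listed for snapmirror_health.
my %SECONDS_PER_UNIT = (s => 1, m => 60, h => 3600, d => 86400);

# Hypothetical helper: convert "30m", "2h", ... to seconds.
# Returns undef for anything that is not <integer><unit>.
sub lag_to_seconds {
    my ($text) = @_;
    my ($count, $unit) = $text =~ /^(\d+)([smhd])$/
        or return;
    return $count * $SECONDS_PER_UNIT{$unit};
}

# Hypothetical helper: map a measured lag (in seconds) to a Nagios status code.
# Assumption: reaching a threshold exactly already triggers the alert.
sub lag_status {
    my ($lag_seconds, $warning, $critical) = @_;
    my $warn_s = lag_to_seconds($warning);
    my $crit_s = lag_to_seconds($critical);
    die "Invalid lag threshold\n"
        unless defined $warn_s && defined $crit_s;
    return 2 if $lag_seconds >= $crit_s;
    return 1 if $lag_seconds >= $warn_s;
    return 0;
}

# Example: a relationship lagging 45 minutes, checked against -w 30m -c 2h.
my $status = lag_status(45 * 60, '30m', '2h');
print "status code: $status\n";
```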
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later
version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the
implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
details at <http://www.gnu.org/licenses/>.
28 changes: 16 additions & 12 deletions check_netapp_ontap.pl
@@ -1,21 +1,25 @@
#!/usr/bin/perl

# Script name: check_netapp_ontapi.pl
# Version: 0.8.6.6
# Original author: Murphy John
# Current developer: D'Haese Willem
# Purpose: Checks NetApp ontapi clusters for various problems, like volume, aggregate, snapshot,
# quota, snapmirror, filer hardware, port, interface, cluster and disk health, but also NetApp alarms
# On Github: https://github.com/willemdh/check_netapp_ontapi
# To do:
# - Completing quota monitoring
# - Test and integrate quota inclusion and exclusion
# - Integrate performance data
# History:
# Script name: check_netapp_ontap.pl
# Version: v2.5.10
# Original author: Murphy John
# Current author: D'Haese Willem
# Purpose: Checks NetApp ontapi clusters for various problems, like volume, aggregate, snapshot,
# quota, snapmirror, filer hardware, port, interface, cluster and disk health, but also NetApp alarms
# On Github: https://github.com/willemdh/check_netapp_ontap
# On OutsideIT: http://outsideit.net/check-netapp-ontap
# Recent History:
# 05/06/2014 => Set max records to 200 and removed space_to_bytes sub from $intUsedToBytes (no magnitude)
# 06/06/2014 => Updated script header and documentation, further testing with thresholds
# 10/06/2014 => Added if(defined..) to sub get_volume_space, because volumes in transferring mode for a syncing mirror were causing errors
# 11/06/2014 => Merged John's 0.6 script with my fork after accepting the transferred project
# 10/05/2015 => Cleanup script documentation and merged pull request from Waipeng
# Copyright:
# This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published
# by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed
# in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public
# License along with this program. If not, see <http://www.gnu.org/licenses/>.

use warnings;
use strict;
Binary file removed check_netapp_ontap.zip
Binary file not shown.
Binary file added check_netapp_ontap_logical_view_01.PNG
Binary file added check_netapp_ontap_multinode_cluster_01.PNG
