Ostrichpoll is a tiny utility for monitoring Twitter Ostrich services. Effectively it can monitor any service which exposes internal metrics in JSON over HTTP.
Features:
- not a daemon: Ostrichpoll is intended to used by
monit
or a similar tool as part of a health check for a proper service - validate multiple endpoints in a single execution
- specify normal ranges for metrics and gauges through YAML configuration
- support for "rate" measurements, specifying acceptable change per second for a metric. (useful for counters and other monotonic metrics)
$ gem install ostrichpoll
Ostrichpoll may be executed with a simple ostrichpoll
which will ensure a 200 response
is given from 127.0.0.1:9900/stats.json
.
You may change the endpoint:
ostrichpoll -u devbox:9999/stats.json
The interesting part of Ostrichpoll is when you give a configuration file that specifies normal ranges for individual metrics:
ostrichpoll -c pollconfig.yml
Use -d
to enable extra logging for debugging.
Configuration is specified in the format:
---
- url: 127.0.0.1:9900/stats.txt?period=10
rate_file: /tmp/parmesan-http-ostrich-rate.yml
validations:
counters/KafkaEventSink_messageDropped:
rate: true
normal_range: [0, 5]
missing: ignore
exit_code: 2
exit_message: "Message drop rate is too high"
metrics/KafkaEventSink_append_msec/p99:
normal_range: [0, 10]
missing: error
exit_code: 3
- url: 127.0.0.1:9901/stats.txt
...
url
- an endpoint which exposes JSON stats. If this endpoint is not available, Ostrichpoll will exit with-1
(really255
).rate_file
- where to store rate measurements between executions of Ostrichpoll.validations
- a list of validations to execute with the following options:- the key is the name of the metric. Nested metrics are supported through the
/
notation. rate
when set to true, compute the change per second since this value was last seen. You must specify arate_file
for this to work. Obviously, on the first execution of Ostrichpoll this validation will be ignored.normal_range
an array specifying the minimum and maximum (inclusive) values which are acceptable for this metricmissing
behavior if the metric is not seen in the output at all. (error by default)exit_code
what value Ostrichpoll should exit with if this error is seen.exit_message
what Ostrichpoll should output to STDOUT if this error is seen. "OK" is the default (if no errors are encountered).
- the key is the name of the metric. Nested metrics are supported through the
Ostrichpoll is setup to execute all validations on each execution. Even if one of the early validations fails, the output from all validations is logged to stderr. However, the exit code is the exit code from the first erroring validation.
Copyright 2012, Tumblr, Inc.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.