-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME
87 lines (70 loc) · 5.9 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
-----------------------------------------------
SERVER UPTIME MONITOR
-----------------------------------------------
This is a standalone server uptime monitor, designed and coded by Nicole De Luca (c/o '21).
It was created to allow for modular testing of server uptime with SSH and HTTP ports (22 and 80)
to monitor server status.
SYNOPSIS:
The program works by looping through all desired servers - hostnames or IPs - and testing their
response to a netcat connection. If it responds with an affirmative, the date and time is logged
in a log file (in /logs) sorted by server. If the connection times out after 5
seconds when attempting a netcat connection, it logs the server as down and sends an email to a
specified email address notifying that the server is down with the same date/time information.
FILES:
exec.sh: runs a for loop through each of the config files to run the correct testing script based
on their ports (i.e. SSH vs HTTP). If you want to test a different port, you can copy/paste the for
loop and just create a new config file with a new testing script.
servup-*.cfg: CFG files that contain the list of servers (as said before, hostnames or IPs) that are
being tested. Feel free to edit these and add or remove whatever servers you want to test.
test_*.sh: these are the scripts that are run by ./exec.sh that take in a server name as an
argument and then test netcat connection. In them,
they spawn a netcat connection and use expect to respond to the output. From this output, the
scripts run another script to parse to a log file (and email if necessary). If you're creating a
new one of these files to test for a different port, just change the port number in the line with
spawn netcat (in the default files, these ports are either 22 for SSH or 80 for HTTP) as well as
the lines that run the succeed.sh and timed_out.sh scripts, as the port number is a required
argument for those scripts if you want the port number to be in the logs.
succeeded.sh: takes in an argument for the server that needs to be logged and port that was tested.
Outputs to a log file with parsed date and time (seen in the variable declarations at the top) that
the server and port passed in were succesfully connected to (i.e. server is up).
timed_out.sh: takes in the same arguments as the previous script but outputs to a log that the
connection to the server timed out (i.e. that the server is down). It also sends an email alerting
to this fact as specified in the synopsis. You can change the email address in the mail line to
whatever email address you want this mail to go to.
logs/parse.sh: enters each of the log folders inside logs/ and, for every month in the year, tests
to see if there are logs present for the current year in that month, and if so, iterates through
each of the log files from that month, creating a new file that contains the percentage of time
that the server was up that month for each day. This cuts down the amount of space required to
store a year's worth of logs from 1 file a day with one line every 10 minutes (or however long
you have the cronjob set for) for every server to just one file a month with 28-31 lines per
server. The parsing script is meant to be run once a month on the first of the month. The ability
to parse through every month was just so that you can implement this script on your logs even if
you've been collecting logs for a long time and still want to parse them; this script will be able
to go through all your prior logs and parse them (at least, for the current or priot year. Minor
modifications might be necessary for prior years, but it should be self explanatory). This script
DOES NOT modify logs; only parse them and create new files.
logs/rotate.sh: enters each folder and creates a tar file containing all of the logs from the prior
month (specifically yesterday's month, as seen in the script), moves that tar to a subfolder called
parsed_logs/, and then deletes all those logs. This script is meant to be the companion script to
the above parse.sh; running parse.sh consolidates all the data from those logs to a smaller size,
and then rotate.sh actually compresses them and funnels the compressed logs away in case they are
needed at a later date. This also facilitates easier deletion of the logs in the future. This
script is also meant to be ran once a month on the first of the month. Running it without command
line arguments uses yesterday's month as the month for rotating (i.e. it tars and deletes last
month's logs). However, you can run it with command line arguments to choose the months for which
you want to rotate data. For example, running it with the arguments "March September December"
will check each log folder for logs that were made in the current year (or last year if running
in January) in March, September, or December. If they exist, it rotates them as described above.
If they do not exist, it does nothing. This facilitates easy rotation of many months of logs even
if certain servers were not added at the same time and may not have similar amounts and dates of
logged uptime.
USAGE:
If there is not already one running, start a cron job to run ./exec.sh every 10 minutes or so.
I don't believe anything more often than 5 minutes is necessary. You can run the script manually if
you want a realtime update of the server status. Cronjobs should also be added for logs/parse.sh
and logs/rotate.sh for the first of every month; I recommend 12:05 AM for parse.sh (so that it goes in between logging times, if you have the exec.sh cronjob set to run every 10 minutes) and 12:15 AM
for rotate.sh so that it won't be running at the same time as anything else (though, you could have
it run any time on the first, since it uses yesterday's month and year as the rotation data).
If you've just cloned this, edit all the files that have _EDITME appended to them. It should be
obvious what you'll need to change; read this file again if you can't figure out what any of the
files are supposed to be doing.