LoGrok reads and parses log files of arbitrary format and lets you run queries against their data. It accepts standard Apache LogFormat strings to describe the format of the log data, and it can process multiple logs at once. LoGrok uses Python's multiprocessing package to spread the work across all of your CPUs, which shortens parse and query times.
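To illustrate the idea of parsing across several processes, here is a minimal sketch using `multiprocessing.Pool`. It is not LoGrok's actual implementation: the regular expression, the helper names (`parse_chunk`, `chunked`, `parse_parallel`) and the field names `host`, `status` and `bytes` are assumptions made up for this example.

```python
import multiprocessing
import re

# Hypothetical pattern covering a small subset of the Apache common log format.
LINE_RE = re.compile(
    r'(?P<host>\S+) \S+ (?P<auth_user>\S+) \[(?P<date_time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+)'
)


def parse_chunk(lines):
    """Parse one chunk of lines into dicts, skipping lines that do not match."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


def chunked(lines, size):
    """Yield consecutive slices of at most `size` lines."""
    for start in range(0, len(lines), size):
        yield lines[start:start + size]


def parse_parallel(lines, processes=2, chunk_size=1000):
    """Parse chunks in worker processes and flatten the results in order."""
    pool = multiprocessing.Pool(processes)
    try:
        results = pool.map(parse_chunk, list(chunked(lines, chunk_size)))
    finally:
        pool.close()
        pool.join()
    return [row for chunk in results for row in chunk]


if __name__ == '__main__':
    sample = [
        '127.0.0.1 - bob_smith [10/Oct/2000:13:55:36 -0700] '
        '"GET /index.html HTTP/1.0" 200 2326',
    ]
    print(parse_parallel(sample))
```

Each worker parses its own chunk independently, so the only shared step is concatenating the per-chunk results at the end.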
Requires Python 2 (2.7 or higher) and the ply module.
sudo python ./setup.py install
./logrok.py [-h] (-t TYPE | -f FORMAT) [-j PROCESSES] [-l LINES] [-i | -c] [-q QUERY] [-d] logfile [logfile ...]
- positional arguments:
- logfile
- optional arguments:
  -h, --help
      show help message and exit
  -t TYPE, --type TYPE
      {syslog, apache-common-vhost, agent, apache-common, referer, ncsa-combined}
      Use built-in log type (default: apache-common)
  -f FORMAT, --format FORMAT
      Log format (use apache LogFormat string) (default: None)
  -j PROCESSES, --processes PROCESSES
      Number of processes to fork for log crunching (default: 12)
  -l LINES, --lines LINES
      Only process LINES lines of input (default: None)
  -i, --interactive
      Use line-based interactive interface (default: False)
  -q QUERY, --query QUERY
      The query to run (default: None)
  -d, --debug
      Turn debugging on (default: False)
- You probably want to run in interactive mode to avoid repeatedly parsing the log(s) at startup
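The `-f FORMAT` option takes an Apache LogFormat string. As a rough sketch of how such a string can be turned into a parser, the example below maps a handful of LogFormat directives to named regular-expression groups. This is an illustration, not LoGrok's own parser (which relies on ply): the function `logformat_to_regex` is hypothetical, only a few directives are covered, and field names other than `auth_user`, `date_time`, `request` and `response_time_us` are invented here.

```python
import re

# Directive -> regex fragment with a named group. Only a few directives are
# covered; field names not used elsewhere in this README are assumptions.
DIRECTIVES = {
    '%h': r'(?P<remote_host>\S+)',
    '%l': r'(?P<remote_logname>\S+)',
    '%u': r'(?P<auth_user>\S+)',
    '%t': r'\[(?P<date_time>[^\]]+)\]',
    '%r': r'(?P<request>[^"]*)',
    '%>s': r'(?P<status>\d{3})',
    '%b': r'(?P<bytes_sent>\d+|-)',
    '%D': r'(?P<response_time_us>\d+)',
}

# Matches directives such as %h or %>s.
TOKEN_RE = re.compile(r'%>?[a-zA-Z]')


def logformat_to_regex(fmt):
    """Translate a (simplified) LogFormat string into a compiled regex."""
    parts = []
    pos = 0
    for match in TOKEN_RE.finditer(fmt):
        # Literal text between directives is matched verbatim.
        parts.append(re.escape(fmt[pos:match.start()]))
        token = match.group(0)
        if token not in DIRECTIVES:
            raise ValueError('unsupported directive: %s' % token)
        parts.append(DIRECTIVES[token])
        pos = match.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile(''.join(parts) + '$')


line_re = logformat_to_regex('%h %l %u %t "%r" %>s %b %D')
row = line_re.match(
    '127.0.0.1 - bob_smith [10/Oct/2000:13:55:36 -0700] '
    '"GET /index.html HTTP/1.0" 200 2326 5000123'
).groupdict()
```

Here the quotes in the format string are written unescaped; in an Apache configuration file they would normally appear as `\"`.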
- show <fields|headers>
- [select] <fieldlist> [from xxx] [where <wherelist>] [group by <fieldlist>] [order by <fieldlist>];
show fields;
    lists available field names
show headers;
    alias for show fields;
help;
    prints a short help
select
    is ignored, but can be passed
fieldlist
    can be any field or fields separated by commas; fields can also be function calls
Aggregate functions will calculate a total value for all rows
avg
    calculates average for specified column
mean
    alias for avg
median
    calculates median value for specified column
mode
    calculates mode for specified column
count
    counts rows
max
    calculates max value in specified column
min
    calculates min value in specified column
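For reference, the aggregate functions listed above can be sketched in plain Python as follows. This only shows what each function computes over a column of values; it is not LoGrok's code, and details such as how ties in `mode` or an even number of rows in `median` are handled are assumptions of this sketch.

```python
from collections import Counter


def avg(values):
    """Arithmetic mean of a column."""
    values = list(values)
    return sum(values) / float(len(values))


mean = avg  # alias, as in the list above


def median(values):
    """Middle value; averages the two middle values for an even row count."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2.0


def mode(values):
    """Most frequent value; ties are not resolved in any guaranteed order."""
    return Counter(values).most_common(1)[0][0]


def count(values):
    """Number of rows."""
    return len(list(values))


# max and min are Python built-ins and need no helper here.
response_time_us = [1200, 3400, 1200, 5000, 800]
summary = {
    'avg': avg(response_time_us),
    'median': median(response_time_us),
    'mode': mode(response_time_us),
    'count': count(response_time_us),
    'max': max(response_time_us),
    'min': min(response_time_us),
}
```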
Value functions will modify one value
div
    divides first parameter by second parameter
year
    returns year from date field
month
    returns month from date field
day
    returns day from date field
hour
    returns hour from date field
minute
    returns minute from date field
second
    returns second from date field
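The value functions can be pictured as simple per-row transformations. The sketch below assumes the date field holds an Apache-style timestamp such as `10/Oct/2000:13:55:36 -0700`; that format, the helper `parse_date`, and the choice of float division in `div` are assumptions of this example rather than documented LoGrok behaviour.

```python
from datetime import datetime


def parse_date(text):
    """Parse an Apache-style timestamp, ignoring the timezone offset.

    Month abbreviations are matched using the current locale (English here).
    """
    stamp = text.split()[0]
    return datetime.strptime(stamp, '%d/%b/%Y:%H:%M:%S')


def div(first, second):
    """Divide the first parameter by the second (float division assumed)."""
    return float(first) / float(second)


def year(text):
    return parse_date(text).year


def month(text):
    return parse_date(text).month


def day(text):
    return parse_date(text).day


def hour(text):
    return parse_date(text).hour


def minute(text):
    return parse_date(text).minute


def second(text):
    return parse_date(text).second


date_time = '10/Oct/2000:13:55:36 -0700'
parts = (year(date_time), month(date_time), day(date_time),
         hour(date_time), minute(date_time), second(date_time))
resp_seconds = div(5000000, 1000000)
```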
- select max(response_time_us), auth_user;
- select date_time, auth_user, request from log where auth_user <> 'bob_smith';
- select hour(date_time) as hr, avg(response_time_us) as resp_time, auth_user from log where auth_user <> 'bob_smith' group by auth_user, hr;
- select date_time, response_time_us where response_time_us > 5000000;
- select div(avg(response_time_us), 1000000) as resp_seconds, auth_user group by auth_user;
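To make the semantics of the grouped example concrete, here is a plain-Python sketch of what a query like `select hour(date_time) as hr, avg(response_time_us) as resp_time, auth_user from log where auth_user <> 'bob_smith' group by auth_user, hr;` computes. The sample rows and the function `avg_response_by_user_and_hour` are invented for illustration, and the output layout LoGrok actually prints is not shown here.

```python
from collections import defaultdict
from datetime import datetime

# Made-up rows standing in for parsed log lines.
rows = [
    {'date_time': datetime(2000, 10, 10, 13, 55, 36),
     'auth_user': 'alice', 'response_time_us': 1000000},
    {'date_time': datetime(2000, 10, 10, 13, 58, 2),
     'auth_user': 'alice', 'response_time_us': 3000000},
    {'date_time': datetime(2000, 10, 10, 14, 1, 10),
     'auth_user': 'alice', 'response_time_us': 500000},
    {'date_time': datetime(2000, 10, 10, 13, 59, 0),
     'auth_user': 'bob_smith', 'response_time_us': 9000000},
]


def avg_response_by_user_and_hour(rows, excluded_user):
    """Filter out one user, group by (auth_user, hour), average each group."""
    groups = defaultdict(list)
    for row in rows:
        if row['auth_user'] != excluded_user:        # where auth_user <> ...
            key = (row['auth_user'], row['date_time'].hour)  # group by
            groups[key].append(row['response_time_us'])
    return dict(
        (key, sum(times) / float(len(times))) for key, times in groups.items()
    )


result = avg_response_by_user_and_hour(rows, 'bob_smith')
```

The `where` clause is applied before grouping, so the excluded user's rows never reach any group.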
LoGrok is licensed under the MIT license. [1]
[1] See the file LICENSE