Monitoring class for process data collection #5747

Merged
185 changes: 185 additions & 0 deletions deps/wazuh_testing/wazuh_testing/process_resource_monitoring/README.md
@@ -0,0 +1,185 @@
# Monitoring class for process data collection

This package contains two public modules designed for monitoring and tracking purposes:

## Monitor module

The `monitor` module provides a class for monitoring processes and their child processes. Monitoring is conducted concurrently, with a separate thread for each process. The following data is collected for each scan:

- **Daemon**: daemon name.
- **Version**: version of the binaries being monitored.
- **Timestamp**: timestamp of the scan.
- **PID**: PID of the process.
- **CPU**(%): CPU percentage of the process. It can exceed 100% if the process uses multiple threads.
- **VMS**: Virtual Memory Size.
- **RSS**: Resident Set Size.
- **USS**: Unique Set Size.
- **PSS**: Proportional Set Size.
- **SWAP**: Memory of the process in the swap space.
- **FD**: File descriptors opened by the process.
- **Read_Ops**: Read operations.
- **Write_Ops**: Write operations.
- **Disk_Read**: Bytes read by the process.
- **Disk_Written**: Bytes written by the process.
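
As a rough illustration of the one-thread-per-process design described above, the following sketch starts one sampling thread per daemon name and appends one row per scan. It is not the package's `Monitor` class: `scan_loop`, `monitor_all` and the `sample` callable are hypothetical names, and a real sampler would presumably read the metrics through `psutil`, which the package declares as a dependency.

```python
import threading
import time
from datetime import datetime


def scan_loop(daemon, sample, interval, stop, rows, lock):
    """Append one row per scan until `stop` is set (hypothetical helper)."""
    while not stop.is_set():
        row = {"Daemon": daemon, "Timestamp": datetime.now().isoformat()}
        row.update(sample(daemon))  # e.g. {"PID": ..., "RSS": ...}
        with lock:
            rows.append(row)
        stop.wait(interval)  # sleeps, but returns early once `stop` is set


def monitor_all(daemons, sample, interval=1.0, duration=3.0):
    """Run one sampling thread per daemon for `duration` seconds."""
    stop, lock, rows = threading.Event(), threading.Lock(), []
    threads = [
        threading.Thread(
            target=scan_loop, args=(name, sample, interval, stop, rows, lock)
        )
        for name in daemons
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
    return rows
```

Passing the sampler as a callable keeps the threading logic independent of how each metric is actually read.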

## Disk Usage Tracker module

The `disk_usage_tracker` module provides a class to monitor file and directory disk usage. The following data is collected for each file or directory:

- **File:** Name of the file.
- **Timestamp:** Timestamp of the scan.
- **Path:** Full path of the file.
- **Size:** Size of the file in the selected units.
- **Usage:** Percentage of the space the file takes relative to the partition's size.
- **Modification_time:** Last time the file was modified.
- **Access_time:** Last time the file was accessed.
- **Creation_time:** Creation time (Windows) or metadata change time (Unix).
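
The fields above can be approximated with the standard library alone. The sketch below is only an illustration and not the `DiskUsageTracker` implementation: `disk_usage_row` and `UNIT_FACTORS` are hypothetical names, and the use of binary multiples (1 KB = 1024 bytes) is an assumption.

```python
import os
import shutil
from datetime import datetime

# Assumption: binary multiples; the package may define its units differently.
UNIT_FACTORS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def disk_usage_row(path, units="GB"):
    """Return one scan of `path` using the field names listed above."""
    path = os.path.abspath(path)
    st = os.stat(path)
    # For a directory, st_size is the size of the directory entry itself,
    # not the total size of its contents.
    location = path if os.path.isdir(path) else os.path.dirname(path)
    partition_total = shutil.disk_usage(location).total

    def as_iso(timestamp):
        return datetime.fromtimestamp(timestamp).isoformat()

    return {
        "File": os.path.basename(path),
        "Timestamp": datetime.now().isoformat(),
        "Path": path,
        "Size": st.st_size / UNIT_FACTORS[units],
        "Usage": 100 * st.st_size / partition_total,
        "Modification_time": as_iso(st.st_mtime),
        "Access_time": as_iso(st.st_atime),
        # Creation time on Windows, metadata change time on Unix.
        "Creation_time": as_iso(st.st_ctime),
    }
```

For example, `disk_usage_row("/var/ossec/logs/archives/archives.json", "MB")` would return a single dictionary that could be written as one CSV line.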

The package also provides the scripts `wazuh-disk-metrics` and `wazuh-process-metrics`, designed to interact with these modules. See [Scripts](#scripts) for more information.

## Directory structure

```shell script
process_resource_monitoring/
├── pyproject.toml
├── README.md
└── src
    ├── process_resource_monitoring
    │   ├── disk_usage_tracker.py
    │   ├── __init__.py
    │   ├── _logger.py
    │   └── monitor.py
    └── wazuh_metrics.py
```


## Prerequisites

- Wazuh component(s) 4.9.0 (or greater)
- Python 3.7 (or greater)
- Python-pip (pip)


## Package installation

To use the monitoring classes from other Python scripts, installing the package is highly recommended. Follow these steps:

```shell script
# Create a virtual environment
python3 -m venv virtualenv
source virtualenv/bin/activate

# Install the package using pip
python3 -m pip install .

# Verify the correct installation
python3 -m pip list | grep process_resource_monitoring
```

> Note:
> Using a virtual environment is optional, but recommended to avoid polluting the global Python environment.
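
The installation can also be checked from Python. The sketch below is an optional alternative to the `pip list` check; `installed_version` is a hypothetical helper, and it relies on `importlib.metadata`, which is part of the standard library only from Python 3.8 onwards.

```python
from importlib import metadata  # standard library from Python 3.8


def installed_version(distribution="process_resource_monitoring"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print(installed_version())
```

If the package was installed in the active environment, this should print the version declared in `pyproject.toml`; otherwise it prints `None`.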


## Scripts

### wazuh-process-metrics

This script takes as positional arguments the names of the processes to be monitored.

```shell script
wazuh-process-metrics [options] <process_name> [<process_name> ...]
```

#### Parameters

| Parameter | Description | Type | Default |
| --------- | ----------- | ---- | ------- |
| `<process_name_list>` | `Name of process/processes to monitor separated by whitespace.` | `str` | Required |
| `-s`, `--sleep` | `Time in seconds between each entry.` | `float` | `1.0` |
| `-u`, `--units` | `Unit for the process bytes-related values.` | `str` | `KB` |
| `-v`, `--version` | `Version of the binaries.` | `str` | `None` |
| `-d`, `--debug` | `Enable debug level logging.` | `store_true` | `False` |
| `-H`, `--healthcheck-time` | `Time in seconds between each health check.` | `int` | `10` |
| `-r`, `--retries` | `Number of reconnection retries before aborting the monitoring process.` | `int` | `10` |
| `--store-process` | `Path to store the CSVs with the process resource usage data.` | `str` | `` |
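
To make the table concrete, the following `argparse` sketch mirrors the documented options. It is an approximation written from the table, not the script's actual parser: `build_parser` and the `process_list` destination are hypothetical names, and the `None` default for `--store-process` is an assumption because the table leaves that default empty.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options documented in the table."""
    parser = argparse.ArgumentParser(prog="wazuh-process-metrics")
    parser.add_argument("process_list", nargs="+", type=str,
                        help="Names of the processes to monitor.")
    parser.add_argument("-s", "--sleep", type=float, default=1.0,
                        help="Time in seconds between each entry.")
    parser.add_argument("-u", "--units", type=str, default="KB",
                        help="Unit for the process bytes-related values.")
    parser.add_argument("-v", "--version", type=str, default=None,
                        help="Version of the binaries.")
    parser.add_argument("-d", "--debug", action="store_true",
                        help="Enable debug level logging.")
    parser.add_argument("-H", "--healthcheck-time", type=int, default=10,
                        help="Time in seconds between each health check.")
    parser.add_argument("-r", "--retries", type=int, default=10,
                        help="Reconnection retries before aborting.")
    # Assumption: no path means "do not store"; the real default is not shown.
    parser.add_argument("--store-process", type=str, default=None,
                        help="Path to store the CSVs with the usage data.")
    return parser


args = build_parser().parse_args(["apid", "clusterd", "-s", "5", "-u", "MB"])
```

Options that are not passed keep the defaults from the table, so `args.retries` and `args.healthcheck_time` are both `10` here.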


#### Usage examples

```shell script
# Minimum arguments: names of the processes to monitor (see the Process reference section)
wazuh-process-metrics authd analysisd

# Monitor the API, cluster, mail and logcollector processes every 5 seconds, reporting memory values in MB
wazuh-process-metrics apid clusterd maild logcollector -s 5 -u MB
```

### Process reference

#### Wazuh manager

| Process | Argument |
| ------- | -------- |
| wazuh-agentlessd | agentlessd |
| wazuh-analysisd | analysisd |
| wazuh_apid.py | apid |
| wazuh-authd | authd |
| wazuh_clusterd.py | clusterd |
| wazuh-csyslogd | csyslogd |
| wazuh-db | db |
| wazuh-dbd | dbd |
| wazuh-execd | execd |
| wazuh-integratord | integratord |
| wazuh-logcollector | logcollector |
| wazuh-maild | maild |
| wazuh-modulesd | modulesd |
| wazuh-monitord | monitord |
| wazuh-remoted | remoted |
| wazuh-syscheckd | syscheckd |

> Note:
> `wazuh_apid.py` and `wazuh_clusterd.py` are scripts run by the Python interpreter, not processes themselves.

#### Wazuh agent

| Process | Argument |
| ------- | -------- |
| wazuh-agentd | agentd |
| wazuh-execd | execd |
| wazuh-logcollector | logcollector |
| wazuh-modulesd | modulesd |
| wazuh-syscheckd | syscheckd |
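
The two tables follow a simple pattern: every argument maps to `wazuh-<argument>`, except for the two Python scripts on the manager. The sketch below encodes only that pattern; `process_name` and `SCRIPT_NAMES` are hypothetical names, and how the script resolves names internally is not shown in this document.

```python
# The two Python-script exceptions from the manager table above.
SCRIPT_NAMES = {"apid": "wazuh_apid.py", "clusterd": "wazuh_clusterd.py"}


def process_name(argument):
    """Map a command-line argument to the process name in the tables above."""
    return SCRIPT_NAMES.get(argument, f"wazuh-{argument}")
```

With this mapping, `process_name("analysisd")` gives `wazuh-analysisd` and `process_name("apid")` gives `wazuh_apid.py`.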


### wazuh-disk-metrics

This script takes as positional arguments the names of the files or directories to be monitored.

```shell script
wazuh-disk-metrics [options] <file_path> [<file_path> ...]
```


#### Parameters

| Parameter | Description | Type | Default |
| --------- | ----------- | ---- | ------- |
| `<file_name_list>` | `Name of files/directories to monitor separated by whitespace.` | `str` | Required |
| `-s`, `--sleep` | `Time in seconds between each entry.` | `float` | `1.0` |
| `-u`, `--units` | `Unit for the disk usage related values.` | `str` | `GB` |
| `-v`, `--version` | `Version of the binaries.` | `str` | `None` |
| `-d`, `--debug` | `Enable debug level logging.` | `store_true` | `False` |
| `-H`, `--healthcheck-time` | `Time in seconds between each health check.` | `int` | `10` |
| `-r`, `--retries` | `Number of reconnection retries before aborting the monitoring process.` | `int` | `10` |
| `--store-disk` | `Path to store the CSVs with the disk usage data.` | `str` | `` |


#### Usage examples

```shell script
# Minimum arguments: paths of the files or directories to monitor
wazuh-disk-metrics /var/ossec/logs/archives/archives.json

# Monitor the archives file every 5 seconds, reporting values in MB
wazuh-disk-metrics /var/ossec/logs/archives/archives.json -s 5 -u MB
```
@@ -0,0 +1,35 @@
[project]
name = "process_resource_monitoring"
version = "0.0.1"
authors = [
{ name = "Wazuh", email = "[email protected]" }
]
description = "Monitoring resources and disk space usage of Wazuh components"
readme = "README.md"
requires-python = ">=3.7"
classifiers = [
"Development Status :: 2 - Pre-Alpha",

"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.7",

"License :: OSI Approved :: GNU General Public License v2 (GPLv2)",
"Operating System :: POSIX :: Linux"
]
keywords = ["wazuh", "process", "monitoring", "resource", "disk", "partition",
"usage"]

dependencies = [
"psutil>=6.0.0"
]


[project.urls]
Homepage = "https://wazuh.com/"
Documentation = "https://documentation.wazuh.com/"
Repository = "https://github.com/wazuh/"


[project.scripts]
wazuh-disk-metrics = "wazuh_disk_metrics:main"
wazuh-process-metrics = "wazuh_process_metrics:main"
@@ -0,0 +1,21 @@
# Copyright (C) 2015-2024, Wazuh Inc.
# Created by Wazuh, Inc. <[email protected]>.
# This program is free software; you can redistribute it and/or modify it under the terms of GPLv2

"""Process resource usage monitoring tool.

This package contains the following modules:

monitor -- Monitor a process and its child processes, using one instance
(and one thread) per process. Tracks several resource usage metrics.

disk_usage_tracker -- Track disk usage of files/directories over time. Show
absolute and relative (to the partition) usage.

"""


from process_resource_monitoring.disk_usage_tracker import DiskUsageTracker
from process_resource_monitoring.monitor import Monitor

__all__ = ['Monitor', 'DiskUsageTracker']
@@ -0,0 +1,12 @@
# Copyright (C) 2015-2024, Wazuh Inc.
# Created by Wazuh, Inc. <[email protected]>.
# This program is free software; you can redistribute it and/or modify it under the terms of GPLv2

"""Logger instance shared by all the modules to track workflow."""

import logging

logger = logging.getLogger('wazuh-monitor')

logger.setLevel(logging.INFO)