Monitoring class for process data collection #5747



@jseg380 jseg380 commented Sep 18, 2024

Description

This PR implements the Monitoring class for process data collection and adds the documentation for its use.

Related to wazuh/wazuh#24683

@jseg380 jseg380 self-assigned this Sep 18, 2024

@rafabailon rafabailon left a comment


Great job! I've tested the code and it seems to work correctly. I've left a few comments, mostly about style.


@Rebits Rebits left a comment


Great Job!

Some minor changes are requested.


Remember that unit tests are mandatory for new development.


Rebits commented Sep 27, 2024

During the final review, several issues were detected with the current approach:

Unhandled exceptions

A subprocess is killed during monitoring

While monitoring a multiprocess program, the tool fails when one of its subprocesses is killed:

Traceback (most recent call last):
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1717, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_common.py", line 508, in wrapper
    raise raise_from(err, None)
  File "<string>", line 3, in raise_from
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_common.py", line 506, in wrapper
    return fun(self)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1780, in _parse_stat_file
    data = bcat("%s/%s/stat" % (self._procfs_path, self.pid))
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_common.py", line 851, in bcat
    return cat(fname, fallback=fallback, _open=open_binary)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_common.py", line 839, in cat
    with _open(fname) as f:
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_common.py", line 799, in open_binary
    return open(fname, "rb", buffering=FILE_READ_BUFFER_SIZE)
FileNotFoundError: [Errno 2] No such file or directory: '/proc/42457/stat'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/__init__.py", line 355, in _init
    self.create_time()
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/__init__.py", line 757, in create_time
    self._create_time = self._proc.create_time()
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1717, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1948, in create_time
    ctime = float(self._parse_stat_file()['create_time'])
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1726, in wrapper
    raise NoSuchProcess(self.pid, self._name)
psutil.NoSuchProcess: process no longer exists (pid=42457)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/process_resource_monitoring/monitor.py", line 244, in set_process
    self._proc = psutil.Process(self._pid)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/__init__.py", line 319, in __init__
    self._init(pid)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/__init__.py", line 368, in _init
    raise NoSuchProcess(pid, msg=msg)
psutil.NoSuchProcess: process PID not found (pid=42457)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/rebits/Wazuh/monitor/bin/wazuh-process-metrics", line 8, in <module>
    sys.exit(main())
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/wazuh_process_metrics.py", line 309, in main
    monitors_healthcheck(options, process_list)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/wazuh_process_metrics.py", line 221, in monitors_healthcheck
    if check_monitors_health(options):
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/wazuh_process_metrics.py", line 193, in check_monitors_health
    monitor = Monitor(
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/process_resource_monitoring/monitor.py", line 100, in __init__
    self.set_process()
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/process_resource_monitoring/monitor.py", line 246, in set_process
    raise ValueError(f'The process {self._process_name} is not running.') from err
ValueError: The process wazuh_apid.py_child_1 is not running.

The tool should be capable of handling this case.
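
One possible way to tolerate a child that disappears mid-run is sketched below. It assumes metrics are sampled per PID with psutil; sample_alive is a hypothetical helper used only for illustration and is not part of the PR. The idea is to catch psutil.NoSuchProcess around both the attach and the read, and to skip that PID instead of aborting the whole run.

```python
from typing import Dict, List

import psutil


def sample_alive(pids: List[int]) -> Dict[int, int]:
    """Return the RSS in bytes for each PID that is still alive; skip the rest.

    Hypothetical helper: the name and the choice of metric are assumptions.
    """
    samples: Dict[int, int] = {}
    for pid in pids:
        try:
            proc = psutil.Process(pid)
            samples[pid] = proc.memory_info().rss
        except psutil.NoSuchProcess:
            # The process exited (or was killed) between discovery and
            # sampling, so it is skipped and the others are still monitored.
            continue
    return samples
```

With this shape, a killed worker simply stops producing samples while the parent and the remaining children keep being recorded.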

The presence of a zombie process makes the tool fail

If a zombie process is present in the system, the whole monitoring run fails:

Traceback (most recent call last):
  File "/home/rebits/Wazuh/monitor/bin/wazuh-process-metrics", line 8, in <module>
    sys.exit(main())
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/wazuh_process_metrics.py", line 288, in main
    process_pids = Monitor.get_process_pids(process)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/process_resource_monitoring/monitor.py", line 118, in get_process_pids
    if any(filter(lambda x: f'{process_name}' in x, proc.cmdline())):
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/__init__.py", line 724, in cmdline
    return self._proc.cmdline()
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1717, in wrapper
    return fun(self, *args, **kwargs)
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1856, in cmdline
    self._raise_if_zombie()
  File "/home/rebits/Wazuh/monitor/lib/python3.10/site-packages/psutil/_pslinux.py", line 1761, in _raise_if_zombie
    raise ZombieProcess(self.pid, self._name, self._ppid)
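
A hedged sketch of a PID scan that survives zombies follows. It mirrors the command-line match seen in the traceback, but find_pids is a hypothetical stand-in for get_process_pids and not the PR's code. Processes whose command line cannot be read (zombie, already gone, or access denied) are skipped.

```python
from typing import List

import psutil


def find_pids(name_fragment: str) -> List[int]:
    """Return the PIDs whose command line contains name_fragment.

    Hypothetical helper written for illustration only.
    """
    pids: List[int] = []
    for proc in psutil.process_iter():
        try:
            cmdline = proc.cmdline()
        except (psutil.ZombieProcess, psutil.NoSuchProcess, psutil.AccessDenied):
            # Zombies, vanished processes and unreadable processes cannot be
            # inspected, so they are ignored rather than failing the scan.
            continue
        if any(name_fragment in arg for arg in cmdline):
            pids.append(proc.pid)
    return pids
```

psutil.ZombieProcess is a subclass of psutil.NoSuchProcess, so listing both is redundant but makes the intent explicit.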

Logging

Unlike the CSV data, the tool's log is not separated per run: every run of the tool writes to the same log file. The log should be stored in the same directory as that run's CSV data.
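
As an illustration, a per-run log could be set up roughly like this with the standard logging module. The function name setup_run_logger, the logger name prefix and the monitor.log file name are assumptions made for the sketch, not names from the PR.

```python
import logging
from pathlib import Path


def setup_run_logger(run_dir: Path) -> logging.Logger:
    """Create a logger that writes to <run_dir>/monitor.log for this run only.

    Hypothetical helper: file and logger names are illustrative.
    """
    run_dir.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(f"process_metrics.{run_dir.name}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        # Attach the file handler once so repeated calls do not duplicate lines.
        handler = logging.FileHandler(run_dir / "monitor.log")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Passing the same directory that receives the CSV files keeps each run's log next to its data.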

Reports

  • Reports are stored under directories named with a hard-to-read date and time format (a DD-MM-YYYY date followed by a Unix epoch timestamp)
  • Several CSV files are created for the same process. Even the child processes should be included in the same CSV; otherwise, the data is very hard to handle
➜  Desktop tree -l  /tmp/process_metrics
/tmp/process_metrics
├── 27-09-2024
│   ├── 1727423503
│   │   ├── wazuh_apid_29459.csv
│   │   ├── wazuh_apid_39133.csv
│   │   ├── wazuh_apid_39134.csv
│   │   ├── wazuh_apid_39135.csv
│   │   ├── wazuh_apid_39136.csv
│   │   ├── wazuh_apid_39137.csv
│   │   ├── wazuh_apid_39242.csv
│   │   ├── wazuh_apid_39243.csv
│   │   ├── wazuh_apid_child_1.csv
│   │   ├── wazuh_apid_child_2.csv
│   │   ├── wazuh_apid_child_3.csv
│   │   ├── wazuh_apid_child_4.csv
│   │   ├── wazuh_apid_child_5.csv
│   │   └── wazuh_apid.csv
│   └── 1727425589
│       ├── wazuh_apid_29459.csv
│       ├── wazuh_apid_42457.csv
│       ├── wazuh_apid_42458.csv
│       ├── wazuh_apid_42459.csv
│       ├── wazuh_apid_42460.csv
│       ├── wazuh_apid_42461.csv
│       ├── wazuh_apid_42488.csv
│       ├── wazuh_apid_42489.csv
│       ├── wazuh_apid_child_1.csv
│       ├── wazuh_apid_child_2.csv
│       ├── wazuh_apid_child_3.csv
│       ├── wazuh_apid_child_4.csv
│       ├── wazuh_apid_child_5.csv
│       └── wazuh_apid.csv
└── wazuh-resource-metrics.log
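
A rough sketch of the layout suggested above: one sortable, human-readable directory per run and a single CSV per process name, with the PID stored as a column so that parent and children share a file. The function names and the column set (timestamp, pid, cpu_percent, rss_bytes) are assumptions for illustration, not the tool's actual schema.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path
from typing import Iterable, Tuple

# Hypothetical column set, chosen only for this example.
HEADER = ["timestamp", "pid", "cpu_percent", "rss_bytes"]


def run_directory(base: Path, started: datetime) -> Path:
    """Build a run directory named with a UTC ISO-like timestamp."""
    stamp = started.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ")
    return base / stamp


def append_samples(
    csv_path: Path, rows: Iterable[Tuple[str, int, float, int]]
) -> None:
    """Append rows to one shared CSV, writing the header on first use."""
    new_file = not csv_path.exists()
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    with csv_path.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if new_file:
            writer.writerow(HEADER)
        writer.writerows(rows)
```

For instance, the run stored above under 27-09-2024/1727423503 would map to a directory named 2024-09-27T07-51-43Z, and all wazuh_apid samples would land in a single wazuh_apid.csv that can be filtered by the pid column.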

@Rebits Rebits merged commit 18dd570 into enhacement/24586-benchmark-tests Sep 27, 2024
@Rebits Rebits deleted the enhancement/24683-process-resource-monitoring branch September 27, 2024 09:36

Successfully merging this pull request may close these issues.

Benchmarking tests: Monitoring class for process data collection
3 participants