Discover bots based on the entry points #2413

Merged
merged 14 commits into from
Nov 14, 2023
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
*.profile
.vscode/
.profile
intelmq.egg-info
*.egg-info
build
dist
*.old
Expand Down
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
### Core
- `intelmq.lib.message`: For invalid message keys, add a hint on the failure to the exception: not allowed by configuration or not matching regular expression (PR#2398 by Sebastian Wagner).
- `intelmq.lib.exceptions.InvalidKey`: Add optional parameter `additional_text` (PR#2398 by Sebastian Wagner).
- Change the way we discover bots to allow easy extending based on the entry point name. (PR#2413 by Kamil Mankowski)
- `intelmq.lib.mixins`: Add a new class, `StompMixin` (defined in a new submodule: `stomp`),
which provides certain common STOMP-bot-specific operations, factored out from
`intelmq.bots.collectors.stomp.collector` and `intelmq.bots.outputs.stomp.output`
Expand Down Expand Up @@ -68,6 +69,7 @@

### Documentation
- Add a readthedocs configuration file to fix the build fail (PR#2403 by Sebastian Wagner).
- Add a guide on developing extensions packages (PR#2413 by Kamil Mankowski)
- Update/fix/improve the stuff related to the STOMP bots and integration with the *n6*'s
Stream API (PR#2408 by Jan Kaliszewski).
- Complete documentation overhaul. Change to markdown format. Uses the mkdocs-material (PR#2419 by Filip Pokorný).
Expand Down
Empty file.
Empty file.
Empty file.
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
"""
SPDX-FileCopyrightText: 2023 CERT.at GmbH <https://cert.at/>
SPDX-License-Identifier: AGPL-3.0-or-later
"""

# Use your package as usual
from mybots.lib import common

from intelmq.lib.bot import CollectorBot


class ExampleAdditionalCollectorBot(CollectorBot):
    """
    This is an example bot provided by an extension package
    """

    def process(self):
        report = self.new_report()
        if self.raw:  # noqa: Set as parameter
            report['raw'] = common.return_value('example')
        self.send_message(report)


BOT = ExampleAdditionalCollectorBot
Empty file.
9 changes: 9 additions & 0 deletions contrib/example-extension-package/mybots/lib/common.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
"""

SPDX-FileCopyrightText: 2023 CERT.at GmbH <https://cert.at/>
SPDX-License-Identifier: AGPL-3.0-or-later
"""


def return_value(value):
    return value
44 changes: 44 additions & 0 deletions contrib/example-extension-package/setup.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
"""Example IntelMQ extension package

SPDX-FileCopyrightText: 2023 CERT.at GmbH <https://cert.at/>
SPDX-License-Identifier: AGPL-3.0-or-later
"""

from pathlib import Path
from setuptools import find_packages, setup


# Instead of the bot-autodiscovery below, you can also just manually declare entrypoints
# (regardless of packaging solution, even in pyproject.toml etc.), e.g.:
#
# 'intelmq.bots.collectors.custom.collector = mybots.bots.collectors.custom.collector:BOT.run'
#
# Important is:
# - entry point has to start with `intelmq.bots.{type}` (type: collectors, experts, parsers, outputs)
# - target has to end with `:BOT.run`
# - entry points have to be in `console_scripts` group


BOTS = []

base_path = Path(__file__).parent / 'mybots/bots'
botfiles = [botfile for botfile in Path(base_path).glob('**/*.py') if botfile.is_file() and not botfile.name.startswith('_')]
for file in botfiles:
    file = Path(str(file).replace(str(base_path), 'intelmq/bots'))
    entry_point = '.'.join(file.with_suffix('').parts)
    file = Path(str(file).replace('intelmq/bots', 'mybots/bots'))
    module = '.'.join(file.with_suffix('').parts)
    BOTS.append('{0} = {1}:BOT.run'.format(entry_point, module))

setup(
    name='intelmq-example-extension',
    version='1.0.0',  # noqa: F821
    maintainer='Your Name',
    maintainer_email='[email protected]',
    packages=find_packages(),
    license='AGPLv3',
    description='This is an example package to demonstrate how one can extend IntelMQ.',
    entry_points={
        'console_scripts': BOTS
    },
)
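The autodiscovery loop in this `setup.py` turns each bot file path into an entry point declaration. The mapping can be sketched on its own, without touching the file system. The helper name `entry_point_for` and its `base` parameter are hypothetical and only serve this illustration; they are not part of the example package.

```python
from pathlib import PurePosixPath


def entry_point_for(botfile: str, base: str = "mybots/bots") -> str:
    """Build a console_scripts declaration for one bot file (illustrative helper)."""
    # Path of the bot relative to the bots directory, without the .py suffix
    relative = PurePosixPath(botfile).relative_to(base).with_suffix("")
    dotted = ".".join(relative.parts)
    # The importable module lives in the extension package itself
    module = ".".join(PurePosixPath(base).parts) + "." + dotted
    # The entry point name must live in the intelmq.bots namespace
    return "intelmq.bots.{0} = {1}:BOT.run".format(dotted, module)


print(entry_point_for("mybots/bots/collectors/custom/collector.py"))
```

For the path above, this yields the same declaration as the manual example in the comment at the top of `setup.py`.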
1 change: 1 addition & 0 deletions debian/control
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@ Depends: bash-completion,
python3-ruamel.yaml,
python3-termstyle (>= 0.1.10),
python3-tz,
python3-importlib-metadata,
redis-server,
systemd,
${misc:Depends},
Expand Down
60 changes: 60 additions & 0 deletions docs/dev/extensions-packages.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
<!-- comment
SPDX-FileCopyrightText: 2023 CERT.at GmbH
SPDX-License-Identifier: AGPL-3.0-or-later
-->

# Creating extensions packages

IntelMQ supports adding additional bots using your own independent packages. You can use this to
add a new integration that is special to you, or cannot be integrated
into the main IntelMQ repository for some reason.

## Building an extension package

A simple example of the package can be found in ``contrib/example-extension-package``. To make your custom
bots work with IntelMQ, you need to ensure that

- your bot's module exposes a ``BOT`` object, which is a class inheriting from ``intelmq.lib.bot.Bot``
  or one of its subclasses,
- your package registers an [entry point](https://packaging.python.org/en/latest/specifications/entry-points/)
in the ``console_scripts`` group with a name starting with ``intelmq.bots.`` followed by
the name of the group (collectors, experts, outputs, parsers), and then your original name.
The entry point must point to the ``BOT.run`` method,
- the module in which the bot resides must be importable by IntelMQ (e.g. installed in the same
virtualenv, if you use them).
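
The entry point rules above can be checked mechanically before publishing a package. The following is a minimal sketch; `check_entry_point` and `BOT_GROUPS` are hypothetical names used for illustration and are not provided by IntelMQ.

```python
BOT_GROUPS = ("collectors", "experts", "outputs", "parsers")


def check_entry_point(name: str, target: str) -> list:
    """Return a list of rule violations; an empty list means the rules are met."""
    problems = []
    parts = name.split(".")
    if parts[:2] != ["intelmq", "bots"]:
        problems.append("name must start with 'intelmq.bots.'")
    elif len(parts) < 4 or parts[2] not in BOT_GROUPS:
        # A bot group is required, followed by at least one name component of your own
        problems.append("name must continue with a bot group and your own name")
    if not target.endswith(":BOT.run"):
        problems.append("target must end with ':BOT.run'")
    return problems


print(check_entry_point("intelmq.bots.collectors.awesome.special",
                        "awesome_bots.special:BOT.run"))
```

The sketch only validates the naming rules; whether the module is importable still has to be verified in the environment where IntelMQ is installed.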

Apart from these requirements, your package can use any of the usual package features. We strongly
recommend following the same principles and main guidelines as the official bots. This will ensure
the same experience when using official and additional bots.

## Naming convention

Building your own extensions gives you a lot of freedom, but be aware that if your bot's entry
point uses the same name as another bot, it may not be possible to use it, or to determine which
one is being used. For this reason, we recommend that you start the name of your bot with an
organization identifier, followed by the bot name.

For example, if I create a collector bot for feed source ``Special`` and run it on behalf of the
organization ``Awesome``, the suggested entry point might be ``intelmq.bots.collectors.awesome.special``.
Note that the structure of your package doesn't matter, as long as it can be imported properly.
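
As a sketch of this convention, the suggested name can be assembled from the bot group, the organization, and the bot name. The helper `suggest_entry_point_name` and its normalization rule (lowercase, non-alphanumeric runs replaced with underscores) are assumptions made for this illustration, not an IntelMQ API.

```python
import re


def suggest_entry_point_name(bot_group: str, organization: str, bot_name: str) -> str:
    """Compose an entry point name following the recommended convention."""
    def slug(text: str) -> str:
        # Lowercase and collapse anything that is not a letter or digit into "_"
        return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")

    return ".".join(["intelmq", "bots", bot_group, slug(organization), slug(bot_name)])


print(suggest_entry_point_name("collectors", "Awesome", "Special"))
```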

For example, I could create a package called ``awesome-bots`` with the following file structure

```text
awesome_bots
├── pyproject.toml
└── awesome_bots
    ├── __init__.py
    └── special.py
```

The [pyproject.toml](https://packaging.python.org/en/latest/specifications/declaring-project-metadata/#entry-points)
file would then have the following section:

```toml
[project.scripts]
intelmq.bots.collectors.awesome.special = "awesome_bots.special:BOT.run"
```

Once you have installed your package, you can run ``intelmqctl list bots`` to check if your bot was
properly registered.
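
If you prefer to check the registration from Python, the installed entry points can be inspected directly with `importlib.metadata`. This is a minimal sketch rather than what ``intelmqctl`` does internally; the function name `registered_bot_names` is hypothetical.

```python
try:
    from importlib.metadata import entry_points
except ImportError:  # Python < 3.8 needs the importlib-metadata backport
    from importlib_metadata import entry_points


def registered_bot_names(prefix: str = "intelmq.bots.") -> list:
    """List the names of console_scripts entry points in the bot namespace."""
    entries = entry_points()
    if hasattr(entries, "select"):
        # Newer interface (Python 3.10+ and recent importlib_metadata)
        scripts = entries.select(group="console_scripts")
    else:
        # Older interface: a dict keyed by group name
        scripts = entries.get("console_scripts", [])
    return sorted({entry.name for entry in scripts if entry.name.startswith(prefix)})


for name in registered_bot_names():
    print(name)
```

In an environment without IntelMQ or any extension package installed, the loop prints nothing.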
2 changes: 1 addition & 1 deletion intelmq/bin/intelmqctl.py
Original file line number Diff line number Diff line change
Expand Up @@ -931,7 +931,7 @@ def check(self, no_connections=False, check_executables=True):
if bot_id != 'global':
# importable module
try:
bot_module = importlib.import_module(bot_config['module'])
bot_module = importlib.import_module(utils.get_bot_module_name(bot_config['module']))
except ImportError as exc:
check_logger.error('Incomplete installation: Bot %r not importable: %r.', bot_id, exc)
retval = 1
Expand Down
2 changes: 1 addition & 1 deletion intelmq/lib/bot_debugger.py
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ def __init__(self, runtime_configuration, bot_id, run_subcommand=None, console_t
self.dryrun = dryrun
self.msg = msg
self.show = show
module = import_module(self.runtime_configuration['module'])
module = import_module(utils.get_bot_module_name(self.runtime_configuration['module']))

if loglevel:
self.leverageLogger(loglevel)
Expand Down
42 changes: 33 additions & 9 deletions intelmq/lib/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@
import textwrap
import traceback
import zipfile
from pathlib import Path
from sys import version_info
from typing import (Any, Callable, Dict, Generator, Iterator, Optional,
Sequence, Union)

Expand All @@ -43,7 +43,6 @@
import dns.version
import requests
from dateutil.relativedelta import relativedelta
from pkg_resources import resource_filename
from ruamel.yaml import YAML
from ruamel.yaml.scanner import ScannerError
from termstyle import red
Expand All @@ -52,6 +51,12 @@
from intelmq import RUNTIME_CONF_FILE
from intelmq.lib.exceptions import DecodingError

try:
    from importlib.metadata import entry_points
except ImportError:
    from importlib_metadata import entry_points


__all__ = ['base64_decode', 'base64_encode', 'decode', 'encode',
'load_configuration', 'load_parameters', 'log', 'parse_logline',
'reverse_readline', 'error_message_from_exc', 'parse_relative',
Expand Down Expand Up @@ -839,6 +844,27 @@ def file_name_from_response(response: requests.Response) -> str:
return file_name


def _get_console_entry_points():
    # Select interface was introduced in Python 3.10 and newer importlib_metadata
    entries = entry_points()
    if hasattr(entries, "select"):
        return entries.select(group="console_scripts")
    return entries.get("console_scripts", [])  # it's a dict


def get_bot_module_name(bot_name: str) -> Optional[str]:
    # Returns None if no console script with the given name is registered
    entries = entry_points()
    if hasattr(entries, "select"):
        entries = tuple(entries.select(name=bot_name, group="console_scripts"))
    else:
        entries = [entry for entry in entries.get("console_scripts", []) if entry.name == bot_name]

    if not entries:
        return None
    else:
        return entries[0].value.replace(":BOT.run", '')
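
The value-to-module step used above can be tried in isolation with hand-built `EntryPoint` objects. This is a small sketch: the entry point names and the helper `module_of` are illustrative and not part of IntelMQ.

```python
try:
    from importlib.metadata import EntryPoint
except ImportError:  # Python < 3.8 needs the importlib-metadata backport
    from importlib_metadata import EntryPoint


def module_of(entry) -> str:
    """Strip the ':BOT.run' suffix from an entry point target."""
    return entry.value.replace(":BOT.run", "")


bot = EntryPoint("intelmq.bots.collectors.awesome.special",
                 "awesome_bots.special:BOT.run", "console_scripts")
other = EntryPoint("some-tool", "some_tool.cli:main", "console_scripts")

print(module_of(bot))    # module path only
print(module_of(other))  # unchanged, the target does not end with ':BOT.run'
```

The second case shows that a console script which does not follow the `:BOT.run` convention keeps its full target string, so the result is only an importable module name for entry points that follow the bot naming rules.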


def list_all_bots() -> dict:
"""
Compile a dictionary with all bots and their parameters.
Expand All @@ -860,13 +886,11 @@ def list_all_bots() -> dict:
from intelmq.lib.bot import Bot # noqa: prevents circular import
bot_parameters = dir(Bot)

base_path = resource_filename('intelmq', 'bots')

botfiles = [botfile for botfile in pathlib.Path(base_path).glob('**/*.py') if botfile.is_file() and botfile.name != '__init__.py']
for file in botfiles:
file = Path(file.as_posix().replace(base_path, 'intelmq/bots'))
bot_entrypoints = filter(lambda entry: entry.name.startswith("intelmq.bots."), _get_console_entry_points())
for bot in bot_entrypoints:
try:
mod = importlib.import_module('.'.join(file.with_suffix('').parts))
module_name = bot.value.replace(":BOT.run", '')
mod = importlib.import_module(module_name)
except SyntaxError:
# Skip invalid bots
continue
Expand All @@ -884,7 +908,7 @@ def list_all_bots() -> dict:
for bot_type in ['CollectorBot', 'ParserBot', 'ExpertBot', 'OutputBot', 'Bot']:
name = name.replace(bot_type, '')

bots[file.parts[2].capitalize()[:-1]][name] = {
bots[module_name.split('.')[2].capitalize()[:-1]][name] = {
"module": mod.__name__,
"description": "Missing description" if not getattr(mod.BOT, '__doc__', None) else textwrap.dedent(mod.BOT.__doc__).strip(),
"parameters": keys,
Expand Down
12 changes: 12 additions & 0 deletions intelmq/tests/bin/test_intelmqctl.py
Original file line number Diff line number Diff line change
Expand Up @@ -125,6 +125,18 @@ def test_check_handles_syntaxerror_when_importing_bots(self):
self.assertIsNotNone(
next(filter(lambda l: "SyntaxError in bot 'test-bot'" in l, captured.output)))

    @skip_installation()
    @mock.patch.object(utils, "get_bot_module_name", mock.Mock(return_value="mocked-module"))
    def test_check_imports_real_bot_module(self):
        self._load_default_harmonization()
        self._extend_config(self.tmp_runtime, self.BOT_CONFIG)

        # raise SyntaxError to stop checking after import
        with mock.patch.object(ctl.importlib, "import_module", mock.Mock(side_effect=SyntaxError)) as import_mock:
            self.intelmqctl.check(no_connections=True, check_executables=False)

        import_mock.assert_called_once_with("mocked-module")


if __name__ == '__main__': # pragma: nocover
unittest.main()
42 changes: 38 additions & 4 deletions intelmq/tests/lib/test_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,12 @@
from intelmq.lib.test import skip_internet
from intelmq.tests.test_conf import CerberusTests

try:
    from importlib.metadata import EntryPoint
except ImportError:
    from importlib_metadata import EntryPoint


LINES = {'spare': ['Lorem', 'ipsum', 'dolor'],
'short': ['{}: Lorem', '{}: ipsum',
'{}: dolor'],
Expand Down Expand Up @@ -318,6 +324,34 @@ def _mock_importing(module):
bot_count = sum([len(val) for val in bots.values()])
self.assertEqual(1, bot_count)

    def test_list_all_bots_filters_entrypoints(self):
        entries = [
            EntryPoint("intelmq.bots.collector.api.collector_api",
                       "intelmq.bots.collector.api.collector_api:BOT.run", group="console_scripts"),
            EntryPoint("intelmq.bots.collector.awesome.my_bot",
                       "awesome.extension.package.collector:BOT.run", group="console_scripts"),
            EntryPoint("not.a.bot", "not.a.bot:run", group="console_scripts")
        ]

        with unittest.mock.patch.object(utils, "_get_console_entry_points", return_value=entries):
            with unittest.mock.patch.object(utils.importlib, "import_module") as import_mock:
                import_mock.side_effect = SyntaxError()  # stop processing after import try
                utils.list_all_bots()

        import_mock.assert_has_calls(
            [
                unittest.mock.call("intelmq.bots.collector.api.collector_api"),
                unittest.mock.call("awesome.extension.package.collector"),
            ]
        )
        self.assertEqual(2, import_mock.call_count)

    def test_get_bot_module_name_builtin_bot(self):
        found_name = utils.get_bot_module_name("intelmq.bots.collectors.api.collector_api")
        self.assertEqual("intelmq.bots.collectors.api.collector_api", found_name)

        self.assertIsNone(utils.get_bot_module_name("intelmq.not-existing-bot"))

def test_get_bots_settings(self):
with unittest.mock.patch.object(utils, "get_runtime", new_get_runtime):
runtime = utils.get_bots_settings()
Expand Down Expand Up @@ -353,14 +387,14 @@ def test_load_configuration_yaml(self):
filename = os.path.join(os.path.dirname(__file__), '../assets/example.yaml')
self.assertEqual(utils.load_configuration(filename),
{
'some_string': 'Hello World!',
'other_string': 'with a : in it',
'some_string': 'Hello World!',
'other_string': 'with a : in it',
'now more': ['values', 'in', 'a', 'list'],
'types': -4,
'other': True,
'final': 0.5,
}
)
}
)

def test_load_configuration_yaml_invalid(self):
""" Test load_configuration with an invalid YAML file """
Expand Down
1 change: 1 addition & 0 deletions mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,7 @@ nav:
- Data Format: 'dev/data-format.md'
- Adding Feeds: 'dev/adding-feeds.md'
- Bot Development: 'dev/bot-development.md'
- Extensions Packages: 'dev/extensions-packages.md'
- Testing: 'dev/testing.md'
- Documentation: 'dev/documentation.md'
- Use as Library: 'dev/library.md'
Expand Down
1 change: 1 addition & 0 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
'redis>=2.10',
'requests>=2.2.0',
'ruamel.yaml',
'importlib-metadata; python_version < "3.8"'
]

TESTS_REQUIRES = [
Expand Down