Merge pull request #2510 from sebix/bot-docs
improve documentation of mail collectors and CSV parser
sebix committed Jul 9, 2024
2 parents c16a8d2 + 966d1e0 commit 96c8c83
Showing 2 changed files with 19 additions and 4 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -44,6 +44,7 @@
- `intelmq.bots.outputs.smtp_batch.output`: Documentation on multiple recipients added (PR#2501 by Edvard Rejthar).

### Documentation
- Bots: Clarify some sections of the Mail collectors and the Generic CSV Parser (PR#2510 by Sebastian Wagner).

### Packaging

22 changes: 18 additions & 4 deletions docs/user/bots.md
@@ -350,6 +350,7 @@ the line) or not. Defaults to true.
### Generic Mail URL Fetcher <div id="intelmq.bots.collectors.mail.collector_mail_url" />

Extracts URLs from e-mail messages and downloads the content from the URLs.
It uses the [`imbox`](https://github.com/martinrusev/imbox) library.

The resulting reports contain the following special fields:

@@ -360,6 +361,8 @@ The resulting reports contain the following special fields:
- `extra.email_message_id`: The email's message ID.
- `extra.file_name`: The file name of the downloaded file (extracted from the HTTP Response Headers if possible).

The fields can be used by parsers to identify the feed and are not automatically passed on to events.

**Chunking**

For line-based inputs the bot can split up large reports into smaller chunks. This is particularly important for setups
@@ -392,6 +395,10 @@ limitation set `chunk_size` to something like 384000000 (~384 MB).

(optional, boolean) Whether the mail server uses TLS or not. Defaults to true.

**`mail_starttls`**

(optional, boolean) Whether the mail server uses STARTTLS or not. Defaults to false.

**`folder`**

(optional, string) Folder in which to look for e-mail messages. Defaults to INBOX.
@@ -422,6 +429,7 @@ certificate is not found, the IMAP connection will fail on handshake. Defaults t
### Generic Mail Attachment Fetcher <div id="intelmq.bots.collectors.mail.collector_mail_attach" />

This bot collects messages from mailboxes and downloads the attachments.
It uses the [`imbox`](https://github.com/martinrusev/imbox) library.

The resulting reports contain the following special fields:

@@ -432,6 +440,8 @@ The resulting reports contains the following special fields:
- `extra.file_name`: The file name of the attachment or the file name in the attached archive if the attachment is to
be uncompressed.

The fields can be used by parsers to identify the feed and are not automatically passed on to events.
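
Because these fields are not passed on automatically, a parser that needs them in its events has to copy them itself. The following is a minimal sketch using plain dictionaries, not IntelMQ's message classes; `copy_report_fields` is a hypothetical helper introduced only for illustration.

```python
from typing import Dict, Iterable


def copy_report_fields(
    report: Dict[str, str], event: Dict[str, str], fields: Iterable[str]
) -> Dict[str, str]:
    """Return a copy of `event` with the selected report fields added.

    Hypothetical helper: reports and events are modelled as plain dicts here.
    """
    merged = dict(event)
    for name in fields:
        # Only copy fields the collector actually set on the report.
        if name in report:
            merged[name] = report[name]
    return merged


report = {
    "extra.email_message_id": "<example-id@mail.example>",
    "extra.file_name": "feed.csv",
}
event = copy_report_fields(report, {"source.ip": "192.0.2.1"}, ["extra.file_name"])
```

In this sketch only `extra.file_name` ends up in the event; `extra.email_message_id` stays on the report because it was not selected.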

**Module:** `intelmq.bots.collectors.mail.collector_mail_attach`

**Parameters (also expects [feed parameters](#feed-parameters)):**
@@ -442,7 +452,7 @@ The resulting reports contains the following special fields:

**`mail_port`**

(optional, integer) IMAP server port: 143 without TLS, 993 with TLS. Defaults to 143.
(optional, integer) IMAP server port: 143 without TLS, 993 with TLS. Default depends on SSL setting.
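
The port default described above can be sketched as follows. This is an illustration under assumptions, not the bot's actual code: `default_imap_port` is a hypothetical helper, and the two port constants come from Python's standard `imaplib` module.

```python
import imaplib


def default_imap_port(mail_ssl: bool = True) -> int:
    """Pick the conventional IMAP port for the given TLS setting.

    Hypothetical helper: 993 for IMAP over TLS, 143 otherwise.
    """
    return imaplib.IMAP4_SSL_PORT if mail_ssl else imaplib.IMAP4_PORT


tls_port = default_imap_port(True)
plain_port = default_imap_port(False)
```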

**`mail_user`**

@@ -456,6 +466,10 @@ The resulting reports contains the following special fields:

(optional, boolean) Whether the mail server uses TLS or not. Defaults to true.

**`mail_starttls`**

(optional, boolean) Whether to use STARTTLS before authenticating to the server. Defaults to false.

**`folder`**

(optional, string) Folder in which to look for e-mail messages. Defaults to INBOX.
@@ -466,7 +480,7 @@ The resulting reports contains the following special fields:

**`attach_regex`**

(optional, string) Regular expression of the name of the attachment. Defaults to csv.zip.
(optional, string) All attachments which match this [regular expression](https://docs.python.org/3/library/re.html#re.search) will be processed. Defaults to `csv.zip`.
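
Since matching follows `re.search` semantics, the pattern may match anywhere in the file name, and an unescaped `.` matches any character. A short sketch of what that means for the default value; `attachment_matches` is a hypothetical helper and the file names are made up:

```python
import re


def attachment_matches(attach_regex: str, file_name: str) -> bool:
    """Return True if the pattern is found anywhere in the file name."""
    return re.search(attach_regex, file_name) is not None


# The default pattern matches the expected archive name ...
default_hit = attachment_matches("csv.zip", "report.csv.zip")
# ... but also unrelated names, because "." matches any single character.
loose_hit = attachment_matches("csv.zip", "csv-zip-notes.txt")
# Escaping the dots and anchoring the end gives a stricter match.
strict_hit = attachment_matches(r"\.csv\.zip$", "report.csv.zip")
strict_miss = attachment_matches(r"\.csv\.zip$", "csv-zip-notes.txt")
```

If only files ending in `.csv.zip` should be processed, a stricter pattern such as `\.csv\.zip$` may be the safer choice.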

**`extract_files`**

@@ -1697,8 +1711,8 @@ available with their index.

**`skip_header`**

(optional, boolean/integer) Whether to skip the first N lines of the input (True -> 1, False -> 0). Lines starting
with `#` will be skipped additionally, make sure you do not skip more lines than needed!
(optional, boolean/integer) Whether to skip the first N lines of the input (true is equivalent to 1, false is equivalent to 0). Lines starting
with `#` are skipped in addition, so make sure you do not skip more lines than needed! Defaults to false/0.
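
The interaction between `skip_header` and `#` comment lines can be sketched as below. This is not the parser's implementation: `skip_header_lines` is a hypothetical helper, and it assumes that comment lines are dropped first and that the first N of the remaining lines are skipped afterwards.

```python
from typing import List, Union


def skip_header_lines(raw: str, skip_header: Union[bool, int] = False) -> List[str]:
    """Drop '#' comment lines, then skip the first N remaining lines.

    Assumption: comments are removed before the header lines are counted.
    """
    lines = [line for line in raw.splitlines() if not line.startswith("#")]
    # int(True) == 1 and int(False) == 0, so booleans and integers behave alike.
    return lines[int(skip_header):]


raw = "# generated by example feed\nip,time\n192.0.2.1,2024-07-09\n"
data_only = skip_header_lines(raw, True)
with_header = skip_header_lines(raw, 0)
too_many = skip_header_lines(raw, 2)
```

Under this assumption, setting `skip_header` to 2 for the sample input would also discard the only data line, which is the situation the warning above refers to.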

**`time_format`**
