Merge pull request #2378 from kamil-certat/fix_validating_url
FIX: Ensure rejecting URLs starting with space
sebix committed Jun 27, 2023
2 parents 61c45ac + 834290c commit 5bb8b78
Showing 4 changed files with 11 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/codespell.yml
@@ -26,6 +26,6 @@ jobs:
 - name: Checkout repository
   uses: actions/checkout@v2
 - name: Install codespell
-  run: pip install codespell
+  run: pip install "codespell==2.2.4"
 - name: Run codespell
   run: /home/runner/.local/bin/codespell
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -17,9 +17,12 @@ CHANGELOG
 - `intelmq.lib.upgrages`: Fix a bug in the upgrade function for version 3.1.0 which caused an exception if a generic csv parser instance had no parameter `type` (PR#2319 by Filip Pokorný).
 - `intelmq.lib.datatypes`: Adds `TimeFormat` class to be used for the `time_format` bot parameter (PR#2329 by Filip Pokorný).
 - `intelmq.lib.exceptions`: Fixes a bug in `InvalidArgument` exception (PR#2329 by Filip Pokorný).
-- `intelmq.lib.harmonization`: Changes signature and names of `DateTime` conversion functions for consistency, backwards compatible (PR#2329 by Filip Pokorný).
+- `intelmq.lib.harmonization`:
+  - Changes signature and names of `DateTime` conversion functions for consistency, backwards compatible (PR#2329 by Filip Pokorný).
+  - Ensure rejecting URLs with leading whitespaces after changes in CPython (fixes [#2377](https://github.com/certtools/intelmq/issues/2377))

 ### Development
+- CI: pin the Codespell version to omit troubles caused by its new releases (PR #2379).

 ### Bots

@@ -63,6 +66,7 @@ CHANGELOG
 - SECURITY: fixed a low-risk bug causing the tool to change owner of `/` if run with the `INTELMQ_PATHS_NO_OPT` environment variable set. This affects only the PIP package as the DEB/RPM packages don't contain this tool. (PR#2355 by Kamil Mańkowski, fixes #2354)

 ### Known Errors
+- `intelmq.parsers.html_table` may not process invalid URLs in patched Python version due to changes in `urllib`. See #2382

 3.1.0 (2023-02-10)
 ------------------
4 changes: 4 additions & 0 deletions intelmq/lib/harmonization.py
@@ -34,6 +34,7 @@
 import json
 import re
 import socket
+import string
 import warnings
 import urllib.parse as parse
 from typing import Optional, Union
@@ -1090,6 +1091,9 @@ def is_valid(value: str, sanitize: bool = False) -> bool:
         if not GenericType.is_valid(value):
             return False

+        if value[0] in string.whitespace:
+            return False
+
         result = parse.urlsplit(value)
         if result.netloc == "":
             return False
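
For reference, the effect of the new guard in `is_valid` can be reproduced outside IntelMQ. The snippet below is only a minimal sketch: `looks_like_valid_url` is a hypothetical stand-in, not the actual `is_valid` method, and it replaces the `GenericType.is_valid` call with a plain non-empty-string test as an assumption. It mirrors the order of checks in the hunk above: reject a leading whitespace character explicitly, then require a non-empty `netloc` from `urlsplit`. The explicit check matters because, per the changelog entry, `urlsplit` in patched CPython versions can no longer be relied on to reject such input by itself.

```python
import string
import urllib.parse as parse


def looks_like_valid_url(value: str) -> bool:
    """Hypothetical stand-in for the patched check (not IntelMQ's API)."""
    # Assumption: a plain type/emptiness test replaces GenericType.is_valid.
    # It also protects the value[0] lookup below from an IndexError.
    if not isinstance(value, str) or not value:
        return False

    # The guard added in this commit: refuse any leading whitespace character
    # (space, tab, newline, ...) before urlsplit sees the value.
    if value[0] in string.whitespace:
        return False

    # Same netloc test as in the hunk: a URL without a network location
    # (for example a bare host name without a scheme) is rejected.
    result = parse.urlsplit(value)
    return result.netloc != ""


if __name__ == "__main__":
    for candidate in ("http://example.com/", " http://example.com/", "example.com"):
        print(repr(candidate), looks_like_valid_url(candidate))
```

With the explicit guard in place, the result for a value such as `" http://example.com/"` no longer depends on which Python version parses it.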
@@ -70,6 +70,7 @@ def test_event_with_split(self):
         self.run_bot()
         self.assertMessageEqual(0, EXAMPLE_EVENT)

+    @unittest.skip("Change in urllib prevent invalid URLs to be processed, see #2377")
     def test_event_without_split(self):
         self.sysconfig = {"columns": ["time.source", "source.url", "malware.hash.md5",
                                       "source.ip", "__IGNORE__"],
