Python 3 compatible #21

threatlead · 2016-01-04T07:51:02Z

Suggestions:

At line #43: https://github.com/armbues/ioc_parser/blob/master/iocp.py#L43
- Replace with:

try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO

pdfminer doesn't support Python 3, so I changed the default library to 'pypdf2' at line #84:
- https://github.com/armbues/ioc_parser/blob/master/iocp.py#L84

def __init__(self, patterns_ini=None, ..., library='pypdf2', ...):

armbues · 2016-01-20T01:14:31Z

The default PDF library was switched to pdfminer because of its better parsing performance. In a head-to-head test it was able to parse considerably more text from a report set than pypdf2, and therefore also generated more IOCs.

One option would be to check the Python version at runtime and set the default PDF library accordingly.
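
A minimal sketch of that idea, assuming (as stated earlier in this thread) that pdfminer only works on Python 2. The helper name `default_pdf_library` is hypothetical and not part of iocp.py; its return value would be used as the default for the `library` argument.

```python
import sys


def default_pdf_library(version_info=sys.version_info):
    """Pick a default PDF backend name based on the interpreter version.

    Hypothetical helper for illustration only; not part of iocp.py.
    """
    # Assumption from the thread: pdfminer does not run on Python 3,
    # so fall back to pypdf2 there and keep pdfminer on Python 2.
    if version_info[0] >= 3:
        return 'pypdf2'
    return 'pdfminer'


if __name__ == '__main__':
    print(default_pdf_library())
```

This keeps pdfminer's better text extraction for Python 2 users while giving Python 3 users a working default.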

bernardyim · 2017-05-11T06:05:11Z

For anyone having issues with pdfminer on Python 3, consider using pdfminer.six, a fork that adds Python 3 compatibility:
https://github.com/pdfminer/pdfminer.six

Also, as a totally unrelated side note (no idea where to put this), you might want to pass the IGNORECASE flag to re.compile at parser.py line 133, so that indicators typed in all caps are matched too:
ind_regex = re.compile(ind_pattern, flags=re.IGNORECASE)
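
To illustrate the difference, here is a small sketch. The pattern below is a made-up lowercase MD5 regex used only for the example; it is an assumption, not one of the project's actual patterns.

```python
import re

# Hypothetical pattern for illustration: a 32-character hex string
# written with a lowercase-only character class.
ind_pattern = r"\b[a-f0-9]{32}\b"

case_sensitive = re.compile(ind_pattern)
case_insensitive = re.compile(ind_pattern, flags=re.IGNORECASE)

# The same kind of hash, typed in all caps as it might appear in a report.
sample = "MD5: D41D8CD98F00B204E9800998ECF8427E"

print(case_sensitive.findall(sample))    # no match without the flag
print(case_insensitive.findall(sample))  # the upper-case hash is found
```

Without the flag, each pattern would need both cases spelled out (e.g. `[a-fA-F0-9]`) to catch such input.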

fhightower · 2017-10-18T14:46:00Z

As far as IGNORECASE support is concerned, this is handled with #34.
