Array2D_init does not validate memory allocation #29

Open
maxbachmann opened this issue Aug 19, 2022 · 0 comments

Array2D_init currently has the following implementation:

cdef inline void Array2D_init(
    Array2D* array2d,
    Py_ssize_t num_rows,
    Py_ssize_t num_cols) nogil:
    """
    Initializes an Array2D struct with the given number of rows and columns
    """
    array2d.num_rows = num_rows
    array2d.num_cols = num_cols
    array2d.mem = <DTYPE_t*> malloc(num_rows * num_cols * sizeof(DTYPE_t))

Problems with the implementation

  1. The return value of malloc is never checked, so a failed allocation leaves array2d.mem as NULL. For example:
from weighted_levenshtein import dam_lev
s1="dkjnsdbjkadbjkalsask"*10000
s2="ksjdbhajhsjadksjaj"*10000
dam_lev(s1, s2)

This leads to a segmentation fault on my machine:

==169035== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==169035==  Access not within mapped region at address 0x0
==169035==    at 0x1347A76B: __pyx_f_20weighted_levenshtein_4clev_c_damerau_levenshtein (clev.c:4932)
==169035==    by 0x13493203: __pyx_pf_20weighted_levenshtein_4clev_damerau_levenshtein (clev.c:4810)
==169035==    by 0x13493203: __pyx_pw_20weighted_levenshtein_4clev_1damerau_levenshtein (clev.c:4537)
==169035==    by 0x48DDCD8: UnknownInlinedFun (abstract.h:118)
==169035==    by 0x48DDCD8: UnknownInlinedFun (abstract.h:127)
==169035==    by 0x48DDCD8: UnknownInlinedFun (ceval.c:5077)
==169035==    by 0x48DDCD8: _PyEval_EvalFrameDefault.cold (ceval.c:3520)
==169035==    by 0x4993471: UnknownInlinedFun (pycore_ceval.h:40)
==169035==    by 0x4993471: function_code_fastcall (call.c:330)
==169035==    by 0x48DEA16: UnknownInlinedFun (abstract.h:118)
==169035==    by 0x48DEA16: UnknownInlinedFun (abstract.h:127)
==169035==    by 0x48DEA16: UnknownInlinedFun (ceval.c:5077)
==169035==    by 0x48DEA16: _PyEval_EvalFrameDefault.cold (ceval.c:3489)
==169035==    by 0x498C55E: UnknownInlinedFun (pycore_ceval.h:40)
==169035==    by 0x498C55E: _PyEval_EvalCode (ceval.c:4329)
==169035==    by 0x49931C0: _PyFunction_Vectorcall (call.c:396)
==169035==    by 0x48DE8F6: UnknownInlinedFun (abstract.h:118)
==169035==    by 0x48DE8F6: UnknownInlinedFun (abstract.h:127)
==169035==    by 0x48DE8F6: UnknownInlinedFun (ceval.c:5077)
==169035==    by 0x48DE8F6: _PyEval_EvalFrameDefault.cold (ceval.c:3506)
==169035==    by 0x498C55E: UnknownInlinedFun (pycore_ceval.h:40)
==169035==    by 0x498C55E: _PyEval_EvalCode (ceval.c:4329)
==169035==    by 0x49931C0: _PyFunction_Vectorcall (call.c:396)
==169035==    by 0x498DE9A: UnknownInlinedFun (abstract.h:118)
==169035==    by 0x498DE9A: UnknownInlinedFun (abstract.h:127)
==169035==    by 0x498DE9A: call_function (ceval.c:5077)
==169035==    by 0x48DEE21: _PyEval_EvalFrameDefault.cold (ceval.c:3537)
  2. num_rows * num_cols * sizeof(DTYPE_t) can overflow, which leads to an undersized allocation and, afterwards, out-of-bounds accesses. For example:
from weighted_levenshtein import dam_lev
a="a"*1518500248
b="b"*1518500248
dam_lev(a, b)

This reads out of bounds:

==30861== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==30861==  Access not within mapped region at address 0x2E71CED10
==30861==    at 0x1306842B: __pyx_f_20weighted_levenshtein_4clev_c_damerau_levenshtein (clev.c:3444)
==30861==    by 0x1308060D: __pyx_pf_20weighted_levenshtein_4clev_damerau_levenshtein (clev.c:3304)
==30861==    by 0x1308060D: __pyx_pw_20weighted_levenshtein_4clev_1damerau_levenshtein (clev.c:3112)
==30861==    by 0x498F8F0: cfunction_call (methodobject.c:543)
==30861==    by 0x498B947: _PyObject_MakeTpCall (call.c:215)
==30861==    by 0x4988535: UnknownInlinedFun (abstract.h:112)
==30861==    by 0x4988535: UnknownInlinedFun (abstract.h:99)
==30861==    by 0x4988535: UnknownInlinedFun (abstract.h:123)
==30861==    by 0x4988535: UnknownInlinedFun (ceval.c:5891)
==30861==    by 0x4988535: _PyEval_EvalFrameDefault (ceval.c:4213)
==30861==    by 0x4982092: UnknownInlinedFun (pycore_ceval.h:46)
==30861==    by 0x4982092: _PyEval_Vector (ceval.c:5065)
==30861==    by 0x49FDE83: PyEval_EvalCode (ceval.c:1134)
==30861==    by 0x4A2F2B2: run_eval_code_obj (pythonrun.c:1291)
==30861==    by 0x4A2A7D9: run_mod (pythonrun.c:1312)
==30861==    by 0x48FD1CF: pyrun_file.cold (pythonrun.c:1208)
==30861==    by 0x4A24AD8: _PyRun_SimpleFileObject (pythonrun.c:456)
==30861==    by 0x4A24897: _PyRun_AnyFileObject (pythonrun.c:90)
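
The overflow in the second problem can be reproduced with plain integer arithmetic. The following is a minimal sketch in Python, not the library's code: it assumes a 64-bit size_t and an 8-byte DTYPE_t, and the helper names wrapped_alloc_size and checked_alloc_size are hypothetical. The dimension 1518500250 is an illustrative value just above 2^30.5; the exact matrix dimensions that dam_lev passes to Array2D_init are not shown in this issue.

```python
# Model of the byte-count computation in Array2D_init.
# Assumptions (not confirmed by the issue): size_t is 64 bits wide and
# DTYPE_t occupies 8 bytes.
SIZE_MAX = 2**64 - 1
ITEM_SIZE = 8


def wrapped_alloc_size(num_rows: int, num_cols: int, item_size: int = ITEM_SIZE) -> int:
    """Byte count as unsigned 64-bit arithmetic computes it (wraps on overflow)."""
    return (num_rows * num_cols * item_size) & SIZE_MAX


def checked_alloc_size(num_rows: int, num_cols: int, item_size: int = ITEM_SIZE) -> int:
    """Byte count, or OverflowError if the product does not fit in size_t."""
    if num_rows < 0 or num_cols < 0:
        raise ValueError("dimensions must be non-negative")
    # Divide instead of multiplying so the check itself cannot overflow in C.
    if num_cols != 0 and num_rows > SIZE_MAX // item_size // num_cols:
        raise OverflowError("num_rows * num_cols * item_size exceeds SIZE_MAX")
    return num_rows * num_cols * item_size


if __name__ == "__main__":
    n = 1518500250  # illustrative dimension, just above 2**30.5
    print("requested bytes:", n * n * ITEM_SIZE)
    print("wrapped bytes:  ", wrapped_alloc_size(n, n))
    try:
        checked_alloc_size(n, n)
    except OverflowError as exc:
        print("rejected:", exc)
```

Under these assumptions the wrapped request is only about 277 MiB, so malloc can succeed and return a non-NULL pointer while the buffer is far too small, which matches the non-zero faulting address in the second trace. A fix in Array2D_init would need both checks: a division-based bound like the one above before calling malloc, and a comparison of the returned pointer against NULL afterwards, with the error reported to the caller.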