Free soundex out string on deallocation #17
context
possible fix for #14 (i'm new to cython and haven't coded in C for a long time)
currently `fuzzy.Soundex` has occasional undefined behavior on py3, as discussed in #14, and it can be reproduced locally with a script.

the `UnicodeDecodeError` is easiest to replicate as it breaks the code flow, but as noted in #14 the function can also return valid (but incorrect) python strings.

early deallocation of output string
i have a feeling this is occurring because of a change in python2/python3 cython behavior in this block:
fuzzy/src/fuzzy.pyx
Lines 227 to 230 in e15b195
pout = out
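`pout` has no declared type (from looking at the compiled C code it seems to be a `char *`), so it is only a `char *` start pointer copy of `out`, not a copy of the bytes. once `out` is passed to `free`, iirc `pout` is left pointing at deallocated memory. line 230 compiles to: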
__pyx_t_5 = __Pyx_PyUnicode_FromString(__pyx_v_pout); if (unlikely(!__pyx_t_5)) __PYX_ERR(0, 230, __pyx_L1_error)
which seems to be doing some `char *` -> py string coercion. this would raise a `UnicodeDecodeError` if attempting to decode garbage bytes, or give some random string output if the bytes happen to be valid as a python string.

i'm not sure what would cause this to be an issue only on py3, though, unless soundex has been occasionally returning garbage chars without notice on py2, as no `UnicodeDecodeError` was being thrown?

proposed fix
use `try...finally` deallocation syntax to free the output string on python garbage collection
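
for reference, here is a minimal C sketch of the ordering this fix is after: copy the bytes out of the scratch buffer while it is still live, and only then free it, with a single cleanup path playing the role of `finally`. `encode_demo` and its placeholder "encoding" are hypothetical and are not the actual Soundex code in `fuzzy.pyx`. `result` stands in for the python string that gets built from the buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the encoder: fills a heap scratch buffer,
 * then hands the caller an independent copy of its contents.
 * The caller owns the returned string and must free() it. */
static char *encode_demo(const char *word)
{
    char *out = NULL;    /* scratch buffer, like `out` in the .pyx */
    char *result = NULL; /* stands in for the python string */

    out = malloc(5);
    if (out == NULL)
        goto cleanup;

    /* placeholder "encoding": first character followed by "000" */
    out[0] = word[0] ? word[0] : '0';
    memcpy(out + 1, "000", 4); /* copies the terminating NUL too */

    /* `char *pout = out;` would copy only the address, so `pout`
     * would dangle after free(out). Copy the bytes while `out` is
     * still valid instead. */
    result = malloc(strlen(out) + 1);
    if (result != NULL)
        strcpy(result, out);

cleanup:
    free(out); /* runs on every path; free(NULL) is a no-op */
    return result;
}
```

in cython the same ordering would presumably come from building the return value inside `try:` and calling `free(out)` in `finally:`, so the buffer outlives the `char *` -> py string coercion.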