Error message:
python3.12/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
zsh: killed python neofuzztest.py
Code:
import random
import string

from neofuzz import char_ngram_process

def rand_str(length):
    characters = string.ascii_letters + string.digits
    return "".join(random.choice(characters) for _ in range(length))

# 400,000 synthetic "name"-like strings
names = [
    rand_str(8) + " " + rand_str(6) + " " + rand_str(4) + " " + str(i)
    for i in range(400_000)
]
print(len(names))

# Indexing this list is where the process gets killed
neofuzz_process = char_ngram_process()
neofuzz_process.index(names)

query = "test 3333"
pre_filter = neofuzz_process.extract(query, limit=2000, refine_levenshtein=True)
print(pre_filter[:10])
The blazing-fast speed of this library can only shine if it works on large datasets.
SeanPedersen changed the title from "neofuzz indexing fails for list of 500K strings" to "neofuzz indexing fails for list of 400K strings" on Oct 26, 2024.
Hmm, interesting... Thanks for taking the time to look into this. Can I get a full error log? I have a feeling this might have something to do with PyNNDescent.
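A quick way to check the PyNNDescent suspicion is to skip neofuzz and feed the same strings through a character n-gram TF-IDF straight into pynndescent.NNDescent. The vectorizer settings below are illustrative assumptions, not neofuzz's exact defaults; if this run is also killed, the memory blow-up is likely on the PyNNDescent side rather than in neofuzz.

Code (sketch):
import random
import string

from pynndescent import NNDescent
from sklearn.feature_extraction.text import TfidfVectorizer

def rand_str(length):
    characters = string.ascii_letters + string.digits
    return "".join(random.choice(characters) for _ in range(length))

# Same 400,000 synthetic strings as in the snippet above
names = [
    rand_str(8) + " " + rand_str(6) + " " + rand_str(4) + " " + str(i)
    for i in range(400_000)
]

# Character n-gram TF-IDF -> sparse matrix with 400k rows
# (analyzer/ngram_range are illustrative, not neofuzz's defaults)
vectorizer = TfidfVectorizer(analyzer="char", ngram_range=(1, 3))
matrix = vectorizer.fit_transform(names)
print(matrix.shape)

# If the process is killed here as well, the blow-up happens during
# PyNNDescent's index construction rather than inside neofuzz.
index = NNDescent(matrix, metric="cosine")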