    # that 80 % of words in a document contain at least one alphabetic character
    if (
        self.max_non_alpha_words_ratio
        and sum([any((c.isalpha() for c in w)) for w in words]) / n_words < self.max_non_alpha_words_ratio
    ):
        return False, "gopher_below_alpha_threshold"
Given that all documents with a LOWER ratio than this threshold are removed, I would expect the variable to be named min_non_alpha_words_ratio, consistent with all the other variable names.
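To make the direction of the comparison concrete, here is a minimal standalone sketch of the same check. The names `alpha_words_ratio` and `passes_alpha_check` are hypothetical and not part of the library, and splitting on whitespace is an assumption (the actual filter may tokenize words differently). The 0.8 default comes from the "80 %" in the quoted comment.

```python
def alpha_words_ratio(words: list[str]) -> float:
    """Fraction of words that contain at least one alphabetic character."""
    if not words:
        return 0.0
    return sum(any(c.isalpha() for c in w) for w in words) / len(words)


def passes_alpha_check(text: str, threshold: float = 0.8) -> bool:
    """Hypothetical stand-in for the quoted check; not the library's API."""
    words = text.split()  # assumption: naive whitespace tokenization
    # Same comparison as the quoted code: a ratio BELOW the threshold rejects the document.
    return not (alpha_words_ratio(words) < threshold)


if __name__ == "__main__":
    print(passes_alpha_check("the quick brown fox"))  # every word has a letter
    print(passes_alpha_check("hello world 123 456"))  # only half the words have a letter
```

In this sketch the threshold acts as a lower bound on the ratio of words that do contain a letter, which is why a `max_` prefix reads as misleading.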
This check is part of the Gopher filter.