This repository has been archived by the owner on Jan 31, 2023. It is now read-only.

Allow sparse input for naive bayes classifier #18

Open
cyan198 opened this issue Nov 23, 2021 · 0 comments

Comments


cyan198 commented Nov 23, 2021

I tried converting a pipeline to pure_sklearn. The pipeline consists of TfidfVectorizer and MultinomialNB. The output of TfidfVectorizer is a sparse matrix, which is passed as input to MultinomialNB. However, the naive Bayes predict method does not support sparse input (X), as shown in the code below, and therefore raises an error.

X = check_array(X, handle_sparse="error")

Possible solution
I'm not sure why sparse input needs to be rejected here. I changed the call to allow sparse input and tested it; I didn't encounter any issues, and the estimator works as expected.

X = check_array(X, handle_sparse="allow")

Is this the right way?
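
To illustrate why sparse input seems plausible here, below is a minimal pure-Python sketch of the multinomial naive Bayes scoring step. It is not the pure_sklearn implementation: the function names, the dict-based sparse row format, and the toy model numbers are all assumptions made up for this example. It only shows that skipping zero-valued features leaves the class scores unchanged.

```python
import math


def joint_log_likelihood_dense(row, feature_log_prob, class_log_prior):
    """Score one dense row (list of floats) against every class."""
    return [
        prior + sum(x * w for x, w in zip(row, weights))
        for weights, prior in zip(feature_log_prob, class_log_prior)
    ]


def joint_log_likelihood_sparse(row, feature_log_prob, class_log_prior):
    """Score one sparse row ({column index: value}) against every class."""
    return [
        prior + sum(x * weights[j] for j, x in row.items())
        for weights, prior in zip(feature_log_prob, class_log_prior)
    ]


# Toy model: 2 classes, 4 features (hypothetical numbers, illustration only).
feature_log_prob = [
    [math.log(p) for p in (0.4, 0.3, 0.2, 0.1)],
    [math.log(p) for p in (0.1, 0.2, 0.3, 0.4)],
]
class_log_prior = [math.log(0.5), math.log(0.5)]

# The same TF-IDF-like row in dense form and in sparse form.
dense_row = [0.0, 0.7, 0.0, 0.3]
sparse_row = {1: 0.7, 3: 0.3}

dense_scores = joint_log_likelihood_dense(
    dense_row, feature_log_prob, class_log_prior
)
sparse_scores = joint_log_likelihood_sparse(
    sparse_row, feature_log_prob, class_log_prior
)

# The predicted class is the index of the highest score.
predicted_class = max(range(len(sparse_scores)), key=sparse_scores.__getitem__)
```

Zero entries contribute nothing to the sum, so both representations give the same scores and the same predicted class. Whether the real predict path handles every sparse format correctly is a separate question that the test under test_pipeline would need to cover.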

I've created a test method under test_pipeline to test this scenario. I can submit a PR if you want to review.

My dev environment:
Package       Version
------------- -------
fasttext      0.9.2
numpy         1.21.4
pandas        1.3.4
pure-predict  0.0.4
pytest        6.2.5
scikit-learn  1.0.1
scipy         1.7.2

@cyan198 cyan198 added the enhancement New feature or request label Nov 23, 2021