A Naive Bayes classifier that takes Yelp reviews as input and determines whether each review is positive or negative.
The preprocessing step converts each sentence into a feature vector (one-hot encoding), so that each sentence is represented as a "bag of words".
- strip the punctuation
- convert the training and test sets into one-hot encodings
- output the preprocessed data into two separate files
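
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names, the example sentences, and the output format are hypothetical, and lowercasing the text and building the vocabulary from the training set only are assumptions.

```python
import string

def strip_punctuation(sentence):
    """Lowercase a sentence (an assumption) and remove ASCII punctuation."""
    return sentence.lower().translate(str.maketrans("", "", string.punctuation))

def build_vocabulary(sentences):
    """Return the sorted list of distinct words in the given sentences."""
    return sorted({w for s in sentences for w in strip_punctuation(s).split()})

def encode(sentence, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in the sentence."""
    words = set(strip_punctuation(sentence).split())
    return [1 if w in words else 0 for w in vocabulary]

def save(vectors, path):
    """Write one space-separated vector per line (hypothetical output format)."""
    with open(path, "w") as f:
        for vector in vectors:
            f.write(" ".join(map(str, vector)) + "\n")

# Made-up example sentences standing in for the Yelp data.
train = ["The food was great!", "Terrible service, bad food."]
test = ["Great service."]

# Assumption: the vocabulary is built from the training set only.
vocab = build_vocabulary(train)
train_vectors = [encode(s, vocab) for s in train]
test_vectors = [encode(s, vocab) for s in test]
```

Calling `save` once for `train_vectors` and once for `test_vectors`, with two file names of your choice, would produce the two separate output files. Words in the test set that never appear in the training set are simply ignored by `encode`.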
In the terminal, run: python bayes.py
Training accuracy: 0.9679 Test accuracy: 0.7746
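
For reference, a classifier of this kind and the accuracy it reports can be sketched as follows. This is a minimal sketch, not the contents of bayes.py: it assumes a Bernoulli Naive Bayes model with add-one (Laplace) smoothing over the one-hot vectors, and the function names and toy data are hypothetical.

```python
import math

def train_naive_bayes(vectors, labels, alpha=1.0):
    """Estimate log priors and smoothed P(word present | class) for each class."""
    n_features = len(vectors[0])
    model = {}
    for c in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == c]
        log_prior = math.log(len(rows) / len(vectors))
        # Add-alpha smoothing keeps every probability strictly between 0 and 1.
        probs = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_features)]
        model[c] = (log_prior, probs)
    return model

def predict(model, vector):
    """Return the class with the highest log posterior for a 0/1 vector."""
    def score(c):
        log_prior, probs = model[c]
        return log_prior + sum(math.log(p if x else 1 - p)
                               for x, p in zip(vector, probs))
    return max(model, key=score)

def accuracy(model, vectors, labels):
    """Fraction of vectors whose predicted label matches the true label."""
    correct = sum(predict(model, v) == y for v, y in zip(vectors, labels))
    return correct / len(labels)

# Toy data over a hypothetical vocabulary ["bad", "good", "food"];
# label 1 means positive, label 0 means negative.
X = [[0, 1, 1], [0, 1, 0], [1, 0, 1], [1, 0, 0]]
y = [1, 1, 0, 0]

model = train_naive_bayes(X, y)
print("Training accuracy:", accuracy(model, X, y))
```

Working in log space avoids numerical underflow when the vocabulary is large, since the product of many small probabilities becomes a sum of logs.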