naive-bayes-hw

Part I

Coding.

Remember the Bayes Dice app we built many weeks ago? Let's revisit that app, but with a twist.

You have 3 coins with the following probabilities. P(H|1) = 0.3, P(H|2) = 0.45, P(H|3) = 0.75.

Read P(H|1) = 0.3 as: the probability of heads, given coin 1, is 30%, and so on.

Write a small app, using object-oriented Python, that lets you randomly select a coin (without looking) and then flip it repeatedly, about 10 times or so, until you are fairly certain which coin you selected.
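One way to approach this is sketched below. It is a minimal illustration, not the contents of bayescoin.py: the class name `BayesCoin`, its method names, and the uniform prior over the three coins are all assumptions made for the example. Each flip updates the posterior with Bayes' rule.

```python
import random


class BayesCoin:
    """Tracks the posterior over which coin was drawn, given observed flips."""

    def __init__(self, p_heads, rng=None):
        # p_heads maps a coin label to P(H | coin); the values come from the assignment.
        self.p_heads = dict(p_heads)
        self.rng = rng or random.Random()
        # Assumed uniform prior: each coin is equally likely to be drawn.
        self.posterior = {c: 1 / len(self.p_heads) for c in self.p_heads}
        # The hidden coin; the player never reads this directly.
        self._coin = self.rng.choice(list(self.p_heads))

    def flip(self):
        """Flip the hidden coin and return 'H' or 'T'."""
        return "H" if self.rng.random() < self.p_heads[self._coin] else "T"

    def update(self, outcome):
        """Apply Bayes' rule for one observed flip and return the new posterior."""
        unnormalized = {}
        for coin, prior in self.posterior.items():
            p = self.p_heads[coin]
            likelihood = p if outcome == "H" else 1 - p
            unnormalized[coin] = prior * likelihood
        total = sum(unnormalized.values())
        self.posterior = {c: v / total for c, v in unnormalized.items()}
        return self.posterior

    def best_guess(self):
        """Return the coin label with the highest posterior probability."""
        return max(self.posterior, key=self.posterior.get)


if __name__ == "__main__":
    game = BayesCoin({1: 0.3, 2: 0.45, 3: 0.75})
    for _ in range(10):
        outcome = game.flip()
        game.update(outcome)
        print(outcome, {c: round(p, 3) for c, p in game.posterior.items()})
    print("Best guess:", game.best_guess())
```

Because coins 1 and 2 have similar probabilities, 10 flips may not separate them clearly; flipping until one posterior passes a chosen threshold is a reasonable alternative stopping rule.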

Part II

Questions.

In general, what makes the Naive Bayes Classifier so naive?

It is naive because it assumes all features are conditionally independent of one another, given the class.
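In symbols, the independence assumption lets the posterior for a class y factor over the individual features x_1, ..., x_n (the symbols here are introduced only for illustration):

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

Without the assumption, the model would need the joint likelihood P(x_1, ..., x_n | y), which is much harder to estimate.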

What is the difference between the Bernoulli, Gaussian and Multinomial Naive Bayes Classifiers?

Bernoulli is for binary features (0 or 1).
Multinomial is for features that are counts, such as word frequencies.
Gaussian is for continuous features, assumed to be normally distributed within each class.
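The three variants differ only in how they model the likelihood of a feature given the class. A minimal sketch of each likelihood follows; the function names are hypothetical and this is not how any particular library implements them.

```python
import math


def bernoulli_likelihood(x, p):
    """P(x | class) for a binary feature x in {0, 1}, where P(x = 1 | class) = p."""
    return p if x == 1 else 1 - p


def multinomial_likelihood(counts, probs):
    """P(counts | class) up to the multinomial coefficient.

    The coefficient is identical for every class, so it cancels
    when classes are compared and is omitted here.
    """
    return math.prod(p ** c for c, p in zip(counts, probs))


def gaussian_likelihood(x, mean, std):
    """Normal density of a continuous feature x, given the class mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


print(bernoulli_likelihood(0, 0.3))
print(multinomial_likelihood([2, 1], [0.5, 0.5]))
print(gaussian_likelihood(0.0, 0.0, 1.0))
```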

Can you use the Naive Bayes Classifier if your features are not independent?

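You can, but with caution. Independence is the classifier's core assumption, and when features are correlated the predicted probabilities become distorted, although the predicted class is often still reasonable in practice.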

Part III

Models.

Take this data. https://github.com/gSchool/dsi-logistic-regression/blob/g79/data/grad.csv

Predict whether someone will get into grad school. Use the following models.

Logistic Regression
Random Forest
Naive Bayes (you will need to figure out what type works best for this data)

Which model performed the best?
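One way to compare the three models is cross-validated accuracy, sketched below with scikit-learn. This is only an illustration: it uses synthetic data from `make_classification` as a stand-in for grad.csv, and it picks `GaussianNB` on the assumption that the features are continuous, which should be checked against the real columns.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for grad.csv (an assumption): 400 rows, 3 numeric features.
X, y = make_classification(
    n_samples=400, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gaussian Naive Bayes": GaussianNB(),
}

# Mean accuracy over 5 folds for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Using the same folds and the same metric for every model keeps the comparison fair; the ranking on the real data may differ from the ranking on this synthetic set.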

Part IV

Text Classification.

Remember this assignment.

https://github.com/data-science-ml/tweets-nlp-assignment/blob/master/nlp-assignment.md

Take the above tweets and turn them into a bag of words. Use a Naive Bayes classifier to figure out if a particular tweet is Neutral, Negative or Positive. Remember to split your data.
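To make the bag-of-words idea concrete, here is a small from-scratch sketch of a multinomial Naive Bayes classifier with add-one smoothing. The class name `TinyMultinomialNB` and the six example tweets are invented for illustration; they are not taken from tweets.csv, and the sketch skips the train/test split that the real assignment requires.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count the tokens."""
    return Counter(text.lower().split())


class TinyMultinomialNB:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Score = log P(class) + sum over words of count * log P(word | class).
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word, count in bag_of_words(text).items():
                if word in self.vocab:  # ignore words never seen in training
                    p_word = (self.word_counts[label][word] + 1) / denom
                    score += count * math.log(p_word)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training tweets, invented for illustration only.
train_texts = [
    "love this flight great crew",
    "great service love it",
    "terrible delay awful service",
    "awful flight lost my bag",
    "flight departs at noon",
    "my bag is at gate five",
]
train_labels = ["Positive", "Positive", "Negative", "Negative", "Neutral", "Neutral"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
print(model.predict("love the great crew"))
print(model.predict("awful terrible delay"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.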

What is the accuracy of your model?

Compare this model to a KNN model (3 neighbors) in which each tweet is represented as a 300-dimensional vector.
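For reference, the KNN decision rule itself is short. The sketch below uses hypothetical 2-dimensional toy vectors in place of the 300-dimensional tweet vectors, and the function name `knn_predict` is an assumption made for the example; how the tweet vectors are produced is left open here.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k nearest training vectors.

    Distance is Euclidean; the vectors may have any (matching) dimension.
    """
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy 2-D vectors standing in for 300-D tweet vectors (illustration only).
train_vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["Negative"] * 3 + ["Positive"] * 3

print(knn_predict(train_vectors, train_labels, (0.2, 0.1)))
print(knn_predict(train_vectors, train_labels, (5.5, 5.5)))
```

With k = 3 and three classes, a three-way tie is possible; this sketch resolves it in favor of the label of the nearest neighbor among the tied ones.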

Which model performs better?
