phamthuha0970/Text-Mining-with-TF-IDF-and-Cosine-Similarity

 
 


Text Mining with TF-IDF & Cosine Similarity

A simple Python repository for perceptron-based text mining. It covers linguistic preprocessing of the dataset for text classification and retrieval of texts similar to a given query.
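As a rough illustration of the two techniques named in the title, the sketch below builds TF-IDF vectors and ranks documents by cosine similarity to a query. It is not the repository's notebook code: the function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine_similarity`, `most_similar`), the whitespace tokenizer, the smoothed IDF formula, and the toy documents are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_idf(tokenized_docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms missing from idf are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, docs, top_k=2):
    """Return the top_k (score, document) pairs ranked by cosine similarity."""
    tokenized = [tokenize(d) for d in docs]
    idf = build_idf(tokenized)
    doc_vectors = [tfidf_vector(t, idf) for t in tokenized]
    query_vector = tfidf_vector(tokenize(query), idf)
    scored = [(cosine_similarity(query_vector, v), d) for v, d in zip(doc_vectors, docs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock markets fell sharply today",
]
results = most_similar("cat on a mat", docs)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

Query terms that never occur in the corpus (here, "a") are dropped, so they do not affect the ranking. A real pipeline would likely replace the whitespace tokenizer with language-aware preprocessing.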

Outcomes

  1. Confusion matrix for perceptron training.

  2. Confusion matrix for perceptron training with L2 regularization.

  3. Top-5 terms with the largest weights (importance):

    term           weight
    sträuchern     50.0088
    addon          45.3869
    runtergestuft  44.283
    nachgereicht   42.9979
    sensation      40.7419

    See results.csv for the complete term weights.
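The outcomes above (a confusion matrix with and without L2 regularization, and a ranking of terms by weight) can be sketched with a small binary perceptron over sparse term features. This is a minimal sketch under stated assumptions, not the notebook's implementation: the helper names, the +1/-1 label encoding, the weight-decay form of the L2 penalty, and the toy samples are all hypothetical.

```python
def train_perceptron(samples, labels, epochs=10, lr=1.0, l2=0.0):
    """Train a binary perceptron on sparse feature dicts; labels are +1 / -1.

    With l2 > 0, every weight is shrunk by a factor (1 - lr * l2) once per
    sample (weight decay), one common way to apply an L2 penalty.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            if l2 > 0.0:
                for term in weights:
                    weights[term] *= 1.0 - lr * l2
            score = bias + sum(weights.get(t, 0.0) * v for t, v in features.items())
            if label * score <= 0:  # misclassified or on the boundary: update
                for term, value in features.items():
                    weights[term] = weights.get(term, 0.0) + lr * label * value
                bias += lr * label
    return weights, bias


def predict(features, weights, bias):
    """Return +1 if the linear score is positive, otherwise -1."""
    score = bias + sum(weights.get(t, 0.0) * v for t, v in features.items())
    return 1 if score > 0 else -1


def confusion_matrix(true_labels, predicted_labels):
    """Return counts keyed as (true, predicted) for the labels +1 and -1."""
    matrix = {(t, p): 0 for t in (1, -1) for p in (1, -1)}
    for t, p in zip(true_labels, predicted_labels):
        matrix[(t, p)] += 1
    return matrix


def top_terms(weights, k=5):
    """Terms with the largest weights, highest first."""
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)[:k]


# Toy term-weight features (e.g. TF-IDF values), invented for this example.
samples = [
    {"great": 1.0, "film": 1.0},
    {"great": 1.0, "acting": 1.0},
    {"boring": 1.0, "film": 1.0},
    {"boring": 1.0, "plot": 1.0},
]
labels = [1, 1, -1, -1]

weights, bias = train_perceptron(samples, labels)
reg_weights, reg_bias = train_perceptron(samples, labels, l2=0.01)

matrix = confusion_matrix(labels, [predict(s, weights, bias) for s in samples])
reg_matrix = confusion_matrix(labels, [predict(s, reg_weights, reg_bias) for s in samples])

print(matrix)
print(top_terms(weights, k=3))
```

The decay step pulls all weights toward zero, so the regularized model ends up with smaller weights on the same data. Here the matrices are computed on the training samples only; evaluating on held-out data would be the more informative choice.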

Dependencies

Install dependencies using:

pip3 install -r requirements.txt 

Contact
