A topic classifier




Approach:

  • mined 337 topics
  • for each topic, mined 100 documents
  • cataloged each document by counting its words
  • kept a running total of all words across documents
  • kept an individual log of per-document word counts
  • calculated probabilities in a Markov fashion
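
The counting steps above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the function names (`tokenize`, `train`, `classify`), the whitespace tokenizer, and the input shape are all hypothetical. The README does not spell out how the "Markov" probabilities are computed, so the sketch substitutes a simple unigram model with add-one smoothing (a naive-Bayes-style score), which is an assumption.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the real project may differ.
    return text.lower().split()


def train(docs_by_topic):
    """Count words. docs_by_topic maps a topic name to a list of document strings."""
    total_counts = Counter()             # running total of all words
    doc_counts = defaultdict(list)       # per-document word counts, grouped by topic
    topic_counts = defaultdict(Counter)  # word counts aggregated per topic
    for topic, docs in docs_by_topic.items():
        for doc in docs:
            counts = Counter(tokenize(doc))
            doc_counts[topic].append(counts)
            topic_counts[topic].update(counts)
            total_counts.update(counts)
    return total_counts, doc_counts, topic_counts


def classify(text, total_counts, topic_counts):
    """Return the topic with the highest smoothed log-probability for text."""
    vocab_size = len(total_counts)
    words = tokenize(text)
    best_topic, best_score = None, float("-inf")
    for topic, counts in topic_counts.items():
        topic_total = sum(counts.values())
        score = 0.0
        for word in words:
            # Add-one smoothing keeps unseen words from zeroing out a topic.
            score += math.log((counts[word] + 1) / (topic_total + vocab_size))
        if score > best_score:
            best_topic, best_score = topic, score
    return best_topic


# Tiny made-up corpus, for illustration only.
corpus = {
    "sports": ["the team won the match", "a goal in the final match"],
    "cooking": ["bake the bread in the oven", "stir the soup and add salt"],
}
total, per_doc, per_topic = train(corpus)
print(classify("the team scored a goal", total, per_topic))
```

Working in log space avoids numeric underflow when documents are long. A model that conditions each word on the previous one (closer to a true Markov chain) would need bigram counts instead of the unigram counts used here.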

Current Results:

  • currently running tests (ostensibly working)

Usage:

  • the main use case is disambiguating text computationally at large scale
  • ideally, this module will serve as a hook for larger-scale purposes