Skip to content

Latest commit

 

History

History
25 lines (20 loc) · 1.18 KB

README.md

File metadata and controls

25 lines (20 loc) · 1.18 KB

Aspect Based Sentimental Analysis Codemix.

Data:

  • We have collected twitter codemix data. After preprocessing the data, we are left with 3500 tweets with Aspects annotated.

Progress:

  • We have divided our project into two parts - Aspect Identification and Sentimental Analysis.
  • Working on Aspect Identification -- Ongoing work.

Contents:

Parts of speech tagger for codemix:

  • Parts of speech plays an important role in Aspect Identication.
  • We have 1000 codemix tweets with annoatated parts-of-speech. We used this data for training.
  • We have trained the data using different Machine learning methods and achieved the following accuracies:
    • Conditional Random Fields(CRF): 77.185%
    • Trigram + Bigram + Unigram: 68.172%
    • Hidden Markov Method(HMM): 18.014%

Aspect Identification:

  • Till now, we applied CRF and SVM methods for identifying Aspects in a sentence.
  • CRF: We have used nearly 65 features(considering window length of 5) and achieved an accuracy of 44.73%.
  • SVM: We represented each word in the form of a vector and trained these vectors.

Contributors:

M R Abhishek and K Vagdevi