Hate Speech Classification

This is my attempt to replicate the paper "Hate Speech Dataset from a White Supremacy Forum" released in 2018 https://www.aclweb.org/anthology/W18-5102/

Experimental setting as described in the paper:

The experiments are based on a balanced subset of labelled sentences. All the sentences labelled as HATE have been collected, and an equivalent number of NOHATE sentences have been randomly sampled, summing up 2k labelled sentences. From this amount, the 80% has been used for training and the remaining 20% for testing.

The evaluated algorithms are the following:

Support Vector Machines (SVM) (Hearst et al., 1998) over Bag-of-Words vectors. Word-count-based vectors have been computed and fed into a Python Scikit-learn LinearSVM11 classifier to separate HATE and NOHATE instances.
Convolutional Neural Networks (CNN), as described in (Kim, 2014). The implementation is a simplified version using a single input channel of randomly initialized word embeddings12.
Recurrent Neural Networks with Long Shortterm Memories (LSTM) (Hochreiter and Schmidhuber, 1997). A LSTM layer of size 128 over word embeddings of size 300.

Result Comparison

Results excluding relation label

                             Results from Paper       ||    My Implementation        |
Model                 |   Hate    |  NoHate  |  All   ||  Hate  |  NoHate |  All     |
Logistic Regression   |           |          |        ||  x.xx  |  x.xx   |  x.xx    |
SVM                   |   0.72    |   0.76   |  0.74  ||  x.xx  |  x.xx   |  0.6875  |
CNN                   |   0.54    |   0.86   |  0.70  ||  x.xx  |  x.xx   |  0.72    |
LSTM                  |   0.76    |   0.80   |  0.78  ||  0.71  |  0.68   |  0.70    |

Results including relation label

                             Results from Paper       ||    My Implementation        |
Model                 |   Hate    |  NoHate  |  All   ||  Hate  |  NoHate |  All     |
Logistic Regression   |           |          |        ||  x.xx  |  x.xx   |  x.xx    |
SVM                   |   0.69    |   0.73   |  0.71  ||  x.xx  |  x.xx   |  0.7025  |
CNN                   |   0.55    |   0.79   |  0.66  ||  x.xx  |  x.xx   |  0.67    |
LSTM                  |   0.71    |   0.75   |  0.73  ||  x.xx  |  x.xx   |  x.xx    |

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
Dump		Dump
Hate Speech - Linear SVM.ipynb		Hate Speech - Linear SVM.ipynb
Hate Speech - Linear SVM_all.ipynb		Hate Speech - Linear SVM_all.ipynb
Hate_Speech - CNN.ipynb		Hate_Speech - CNN.ipynb
Hate_Speech-Logistic_Regression.ipynb		Hate_Speech-Logistic_Regression.ipynb
Hate_Speech_Classification.pdf		Hate_Speech_Classification.pdf
README.md		README.md
nepali_ner_bilstm_cnn-batch-1-Copy1.ipynb		nepali_ner_bilstm_cnn-batch-1-Copy1.ipynb
raw.csv		raw.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Hate Speech Classification

Experimental setting as described in the paper:

Result Comparison

About

Releases

Packages

Languages

Sandesh10/Hate-Speech-Classification

Folders and files

Latest commit

History

Repository files navigation

Hate Speech Classification

Experimental setting as described in the paper:

Result Comparison

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages