To avoid environment issues, you can create a new conda environment using the following command:
`conda create --name <env> --file requirements.txt`
This project is divided into two main segments:

- Hate Speech Detection - under the `detection` module
- Hate Networks - under the `hate_networks` module
The `detection` module is the main module of the thesis. It contains both the post-level models (PLMs) and the user-level models (ULMs) used to detect hate speech.
The execution of this module is configured in the file `config.detection_config.py`, where you can control which dataset is used for a specific execution.
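For illustration, the dataset selection might look like the sketch below; the key names and the dataset identifier are assumptions and may differ from the actual `config.detection_config.py`:

```python
# Illustrative sketch of config.detection_config.py; the keys and the dataset
# identifier shown here are hypothetical and may not match the actual file.
post_level_execution_config = {
    "data": {
        "dataset": "my_dataset",  # hypothetical: which dataset this execution runs on
    },
    # ... other execution parameters (model, paths, etc.) ...
}
```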
There are four entry points for this module. To execute each of the following experiments, run from the root dir of the project:
`python detection/experiments/{experiment_file_name}.py`
The entry points reside under the `experiments` directory in the following files:

- Post-level experiments
  - `post_level__experiment.py` - run this file to execute a specific experiment of a model on a specific dataset, as configured in `config.detection_config.py`. To run this file, set the parameter `multiple_experiments` to `False` under `post_level_execution_config` in the `config.detection_config.py` file (see the sketch after this list).
  - `post_level__multiple_experiments.py` - run this file to execute multiple experiments of several models and compare them. To run this file, set the parameter `multiple_experiments` to `True` under `post_level_execution_config` in the `config.detection_config.py` file.
  - Important note: to run the BertFineTuning model, make sure you are using a GPU.
- User-level experiments
  - `user_level__experiments.py` - run this file to execute the user-level experiment. It loads the PLM from the `post_level_execution_config` config and predicts, for every post by every user in the given data, the probability that it contains hate speech. It then runs the FFNN on the data streams of the user, their followees, and their followers, together with network features.
    - Important note: one must run the hate networks module with the desired dataset before running this code, as it uses some of its output.
  - `user_level__threshold_models.py` - run this file to execute the threshold-based ULMs.
    - Important note: one must run the experiment file `user_level__experiments.py` with the desired dataset before running this code, as it uses some of its output.
- The outputs of the models' executions will be saved under `detection/outputs/{data_name}_{model_name}`.
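As referenced in the post-level item above, both post-level entry points read the same flag. A minimal sketch of that flag follows, assuming `post_level_execution_config` is a plain dict; only `post_level_execution_config` and `multiple_experiments` are names taken from this README:

```python
# config.detection_config.py -- sketch of the single vs. multiple experiments toggle.
# Only post_level_execution_config and multiple_experiments are documented names;
# everything else here is illustrative.
post_level_execution_config = {
    # ... dataset, model, and path settings ...
    "multiple_experiments": False,  # False -> run post_level__experiment.py
                                    # True  -> run post_level__multiple_experiments.py
}
```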
In the `hate_networks` module we create hate-promoting networks using data from social networks that captures engagement between users, i.e., mentions and retweets.
Using this module you can create an unsupervised segmentation of users into various communities, based on the set configuration.
The unsupervised methods are topic models (LDA/NMF) and Word2Vec modeling based on the users' texts.
Using these methods you are able to color the users in the reconstructed user network.
You can also color the users using the predictions created by the user classifier from the `detection` module.
This is done by setting the parameter `plot_supervised_networks` to `True` in `general_conf` under the `hate_networks_config.py` file, and by using the `user_pred` parameter under `path_conf` for each dataset.
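A rough sketch of how these settings might look in `hate_networks_config.py`; the names `general_conf`, `path_conf`, `plot_supervised_networks`, and `user_pred` come from this README, while the dict layout, the dataset key, and the file path are assumptions:

```python
# hate_networks_config.py -- illustrative sketch only; the layout, dataset key,
# and prediction-file path below are hypothetical.
general_conf = {
    "plot_supervised_networks": True,  # color users by the detection module's user classifier
    # ... other general settings ...
}

path_conf = {
    "my_dataset": {  # hypothetical dataset key
        "user_pred": "path/to/user_predictions.csv",  # hypothetical path to the user-level predictions
        # ... other per-dataset paths ...
    },
}
```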
Important notes:
- The entry point for this module is the function `main()` under the file `main.py`. To execute the code from the root dir of the project run: `python hate_networks/main.py`
- The execution of this code is configured using the configuration dict under the `config` module, in the file `hate_networks_config.py`.
- The outputs of the constructed network and the relevant files that are created with it will be saved under `hate_networks/outputs/{data_name}_networks`.
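Since the entry point is the `main()` function in `hate_networks/main.py`, it can presumably also be called from Python; the sketch below assumes `main()` takes no arguments, which this README does not confirm:

```python
# Programmatic invocation sketch; assumes hate_networks.main.main() takes no arguments.
from hate_networks.main import main

if __name__ == "__main__":
    main()  # builds the networks according to hate_networks_config.py
```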