Bannatore is a Telegram bot that automatically moderates Telegram group chats using an LLM. It was developed for the "Technologies for Advanced Programming" course exam, 2023/24 edition, of the CS degree at the University of Catania.
Modern Telegram group chats are often very crowded but lack moderation. Human moderators are expensive and often work under poor conditions (long hours, including nights, for low pay). This bot aims to offer an automatic, AI-powered alternative to human moderators, at lower cost and with better performance.
- python-telegram-bot: A Python library that provides an interface for the Telegram Bot API.
- Fluentd: A data ingestion tool that collects logs from different sources and routes them to different destinations.
- Kafka: Distributed event streaming platform capable of handling a high number of real-time events.
- Apache Spark Structured Streaming: Stream processing engine that processes data coming from Kafka using the Spark DataFrame APIs.
- Zookeeper: Used as a centralized service for managing and coordinating nodes within the Kafka cluster.
- Apache Spark: Open-source unified, distributed analytics engine designed for large-scale data processing.
- transformers: ML library provided by Hugging Face for training and using pretrained transformer models.
- Elasticsearch: Distributed, RESTful search and analytics engine; provides efficient data indexing and storing.
- Kibana: Used to create interactive, complex dashboards backed by an Elasticsearch index.
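The way these components hand a message to one another can be sketched in plain Python, with every stage stubbed out. This is only an illustration of the data flow, not the project's code: the field names (`chat_id`, `user_id`, `text`, `label`), the functions `make_event` and `enrich`, and the `toy_classifier` stand-in for the transformers model are all hypothetical.

```python
import json
from typing import Callable, Dict


def make_event(chat_id: int, user_id: int, text: str) -> str:
    """Serialize one chat message as a JSON line.

    Stands in for what the bot could hand to Fluentd, which would then
    forward it to a Kafka topic. The schema is an assumption.
    """
    return json.dumps({"chat_id": chat_id, "user_id": user_id, "text": text})


def enrich(event_json: str, classify: Callable[[str], str]) -> Dict[str, object]:
    """Stand-in for the Spark stage: parse an event and attach a label.

    The returned dict represents the document that would be indexed
    into Elasticsearch and visualized in Kibana.
    """
    event = json.loads(event_json)
    event["label"] = classify(event["text"])
    return event


def toy_classifier(text: str) -> str:
    """Placeholder for the real model; it only flags one hard-coded word."""
    return "offensive" if "idiot" in text.lower() else "neutral"


if __name__ == "__main__":
    doc = enrich(make_event(1, 42, "You are an idiot"), toy_classifier)
    print(doc)
```

Passing the classifier in as a parameter keeps the sketch independent of any specific model, so the placeholder can be swapped for a real inference call without touching the rest of the flow.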
- Apache Kafka: download it from this link and put it in the kafka/setup directory.
- Apache Spark: download it from this link and put it in the spark/setup directory.
- Model tapmodelv2_bert: download it from this link and extract it in the spark directory.
- Follow this link, select your OS and follow the instructions to install Docker. If you are on Windows or macOS, follow this link to install Docker Desktop.
- Run `sudo apt-get update` and `sudo apt-get install docker-compose-plugin` to install Docker Compose (it is also included in Docker Desktop).
- Run `docker network create --subnet=10.0.100.0/24 tap` to create the containers' virtual network.
- Run `docker compose up -d --build` and, once the build has completed, wait about 10-15 seconds for the containers to finish their internal configuration.
Kibana visualizations are available at http://10.0.100.27:5601/.
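Instead of waiting a fixed number of seconds after startup, the Kibana address can be polled until it answers. The following is a minimal sketch using only the Python standard library; the helper names `is_up` and `wait_until`, the number of attempts and the delay are assumptions, not part of the project.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

KIBANA_URL = "http://10.0.100.27:5601/"  # address given in this README


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True only if the URL answers with a successful HTTP response.

    An HTTP error status is raised by urlopen as HTTPError (a subclass of
    URLError), so a server that is still starting counts as not ready.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False


def wait_until(probe: Callable[[], bool], attempts: int = 10, delay: float = 3.0) -> bool:
    """Call probe up to `attempts` times, sleeping `delay` seconds after each failure."""
    for i in range(attempts):
        if probe():
            return True
        if i < attempts - 1:
            time.sleep(delay)
    return False


if __name__ == "__main__":
    ok = wait_until(lambda: is_up(KIBANA_URL))
    print("Kibana is reachable" if ok else "Kibana did not answer in time")
```

Keeping the probe separate from the retry loop means the same `wait_until` helper could be reused for other containers on the `tap` network.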
- https://github.com/jing-qian/A-Benchmark-Dataset-for-Learning-to-Intervene-in-Online-Hate-Speech/tree/master
- https://github.com/t-davidson/hate-speech-and-offensive-language/tree/master/data
- https://www.kaggle.com/datasets/arkhoshghalb/twitter-sentiment-analysis-hatred-speech?select=train.csv
- https://github.com/hate-alert/HateXplain/tree/master/Data