Arabic Dialect Classification

Introduction

This project classifies Arabic dialects from tweet text. The Machine Learning approach combines a TF-IDF vectorizer with Logistic Regression and other models, together with preprocessing techniques for Arabic-language tweets.
The Deep Learning approach uses AraBERT version 2, and its results are compared with the Machine Learning approach using the confusion matrix and F1-score.
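
The Machine Learning approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in the notebooks: the `normalize` helper, the character n-gram settings, the macro averaging, and the toy sentences are all assumptions made for the example.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.pipeline import make_pipeline

# Arabic diacritics (tashkeel, U+064B..U+0652) and the tatweel (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# URLs and @mentions, which are common in tweets.
_NOISE = re.compile(r"https?://\S+|@\w+")


def normalize(text: str) -> str:
    """Illustrative cleanup only; the repo's own preprocessing may differ."""
    text = _NOISE.sub(" ", text)
    text = _DIACRITICS.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


def build_model():
    """TF-IDF features feeding a Logistic Regression classifier."""
    # Character n-grams are an assumption of this sketch, not a repo setting.
    return make_pipeline(
        TfidfVectorizer(preprocessor=normalize, analyzer="char_wb", ngram_range=(2, 4)),
        LogisticRegression(max_iter=1000),
    )


def evaluate(model, texts, labels):
    """Return the confusion matrix and the macro-averaged F1-score."""
    predicted = model.predict(texts)
    return confusion_matrix(labels, predicted), f1_score(labels, predicted, average="macro")


if __name__ == "__main__":
    # Tiny made-up sample, only to show the calls; real training uses the tweet dataset.
    texts = [
        "ازيك عامل ايه النهارده",
        "انا مش عايز اروح دلوقتي",
        "كيفك شو عم تعمل اليوم",
        "ما بدي روح هلق",
    ]
    labels = ["EG", "EG", "LB", "LB"]
    model = build_model().fit(texts, labels)
    matrix, macro_f1 = evaluate(model, texts, labels)
    print(matrix)
    print(f"macro F1: {macro_f1:.2f}")
```

Scoring on the training sentences, as done here, only demonstrates the API; a held-out split is needed for meaningful numbers.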

For more information about the repo, see the PDF slides, and check the Models directory for results.

Requirements

PyTorch
Pandas
matplotlib
scikit-learn
transformers
pyarabic
emoji
nltk
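
A quick way to confirm that the packages listed above are available in the active environment is a small check like the one below. It is only a convenience sketch and not part of the repo; the import names are the usual ones for these packages (PyTorch imports as `torch`, scikit-learn as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the packages listed above; these can differ from the
# names used when installing (PyTorch -> torch, scikit-learn -> sklearn).
REQUIRED_MODULES = [
    "torch",
    "pandas",
    "matplotlib",
    "sklearn",
    "transformers",
    "pyarabic",
    "emoji",
    "nltk",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the modules that cannot be found in the active environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```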

Usage

Make sure you activate the environment where all the packages are installed.
Run ModelTraining_ML.ipynb and ModelPrediction-AraBert.ipynb to generate the model pickle files.
Then copy (or move) the saved pickle files into the static folder used by the FastAPI server: one subfolder for the ML models and another for the AraBERT model.

static
├───ML_models
└───output_dir
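
The copy step can also be scripted. The sketch below assumes the saved models end in `.pkl`; the helper name, the pattern, and the source folder in the usage note are assumptions for illustration, so adjust them to wherever the notebooks actually write their files.

```python
import shutil
from pathlib import Path


def copy_artifacts(source_dir, target_dir, pattern="*.pkl"):
    """Copy files matching `pattern` from source_dir into target_dir.

    The target folder is created if it does not exist. Returns the names
    of the copied files in sorted order.
    """
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.glob(pattern)):
        shutil.copy2(path, target / path.name)
        copied.append(path.name)
    return copied
```

For example, `copy_artifacts("path/to/saved/models", "static/ML_models")` would fill the ML_models folder; the AraBERT output can be placed under static/output_dir in the same way, with a pattern that matches its files.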

Run python main.py
Once the server is running, open localhost:5000/docs in a browser and try out the different POST methods with different text.
Enjoy your server app.
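
The POST requests can also be sent from a script instead of the docs page. In the sketch below the endpoint path `/predict` and the JSON field `text` are hypothetical placeholders, since the actual routes and request schema are defined in main.py; check localhost:5000/docs for the real names before using it.

```python
import json
from urllib import request

# Hypothetical endpoint: replace the path with a route shown on /docs.
DEFAULT_URL = "http://localhost:5000/predict"


def build_post(text, url=DEFAULT_URL):
    """Build a JSON POST request; the "text" field name is an assumption."""
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(text, url=DEFAULT_URL):
    """Send the request to a running server and return the decoded JSON reply."""
    with request.urlopen(build_post(text, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

`classify` only works while the server started with python main.py is running; `build_post` can be inspected without a server.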
