This project builds a machine learning model that classifies emails as spam or not spam. The model is trained on a dataset from Kaggle and is implemented in a Jupyter Notebook using Logistic Regression. A Flask API is also provided for sending emails to the trained model for classification. The API can be accessed locally by running the Flask server, or through the hosted version at https://bilalm14.pythonanywhere.com/.
You can explore the frontend application and test its functionality by visiting the hosted site here. The source code for the frontend is available in a separate repository, which can be found here.
The dataset used for training the model is sourced from Kaggle and can be found in datasets/emails.csv. The model, trained on this dataset, achieves an accuracy of 98.34% on the test data.
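As a rough illustration of how a test accuracy like the one above is typically measured, the sketch below trains a TF-IDF plus Logistic Regression model on a tiny in-memory sample and scores it on a held-out split. The sample messages, the choice of TfidfVectorizer as the feature extractor, the split ratio, and all variable names other than input_mail are assumptions made for illustration; the notebook's actual preprocessing and the column layout of datasets/emails.csv may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy stand-in data (hypothetical); the real project loads datasets/emails.csv.
messages = [
    "win a free prize now",
    "claim your free cash reward",
    "free lottery winner click here",
    "urgent offer click to win money",
    "meeting moved to monday morning",
    "please review the attached report",
    "lunch tomorrow with the team",
    "notes from the project call",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = not spam

# Hold out part of the data so accuracy is measured on unseen messages.
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, stratify=labels, random_state=0
)

# Fit the feature extractor on the training split only, then reuse it for the test split.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X_train_features = vectorizer.fit_transform(X_train)
X_test_features = vectorizer.transform(X_test)

model = LogisticRegression()
model.fit(X_train_features, y_train)

test_accuracy = accuracy_score(y_test, model.predict(X_test_features))
print(f"Test accuracy: {test_accuracy:.2%}")

# Classify a new message with the same fitted vectorizer and model.
input_mail = ["click here to win free prize"]
prediction = model.predict(vectorizer.transform(input_mail))[0]
print("spam" if prediction == 1 else "not spam")
```

With only eight toy messages the printed accuracy is not meaningful; the point is the order of operations: split, fit the extractor on the training data, train, then score on the held-out data.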
- Python: The core programming language used to develop the machine learning model and the Flask API.
- Jupyter Notebook: Used for implementing and testing the machine learning model.
- Flask: A lightweight WSGI web application framework used to create the API.
- scikit-learn: The machine learning library used to build and train the logistic regression model.
- matplotlib: A plotting library used for visualizing data and model performance.
- pandas: A data manipulation and analysis library used for handling datasets, including loading, cleaning, and preprocessing data.
- numpy: A library for numerical computations, used for handling arrays and performing mathematical operations.
- EmailSpamClassifier.ipynb: Jupyter Notebook containing the implementation of the spam classifier model.
- models/: Directory where the trained model and feature extractor are saved.
- datasets/: Directory where the datasets are stored.
- app.py: Flask API for interfacing with the trained model.
- requirements.txt: Python package dependencies.
- Access the API at https://bilalm14.pythonanywhere.com/.
- API Endpoints:
  - GET /predict: Classify an email as spam or not spam.
    - Request Body:
      { "message": "Your email content here." }
    - Response:
      { "message": "Your email content here.", "prediction": "spam or not spam" }
- Example Request:
  - Request: https://bilalm14.pythonanywhere.com/predict?message=click%20here%20to%20win%20free%20prize
    { "message": "click here to win free prize" }
  - Response:
    { "message": "click here to win free prize", "prediction": "spam" }
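The example request above can also be issued from Python. The following is a minimal client sketch using only the standard library; the helper names build_predict_url, parse_prediction, and classify are hypothetical and not part of this repository, and the sketch assumes the message is passed as the message query parameter, as in the example URL.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://bilalm14.pythonanywhere.com"  # hosted API root from this README


def build_predict_url(message: str, base_url: str = BASE_URL) -> str:
    """Return the /predict URL with the message percent-encoded in the query string."""
    # quote_via=quote encodes spaces as %20, matching the example request.
    query = urlencode({"message": message}, quote_via=quote)
    return f"{base_url.rstrip('/')}/predict?{query}"


def parse_prediction(body: str) -> str:
    """Extract the "prediction" field from a JSON response body."""
    return json.loads(body)["prediction"]


def classify(message: str, base_url: str = BASE_URL, timeout: float = 10.0) -> str:
    """Send the GET request and return the predicted label (requires network access)."""
    with urlopen(build_predict_url(message, base_url), timeout=timeout) as response:
        return parse_prediction(response.read().decode("utf-8"))
```

Calling classify("click here to win free prize") sends the same request as the example, provided the hosted service is reachable. To target a locally running server, pass its URL root as base_url.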
- Clone the repository.
git clone https://github.com/BilalM04/email-spam-classifier.git
- Navigate to the project directory.
cd email-spam-classifier
- Ensure you have Python and Jupyter Notebook installed.
- Install project dependencies.
pip install -r requirements.txt
- Launch the Jupyter Notebook server by running the following command.
jupyter notebook
- Open EmailSpamClassifier.ipynb in the Jupyter Notebook server.
- Edit input_mail to test your own input.
input_mail = [""]
- Execute the code to see the result.
- Start the Flask server.
python app.py
- Use the same endpoints as described in the API Interface section. For local use, the URL root is http://127.0.0.1:####/, where #### is the port number.
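For readers curious how a GET /predict endpoint of this shape could be served, here is a minimal Flask sketch. It is not the project's app.py: the classify_message function is a hypothetical keyword rule standing in for the trained model and feature extractor, and the endpoint is exercised in-process with Flask's test client rather than on a real port.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify_message(message: str) -> str:
    """Placeholder classifier (hypothetical); the real API uses the trained model."""
    spam_words = {"free", "prize", "win"}
    return "spam" if spam_words & set(message.lower().split()) else "not spam"


@app.route("/predict", methods=["GET"])
def predict():
    # Read the email text from the ?message=... query parameter.
    message = request.args.get("message", "")
    return jsonify({"message": message, "prediction": classify_message(message)})


# Exercise the endpoint in-process, without starting a server.
with app.test_client() as client:
    response = client.get(
        "/predict", query_string={"message": "click here to win free prize"}
    )
    result = response.get_json()
    print(result)
```

When the app is started as a server instead, the port in the local URL root is whichever one the server reports on startup.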
The frontend for this project is a web application built using React.js and styled with CSS. It allows users to input email messages and receive a classification of whether the email is spam or not. The frontend communicates with this backend API to utilize the machine learning model for classification. You can explore the frontend application and test its functionality by visiting the hosted site here. The source code for the frontend is available in a separate repository, which can be found here.