This project performs sentiment analysis on IMDB movie reviews using Natural Language Processing (NLP) techniques and machine learning. It also includes a user-friendly GUI application built with Tkinter to analyze movie reviews interactively.
Each review is classified as Positive or Negative. The project includes:
- A Python script for training and testing the models.
- A GUI application for interactive sentiment analysis.
- Cleans and preprocesses text data using:
  - Stopword removal
  - Stemming with the PorterStemmer
- Visualizes dataset distributions with Matplotlib.
- Trains and evaluates multiple Naive Bayes classifiers:
  - MultinomialNB
  - GaussianNB
  - BernoulliNB
- GUI application built with Tkinter to allow real-time analysis of reviews.
- Models and vectorizers are serialized with Pickle for reusability.
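
The pipeline described above (cleaning, vectorizing, training the three Naive Bayes classifiers, and pickling) might look roughly like the following. This is a minimal sketch, not the project's actual script: the `clean_review` helper, the short inline stopword set (a stand-in for NLTK's stopwords corpus), the choice of `CountVectorizer`, and the toy reviews are all assumptions made for illustration.

```python
import pickle
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

# Tiny stand-in stopword set (assumption). NLTK's stopwords corpus would
# normally be used, but it requires nltk.download("stopwords") first.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "this", "i", "it", "of"}
stemmer = PorterStemmer()


def clean_review(text):
    """Lowercase, keep letters only, drop stopwords, stem what remains."""
    words = re.sub(r"[^a-z]+", " ", text.lower()).split()
    return " ".join(stemmer.stem(w) for w in words if w not in STOPWORDS)


# Toy data for illustration only, not taken from the IMDB dataset.
reviews = [
    "This is the best movie I have ever seen",
    "A wonderful and moving film",
    "The movie was terrible and boring",
    "An awful waste of time",
]
labels = ["positive", "positive", "negative", "negative"]

# Bag-of-words features; GaussianNB needs a dense array, hence .toarray().
vectorizer = CountVectorizer()
X = vectorizer.fit_transform([clean_review(r) for r in reviews]).toarray()

models = {
    "MultinomialNB": MultinomialNB(),
    "GaussianNB": GaussianNB(),
    "BernoulliNB": BernoulliNB(),
}
for model in models.values():
    model.fit(X, labels)

# Serialize the vectorizer together with one model, then restore both.
blob = pickle.dumps((vectorizer, models["BernoulliNB"]))
restored_vectorizer, restored_model = pickle.loads(blob)

new_features = restored_vectorizer.transform(
    [clean_review("What a wonderful film")]
).toarray()
prediction = restored_model.predict(new_features)[0]
```

Pickling the vectorizer alongside the model matters because a new review has to be mapped onto the same vocabulary the model was trained with.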
- Programming Language: Python
- Libraries:
  - Pandas
  - NumPy
  - NLTK
  - Scikit-learn
  - Matplotlib
  - Pickle
  - Tkinter
The project uses the IMDB Dataset, which contains 50,000 labeled movie reviews:
- Columns:
  - review: The text of the movie review.
  - sentiment: The sentiment of the review (Positive/Negative).
- Positive Reviews: 25,000
- Negative Reviews: 25,000
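
A quick way to confirm the column layout and class balance is sketched below with pandas. The rows here are invented and read from an in-memory string so the snippet runs without the dataset; with the real file, `pd.read_csv("IMDB Dataset.csv")` would take the place of the stand-in. The lowercase label values are an assumption about how the CSV encodes sentiment.

```python
import io

import pandas as pd

# Stand-in for pd.read_csv("IMDB Dataset.csv"); these rows are made up.
sample_csv = io.StringIO(
    "review,sentiment\n"
    '"A moving, well acted film.",positive\n'
    '"Dull and far too long.",negative\n'
    '"I would watch it again.",positive\n'
    '"The plot made no sense.",negative\n'
)
df = pd.read_csv(sample_csv)

# Count reviews per class and check that every class has the same count.
counts = df["sentiment"].value_counts()
is_balanced = counts.nunique() == 1
print(counts)
```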
Follow these steps to set up the project locally:
- Clone the repository:
git clone https://github.com/VedantSinghPundir/Movie-Sentimental-Analysis.git
- Navigate to the project directory:
cd Movie-Sentimental-Analysis
- Install the required dependencies:
pip install -r requirements.txt
- Download the IMDB dataset and save it in the project directory as IMDB Dataset.csv.
- Run the script to preprocess the data and train models:
python movie_sentiment_analysis.py
- Use the saved model to classify new reviews:
    from test_model import test_model

    # Example usage
    print(test_model("This is the best movie I have ever seen!"))  # Output: Positive review
    print(test_model("The movie was terrible and boring."))  # Output: Negative review
- Launch the GUI application:
python sentiment_analysis_app.py
- Enter a movie review in the input box and click Analyze.
- View the result (Positive/Negative sentiment) in a pop-up window.
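
For readers curious how the GUI steps above could be wired together, here is a minimal Tkinter sketch. It is not the code in sentiment_analysis_app.py: `label_to_message`, `build_app`, and the `predict` callable are hypothetical names, and the widget layout is an assumption.

```python
def label_to_message(label):
    """Turn a model label into the text shown in the pop-up."""
    return f"{str(label).capitalize()} review"


def build_app(predict):
    """Create the window; `predict` maps review text to a label."""
    # Imported here so the helper above stays usable without a display.
    import tkinter as tk
    from tkinter import messagebox

    root = tk.Tk()
    root.title("Movie Review Sentiment")

    entry = tk.Text(root, height=8, width=60)
    entry.pack(padx=10, pady=10)

    def on_analyze():
        # Read the whole text box and ignore surrounding whitespace.
        review = entry.get("1.0", "end").strip()
        if not review:
            messagebox.showwarning("Empty input", "Please enter a review.")
            return
        messagebox.showinfo("Result", label_to_message(predict(review)))

    tk.Button(root, text="Analyze", command=on_analyze).pack(pady=(0, 10))
    return root
```

Calling `build_app(some_predict_function).mainloop()` would open the window; passing the classifier in as a callable keeps the GUI independent of how the model is loaded.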
The trained models achieved the following accuracy on the test set:
- MultinomialNB: 85.18%
- GaussianNB: 78.55%
- BernoulliNB (Best Model): 85.25%
- BernoulliNB: 85.27%