PortfolioProject

Table of contents

Project Overview

This project uses the Movies dataset from Kaggle. The data is processed in Python with the Pandas package and visualized for Exploratory Data Analysis (EDA) with Matplotlib and Seaborn. The dataset covers movies released on or before July 2017 and is split across multiple files, such as Cast, Crew, and Movies Metadata, which hold information about each movie, its revenue, and user ratings collected from GroupLens.

In this project I aim to load and clean the data, then look for interesting facts and trends in movie production.
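As a rough illustration of the load-and-clean step, the sketch below converts money columns to numbers and parses release dates with Pandas. It is not the project's actual code: the function name `clean_movies` is hypothetical, and the column names `budget`, `revenue`, and `release_date` are assumptions about the Movies Metadata file.

```python
import pandas as pd


def clean_movies(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy with numeric money columns and parsed dates.

    Assumes the columns "budget", "revenue" and "release_date" exist.
    """
    out = df.copy()
    # Malformed numeric values become NaN instead of raising an error.
    for col in ("budget", "revenue"):
        out[col] = pd.to_numeric(out[col], errors="coerce")
    # Unparseable dates become NaT.
    out["release_date"] = pd.to_datetime(out["release_date"], errors="coerce")
    # Drop rows without a usable release date, then derive the release year.
    out = out.dropna(subset=["release_date"]).reset_index(drop=True)
    return out.assign(release_year=out["release_date"].dt.year)
```

Once the data is downloaded, something like `clean_movies(pd.read_csv("data/movies_metadata.csv", low_memory=False))` could apply it, assuming the file sits under `data/` with that name.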

Prerequisite

To reproduce this project you will need a Kaggle account and an API key.

1. Go to Kaggle and create an account.

2. Go to Settings and scroll down to the API section, then select Create New Token.

3. Once the file is downloaded, create a .kaggle folder in your profile folder and place the file there: ~/.kaggle/kaggle.json

4. Install the Kaggle CLI:

pip install kaggle
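To confirm that the token from step 3 is where the CLI will look for it, a small check like the one below may help. It is a sketch only: `check_kaggle_token` is a hypothetical helper, and it assumes the downloaded token is a JSON object with `username` and `key` fields.

```python
import json
from pathlib import Path


def check_kaggle_token(path: Path = Path.home() / ".kaggle" / "kaggle.json") -> bool:
    """Return True if the token file exists and holds both expected keys."""
    if not path.is_file():
        return False
    try:
        token = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False
    # Assumption: a valid token is a JSON object with "username" and "key".
    return isinstance(token, dict) and {"username", "key"} <= token.keys()
```

Calling `check_kaggle_token()` with no argument checks the default location `~/.kaggle/kaggle.json`.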

You will also need the following packages:

  1. Pandas
  2. Matplotlib
  3. Seaborn
  4. NumPy
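A quick way to see which of these packages are still missing from the current environment is sketched below. The helper `missing_packages` is hypothetical and uses only the standard library; it assumes the import names are the lowercase package names.

```python
from importlib.util import find_spec

# Import names for the packages listed above (assumed to be lowercase).
REQUIRED = ("pandas", "matplotlib", "seaborn", "numpy")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located for import, in the given order."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from `missing_packages()` suggests all four packages can be imported.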

Reproducibility

Clone this project:

git clone https://github.com/Chalermdej-l/PortfolioProject

Go to the cloned directory:

cd PortfolioProject

Install the required packages:

pip install -r requirements.txt

Download the data from Kaggle:

kaggle datasets download -d rounakbanik/the-movies-dataset -p data --unzip --force
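After the download finishes, it can be useful to confirm that CSV files actually landed in the target folder. The sketch below lists them with their sizes; `list_csv_files` is a hypothetical helper, and the default folder `data` simply mirrors the `-p data` option in the command above.

```python
from pathlib import Path


def list_csv_files(data_dir="data"):
    """Return (file name, size in bytes) for each CSV in data_dir, sorted by name."""
    folder = Path(data_dir)
    if not folder.is_dir():
        return []
    return sorted((p.name, p.stat().st_size) for p in folder.glob("*.csv"))
```

An empty result means the folder is missing or holds no CSV files, in which case the download step is worth re-running.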

Once the packages are installed and the data is downloaded, open the project file and run the code.