Merge pull request #648 from DarkRaiderCB/main
Automobiles Sales Data Analysis #576
Showing 13 changed files with 4,934 additions and 0 deletions.
Automobile Sales Data Analysis/Dataset/Auto_Sales_data.csv: 2,748 additions and 0 deletions. Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Automobiles Sales Data Analysis | ||
|
||
The dataset used here is taken from Kaggle. You can download it here: [Automobile Sales data](https://www.kaggle.com/datasets/ddosad/auto-sales-data). | ||
|
||
## Data Description | ||
The dataset used in this project includes the following columns: | ||
- `ORDERNUMBER`: Order number | ||
- `QUANTITYORDERED`: Quantity ordered | ||
- `PRICEEACH`: Price per item | ||
- `ORDERLINENUMBER`: Order line number | ||
- `SALES`: Total sales amount | ||
- `ORDERDATE`: Date of the order | ||
- `DAYS_SINCE_LASTORDER`: Days since the last order | ||
- `STATUS`: Order status | ||
- `PRODUCTLINE`: Product line | ||
- `MSRP`: Manufacturer's Suggested Retail Price | ||
- `PRODUCTCODE`: Product code | ||
- `CUSTOMERNAME`: Customer name | ||
- `PHONE`: Customer phone number | ||
- `ADDRESSLINE1`: Customer address line 1 | ||
- `CITY`: Customer city | ||
- `POSTALCODE`: Customer postal code | ||
- `COUNTRY`: Customer country | ||
- `CONTACTLASTNAME`: Contact last name | ||
- `CONTACTFIRSTNAME`: Contact first name | ||
- `DEALSIZE`: Deal size (Small, Medium, Large) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
# Sales Data Analysis and Prediction | ||
|
||
This project involves analyzing and predicting sales data. The dataset contains information about orders, products, customers, and sales figures. The goal is to clean, analyze, and model the data to gain insights and make predictions. | ||
|
||
## Table of Contents | ||
- [Overview](#overview) | ||
- [Data Description](#data-description) | ||
- [Data Cleaning](#data-cleaning) | ||
- [Exploratory Data Analysis](#exploratory-data-analysis) | ||
- [Modeling](#modeling) | ||
- [Conclusions](#conclusions) | ||
|
||
## Overview | ||
This project analyzes sales data to extract meaningful insights and build predictive models. The dataset used in this analysis includes various attributes related to orders, products, and customers. The main tasks involved in this project are: | ||
1. Data Cleaning | ||
2. Exploratory Data Analysis (EDA) | ||
3. Building Predictive Models | ||
4. Visualizing the results | ||
|
||
## Data Description | ||
The dataset used in this project includes the following columns: | ||
- `ORDERNUMBER`: Order number | ||
- `QUANTITYORDERED`: Quantity ordered | ||
- `PRICEEACH`: Price per item | ||
- `ORDERLINENUMBER`: Order line number | ||
- `SALES`: Total sales amount | ||
- `ORDERDATE`: Date of the order | ||
- `DAYS_SINCE_LASTORDER`: Days since the last order | ||
- `STATUS`: Order status | ||
- `PRODUCTLINE`: Product line | ||
- `MSRP`: Manufacturer's Suggested Retail Price | ||
- `PRODUCTCODE`: Product code | ||
- `CUSTOMERNAME`: Customer name | ||
- `PHONE`: Customer phone number | ||
- `ADDRESSLINE1`: Customer address line 1 | ||
- `CITY`: Customer city | ||
- `POSTALCODE`: Customer postal code | ||
- `COUNTRY`: Customer country | ||
- `CONTACTLASTNAME`: Contact last name | ||
- `CONTACTFIRSTNAME`: Contact first name | ||
- `DEALSIZE`: Deal size (Small, Medium, Large) | ||
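Before cleaning, it can help to confirm that a loaded file actually has the columns listed above. The following is a minimal sketch, not the project's own code: the helper name `missing_columns` and the two-column sample CSV are made up for illustration.

```python
import io

import pandas as pd

# Column names copied from the list above.
EXPECTED_COLUMNS = [
    "ORDERNUMBER", "QUANTITYORDERED", "PRICEEACH", "ORDERLINENUMBER",
    "SALES", "ORDERDATE", "DAYS_SINCE_LASTORDER", "STATUS", "PRODUCTLINE",
    "MSRP", "PRODUCTCODE", "CUSTOMERNAME", "PHONE", "ADDRESSLINE1", "CITY",
    "POSTALCODE", "COUNTRY", "CONTACTLASTNAME", "CONTACTFIRSTNAME", "DEALSIZE",
]


def missing_columns(df):
    """Return the expected columns that are absent from df, in list order."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


# Made-up two-column CSV standing in for the real file (assumption).
sample_csv = io.StringIO("ORDERNUMBER,SALES\n10107,2871.0\n")
sample = pd.read_csv(sample_csv)

missing = missing_columns(sample)
print(f"{len(missing)} of {len(EXPECTED_COLUMNS)} expected columns are missing")
```

With the real data, the `io.StringIO` object would be replaced by the path to `Auto_Sales_data.csv` in the `Dataset` folder, and an empty result would indicate that every listed column is present.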
|
||
## Data Cleaning | ||
The initial step involves cleaning the data to ensure accuracy and consistency. This includes: | ||
- Handling missing values | ||
- Converting data types | ||
- Standardizing date formats | ||
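The three cleaning steps above could look roughly like this in pandas. This is a sketch under stated assumptions: the rows are made up, the `clean` helper is hypothetical, and the day/month/year date format and the choice to drop incomplete rows (rather than impute them) are guesses, not something this README specifies.

```python
import pandas as pd

# Made-up rows for illustration; the real data comes from Auto_Sales_data.csv.
raw = pd.DataFrame({
    "ORDERNUMBER": [10107, 10121, 10134],
    "QUANTITYORDERED": ["30", "34", None],
    "SALES": ["2871.00", "2765.90", "3884.34"],
    "ORDERDATE": ["24/02/2018", "07/05/2018", "not a date"],
})


def clean(df):
    """Return a cleaned copy: numeric types, parsed dates, no incomplete rows."""
    out = df.copy()
    # Convert data types; unparseable values become NaN instead of raising.
    out["QUANTITYORDERED"] = pd.to_numeric(out["QUANTITYORDERED"], errors="coerce")
    out["SALES"] = pd.to_numeric(out["SALES"], errors="coerce")
    # Standardize dates; the day/month/year format is an assumption.
    out["ORDERDATE"] = pd.to_datetime(
        out["ORDERDATE"], format="%d/%m/%Y", errors="coerce"
    )
    # Handle missing values by dropping rows that lack a required field.
    out = out.dropna(subset=["QUANTITYORDERED", "SALES", "ORDERDATE"])
    return out.reset_index(drop=True)


cleaned = clean(raw)
print(cleaned.dtypes)
```

Using `errors="coerce"` turns bad values into missing ones, so a single `dropna` call handles both originally missing and unparseable entries.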
|
||
## Exploratory Data Analysis | ||
EDA involves visualizing and summarizing the main characteristics of the data. Key analyses include: | ||
- Distribution of sales | ||
- Sales trends over time | ||
- Top products by sales | ||
- Customer segmentation | ||
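Most of the analyses listed above reduce to a `groupby` followed by an aggregation. The sketch below uses made-up rows and groups by `PRODUCTLINE` and `COUNTRY`; whether the project ranks products by `PRODUCTLINE` or `PRODUCTCODE` is not stated here, so that choice is an assumption. Customer segmentation is not covered.

```python
import pandas as pd

# Made-up rows for illustration only.
orders = pd.DataFrame({
    "ORDERDATE": pd.to_datetime(
        ["2018-02-24", "2018-02-28", "2018-03-05", "2018-03-17"]
    ),
    "PRODUCTLINE": ["Motorcycles", "Classic Cars", "Classic Cars", "Motorcycles"],
    "COUNTRY": ["USA", "France", "USA", "USA"],
    "SALES": [2000.0, 3500.0, 1500.0, 1000.0],
})

# Distribution of sales: count, mean, quartiles, min and max.
sales_summary = orders["SALES"].describe()

# Sales trend over time: total sales per calendar month.
monthly_sales = orders.groupby(orders["ORDERDATE"].dt.to_period("M"))["SALES"].sum()

# Top product lines by total sales, largest first.
top_product_lines = (
    orders.groupby("PRODUCTLINE")["SALES"].sum().sort_values(ascending=False)
)

# Total sales per country, largest first.
sales_by_country = (
    orders.groupby("COUNTRY")["SALES"].sum().sort_values(ascending=False)
)

print(monthly_sales)
print(top_product_lines)
print(sales_by_country)
```

Each resulting Series can be passed to a plotting call such as `Series.plot(kind="bar")` to produce charts similar in spirit to the examples in this section.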
|
||
Examples: | ||
|
||
<img src="../Images/sales_by_day.png"> | ||
<img src="../Images/sales_by_country.png"> | ||
<img src="../Images/product_line.png"> | ||
|
||
## Modeling | ||
Several predictive models are built and evaluated to forecast sales and identify factors influencing sales. Models used include: | ||
- Linear Regression | ||
- Decision Trees | ||
- Random Forests | ||
- Gradient Boosting | ||
- Support Vector Regression (SVR) | ||
- K-Nearest Neighbors Regressor | ||
- XGBoost | ||
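A comparison of the models listed above might be structured as follows. This is a sketch only: the data is synthetic, the two features, the default hyperparameters, and the RMSE and R² metrics are all assumptions, and XGBoost is left out to avoid an extra dependency. Scores on this synthetic data say nothing about the results reported in this README.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in: SALES is roughly QUANTITYORDERED * PRICEEACH plus noise.
rng = np.random.default_rng(0)
quantity = rng.integers(10, 60, size=300)
price = rng.uniform(30.0, 250.0, size=300)
X = np.column_stack([quantity, price])
y = quantity * price + rng.normal(0.0, 50.0, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Default hyperparameters throughout; only the random seeds are fixed.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "SVR": SVR(),
    "K-Nearest Neighbors": KNeighborsRegressor(),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
        "r2": float(r2_score(y_test, pred)),
    }

# Print models from lowest to highest test RMSE.
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["rmse"]):
    print(f"{name}: RMSE={scores['rmse']:.1f}, R2={scores['r2']:.3f}")
```

If the `xgboost` package is installed, `xgboost.XGBRegressor` follows the same `fit`/`predict` interface and could be added to the `models` dictionary.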
|
||
<img src="../Images/model.png"> | ||
|
||
## Conclusions | ||
1. Of the models compared, XGBoost performs the best. | ||
2. SVR performs the worst. |