
This is a capstone project for the FTW Foundation Data Science program that predicts the price of second-hand cars, created by Elyse Go, Nicole Lumagui, Bernadette Misa and Jero Santos.


nicolelumagui/FTW-Capstone_Second-Hand-Cars-in-PH




About

Second-Hand Cars in the PH: A Buyer's Guide is a capstone project presented for the FTW Foundation Data Science Program that predicts the prices of second-hand (used) cars in the Philippines. It was created by Elyse Go, Nicole Lumagui, Bernadette Misa and Jero Santos, and sponsored by Carlove.


See our model in action here: http://nicolelumagui.pythonanywhere.com/

(This site will be disabled on Wednesday 26 February 2020)


Introduction

Traffic. Congestion. Carmageddon. Metro Manila is one of the most densely populated cities. What is one of the government’s ways of addressing this? The TRAIN Law, specifically its Auto Tax Reform. Modeled after Singapore, the TRAIN Law was put in place to limit “new” cars on the road. It discourages people from buying brand-new cars, with the aim of alleviating traffic. This opens a window of opportunity: an increase in market activity for second-hand cars.

Buying used cars has many advantages:

  1. More savings
  2. Cheaper insurance cost
  3. Slower depreciation
  4. Extended warranty
  5. Good for the environment

But there are also risks:

  1. Unknown reliability or treatment
  2. More frequent maintenance
  3. Hard to find an exact match of what you want
  4. Untouched warranty
  5. Lemon Car / Overpriced Car

How can we help consumers in their journey of buying a second-hand car?

By empowering them with information derived from Machine Learning!


Process Overview

  1. Scraped data from the websites Carmudi, Philkotse, Priceprice Auto and AutoSearch Manila
  2. Joined the scraped data sets and cleaned the data
  3. Used K-Nearest Neighbors to impute missing mileage values
  4. Performed Exploratory Data Analysis
  5. Tried Decision Trees to get a first look at feature importance
  6. Used Random Forest Regressor then XGBoost to predict the price of second-hand cars
  7. Created a web app using Django, with the pickled Random Forest Regressor model applied on the backend
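
To illustrate step 3, here is a minimal plain-Python sketch of K-Nearest Neighbors imputation. The README does not say which library or which neighbor features were used, so the field names (`age_years`, `retail_price`, `mileage_km`), the choice of age and retail price as the distance features, and the sample rows are all assumptions made for illustration.

```python
import math


def knn_impute_mileage(rows, k=3):
    """Return copies of rows with missing mileage filled from the k nearest known rows."""
    known = [r for r in rows if r["mileage_km"] is not None]
    if not known:
        raise ValueError("need at least one row with a known mileage")

    def value_range(key):
        values = [r[key] for r in rows]
        low, high = min(values), max(values)
        return low, (high - low) or 1.0  # fall back to 1.0 to avoid dividing by zero

    age_low, age_span = value_range("age_years")
    price_low, price_span = value_range("retail_price")

    def scaled(r):
        # Min-max scale both features so price does not dominate the distance.
        return (
            (r["age_years"] - age_low) / age_span,
            (r["retail_price"] - price_low) / price_span,
        )

    filled = []
    for r in rows:
        if r["mileage_km"] is None:
            nearest = sorted(known, key=lambda n: math.dist(scaled(r), scaled(n)))[:k]
            estimate = sum(n["mileage_km"] for n in nearest) / len(nearest)
            filled.append({**r, "mileage_km": estimate})
        else:
            filled.append(dict(r))
    return filled


# Made-up listings, for illustration only (not taken from the project's data).
listings = [
    {"age_years": 1, "retail_price": 1_000_000, "mileage_km": 10_000},
    {"age_years": 2, "retail_price": 1_000_000, "mileage_km": 20_000},
    {"age_years": 9, "retail_price": 1_000_000, "mileage_km": 90_000},
    {"age_years": 10, "retail_price": 1_000_000, "mileage_km": 100_000},
    {"age_years": 1.5, "retail_price": 1_000_000, "mileage_km": None},
]
filled = knn_impute_mileage(listings, k=2)
```

The missing mileage is replaced by the mean mileage of the two most similar cars. The input rows are left unchanged, so the raw scraped values stay available for comparison.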

Dataset

The data used for this project are car listings scraped from Carmudi and Philkotse. The retail prices of the cars were scraped from Priceprice Auto and AutoSearch Manila.
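
Since listings and retail prices come from different sites, they have to be joined at some point. The sketch below shows one possible way to do that in plain Python, assuming the join key is the brand and model name. The README does not describe the actual join logic, so the function name, field names and sample values here are hypothetical.

```python
def attach_retail_price(listings, retail_prices):
    """Add a 'retail_price' field to each listing, matched on (brand, model)."""
    # Normalize case so "TOYOTA" and "Toyota" map to the same key.
    price_by_car = {
        (p["brand"].lower(), p["model"].lower()): p["retail_price"]
        for p in retail_prices
    }
    joined = []
    for listing in listings:
        key = (listing["brand"].lower(), listing["model"].lower())
        # Listings with no matching retail price get None instead of being dropped.
        joined.append({**listing, "retail_price": price_by_car.get(key)})
    return joined


# Made-up rows, for illustration only.
used_listings = [
    {"brand": "Toyota", "model": "Vios", "price": 400_000},
    {"brand": "HONDA", "model": "Civic", "price": 650_000},
    {"brand": "Ford", "model": "Ranger", "price": 800_000},
]
retail = [
    {"brand": "toyota", "model": "vios", "retail_price": 700_000},
    {"brand": "Honda", "model": "Civic", "retail_price": 1_200_000},
]
joined = attach_retail_price(used_listings, retail)
```

Keeping unmatched listings with a `None` retail price makes it easy to count how many rows failed to match before deciding whether to drop or fix them.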

Features:

  • Age of car
  • Retail Price of Car
  • Mileage of Car in km
  • Car Brand/Make - Toyota, Honda, Hyundai, Ford, etc...
  • Car Model - Civic, Adventure, Ranger, Vios, etc...
  • Car Body Type - Saloon/Sedan, Hatchback/Wagon, SUV, MPV/AUV, etc...
  • Fuel Type of Car - Diesel, Gasoline, Electric
  • Transmission Type of Car - Automanual, Automatic, CVT, Manual, Shiftable Automatic
  • Car's Color
  • Seller Type - Individual/Private Owner or Dealer
  • Seller's Location/City
  • Age of Post/Listing in Days
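
Several of these features are categorical (brand, body type, fuel type, transmission, color, seller type, location) and need a numeric encoding before a tree-based regressor can use them. The README does not state which encoding was applied. The sketch below shows one-hot encoding as one common option, with a hypothetical helper name and made-up rows.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        # Copy every field except the one being encoded.
        out = {key: value for key, value in r.items() if key != column}
        for category in categories:
            out[f"{column}={category}"] = 1 if r[column] == category else 0
        encoded.append(out)
    return encoded


# Made-up rows, for illustration only.
cars = [
    {"transmission": "Manual", "age_years": 3},
    {"transmission": "Automatic", "age_years": 5},
]
encoded = one_hot_encode(cars, "transmission")
```

Each row ends up with exactly one `1` among the new `transmission=...` columns, and numeric features such as age pass through unchanged.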

Results / Output

Machine Learning Model

Model | Cross-Val Score
XGBoost | 79.63%
Random Forest Regressor | 80%
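
The README does not say which metric or how many folds are behind the cross-validation scores. The plain-Python sketch below assumes the score is the coefficient of determination (R²) averaged over 5 folds, and uses a simple one-feature least-squares line as a stand-in model. It is not the project's Random Forest or XGBoost setup, and the data is made up.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by least squares and return a predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def cross_val_r2(xs, ys, fit, n_folds=5):
    """Average R² over n_folds interleaved train/test splits."""
    scores = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(xs), n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        model = fit(train_x, train_y)  # train only on the other folds
        held_out = sorted(test_idx)
        predictions = [model(xs[i]) for i in held_out]
        scores.append(r2_score([ys[i] for i in held_out], predictions))
    return sum(scores) / len(scores)


# Made-up, perfectly linear data: price drops by a fixed amount per year of age.
ages = list(range(20))
prices = [1_000_000 - 40_000 * age for age in ages]
score = cross_val_r2(ages, prices, fit_line, n_folds=5)
```

Because each fold is scored on rows the model never saw during fitting, the averaged score estimates how well the model generalizes to new listings, which is the point of reporting a cross-validation score rather than a training score.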

Graph of the results of Random Forest Regressor model


Feature Importance

Feature Importance based on Random Forest Regressor model


Web App Prototype

A Look at the Web App Prototype


Recommendations

  • Add more data from other marketplaces and banks, and compile a dataset with a balanced distribution of car brands and models.
  • Improve the user interface of the web app.
  • Fine-tune the model further.
  • Add more features, such as number of doors.

References


Contact Us


Special Thanks
