Find the locations with most pick-ups and lucrative trips in NYC using clustering analysis.

scott198510/Where-should-a-taxi-driver-pickup-passengers

 
 

Where-should-a-taxi-driver-pickup-passengers

Keywords: Cluster Analysis, Python, Google Maps API

Introduction

This project analyzes taxi data in New York City. It uses cluster analysis to identify the locations with the most pick-ups and the locations that generate the most lucrative trips. The results are presented using the Google Maps API. They can help taxi drivers decide where to wait for passengers.

Data

NYC cab data is available from the NYC Taxi & Limousine Commission’s Trip Record Data site: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml.

The demonstration code uses the March 2016 'Green cabs' data downloaded from the link above.

Analysis

K-means cluster analysis is used to identify:

  • Pick-up locations with the most pick-ups.
  • Pick-up locations of the most lucrative trips.
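As a rough illustration of the clustering step, the sketch below runs a plain k-means (Lloyd's algorithm) over pick-up coordinates and ranks the clusters by how many pick-ups they contain. It is not taken from cluster_analysis.py: the function names `kmeans` and `busiest_clusters` are hypothetical, the sample coordinates are made up, and distance is measured in raw degrees rather than along the earth's surface, which is a simplification.

```python
import random
from collections import Counter


def nearest(point, centroids):
    """Index of the centroid closest to point (squared distance in degrees)."""
    return min(
        range(len(centroids)),
        key=lambda c: (point[0] - centroids[c][0]) ** 2
        + (point[1] - centroids[c][1]) ** 2,
    )


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on (lat, lng) tuples (hypothetical helper)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                lat = sum(m[0] for m in members) / len(members)
                lng = sum(m[1] for m in members) / len(members)
                updated.append((lat, lng))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def busiest_clusters(points, k):
    """Return (centroid, pick-up count) pairs, busiest cluster first."""
    centroids, labels = kmeans(points, k)
    return [(centroids[c], n) for c, n in Counter(labels).most_common()]


# Made-up sample: five pick-ups in one area, two in another.
pickups = [
    (40.750, -73.990), (40.751, -73.991), (40.749, -73.989),
    (40.752, -73.990), (40.750, -73.992),
    (40.650, -73.800), (40.651, -73.801),
]
ranked = busiest_clusters(pickups, k=2)
```

The centroid of the largest cluster is then a candidate waiting spot. On the full data set a library implementation would normally be preferred over this hand-written loop.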

Here, a lucrative trip is defined as one that generates the highest fare for the least amount of time spent.
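One way to read this definition is as fare earned per minute of trip time. The sketch below computes that rate and keeps the top fraction of trips, whose pick-up points could then be clustered. The dictionary keys (`pickup_datetime`, `dropoff_datetime`, `fare_amount`), the timestamp format, and the 10% cutoff are all assumptions for illustration. The source does not state the exact formula or threshold used.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def fare_per_minute(trip):
    """Fare divided by trip duration in minutes, or None if duration is not positive."""
    start = datetime.strptime(trip["pickup_datetime"], TIME_FORMAT)
    end = datetime.strptime(trip["dropoff_datetime"], TIME_FORMAT)
    minutes = (end - start).total_seconds() / 60
    if minutes <= 0:
        return None  # zero or negative durations are treated as bad records
    return float(trip["fare_amount"]) / minutes


def most_lucrative(trips, top_fraction=0.1):
    """Return the trips with the highest fare per minute (hypothetical helper)."""
    rated = [(fare_per_minute(t), t) for t in trips]
    rated = [(rate, t) for rate, t in rated if rate is not None]
    rated.sort(key=lambda pair: pair[0], reverse=True)
    keep = max(1, int(len(rated) * top_fraction))
    return [t for _, t in rated[:keep]]


# Made-up sample records.
trips = [
    {"pickup_datetime": "2016-03-01 08:00:00",
     "dropoff_datetime": "2016-03-01 08:10:00", "fare_amount": "10.0"},
    {"pickup_datetime": "2016-03-01 09:00:00",
     "dropoff_datetime": "2016-03-01 09:10:00", "fare_amount": "30.0"},
    {"pickup_datetime": "2016-03-01 10:00:00",
     "dropoff_datetime": "2016-03-01 10:00:00", "fare_amount": "5.0"},
]
best = most_lucrative(trips, top_fraction=0.5)
```

Dropping zero-duration records before ranking avoids division by zero and keeps obviously broken rows from dominating the result.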

Code:

  • cluster_analysis.py --> location.csv
  • cat heatmap-start.txt > heatmap.html
  • python latlng.py location1.csv >> heatmap.html
  • cat heatmap-end.txt >> heatmap.html
  • open heatmap.html
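The steps above assemble heatmap.html by placing generated coordinate lines between a fixed HTML header and footer. The sketch below mimics that assembly in Python. It is not the repository's latlng.py: it assumes the CSV holds latitude in the first column and longitude in the second, and that the header file ends inside a JavaScript array of `google.maps.LatLng` objects. The function names and the placeholder header and footer strings are hypothetical.

```python
import csv
import io


def latlng_lines(csv_file):
    """Yield one JavaScript LatLng constructor call per valid CSV row."""
    for row in csv.reader(csv_file):
        try:
            lat, lng = float(row[0]), float(row[1])
        except (ValueError, IndexError):
            continue  # skip a header row or a malformed line
        yield f"new google.maps.LatLng({lat}, {lng}),"


def build_heatmap(start_html, csv_file, end_html):
    """Concatenate header, generated coordinate lines, and footer."""
    return start_html + "\n".join(latlng_lines(csv_file)) + "\n" + end_html


# Placeholder inputs standing in for heatmap-start.txt, the CSV, and heatmap-end.txt.
sample = io.StringIO("lat,lng\n40.75,-73.99\n")
page = build_heatmap("<start>\n", sample, "<end>")
```

Writing `page` to heatmap.html would correspond to the three shell steps that append to that file.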

Output

The interactive output can be found in the googlemap repository.

  • Pick-up locations with most pick-ups. [image: us_map 1]

  • Pick-up locations of most lucrative trips. [image: us_map 3]

Reference: https://github.com/parrt/msan692/blob/master/notes/sfpd.md
