Data_Visualization Final Challenge

Ride Share Analysis

Manrique Vargas [email protected]

Yavuz Sunor [email protected]

Summary

We used a combination of matplotlib and geopandas for data processing and plotting, then visualized the maps using Carto. We considered this approach the most cost-effective solution.

Q1

For the first question, we worked on the requests dataset. We did the necessary manipulations using Python in a Jupyter Notebook. We converted the timestamp to datetime so that it could be grouped by hour. We calculated the ratio of successful trips aggregated by hour and plotted the trend over a one-day period. As you can see, the serving rate peaks during the morning rush hours and follows a uniform pattern at other times.
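The hourly aggregation described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the notebook's actual code: the input shape (pairs of Unix timestamp and a served flag), the function name `hourly_success_ratio`, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_success_ratio(requests):
    """Map hour of day -> share of served requests.

    `requests` is an iterable of (unix_timestamp, served) pairs.
    This input layout is hypothetical, not the dataset's real schema.
    """
    served = defaultdict(int)
    total = defaultdict(int)
    for ts, was_served in requests:
        # Convert the raw timestamp to a datetime and keep only the hour.
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        total[hour] += 1
        if was_served:
            served[hour] += 1
    return {hour: served[hour] / total[hour] for hour in sorted(total)}


# Made-up sample: three requests at 08:00 UTC, two at 13:00 UTC.
sample = [
    (8 * 3600, True),
    (8 * 3600 + 600, True),
    (8 * 3600 + 1200, False),
    (13 * 3600, True),
    (13 * 3600 + 60, False),
]
ratios = hourly_success_ratio(sample)
```

The resulting dictionary can be passed directly to a plotting call (hours on the x axis, ratios on the y axis) to reproduce a one-day trend line.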

Q2

Carto map link

For the second question, we also used the requests dataset, but this time we merged it with manhattan.geojson to visualize it on a map. We created a served/not_served column in the Jupyter Notebook to categorize trips. We then uploaded the merged dataframe to CartoDB as a CSV file. We wrote a simple SQL query to filter not_served trips and aggregate them by geolocation. As you can see, most not_served trips are concentrated in eastern Manhattan, mostly in the Midtown and Central Park areas. In terms of time of day, not_served trips appear in the early hours in the WTC area and in the late hours near the Midtown/Central Park area.
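The categorize, filter, and aggregate steps can be illustrated in plain Python. This is only a sketch of the logic, not the SQL query that was run in Carto: the field names `served`, `lat`, and `lon`, the helper names, and the sample rows are hypothetical.

```python
from collections import Counter


def label_trip(row):
    """Return a copy of the row with a served/not_served status field."""
    status = "served" if row["served"] else "not_served"
    return {**row, "status": status}


def not_served_by_location(rows):
    """Count not_served trips per (lat, lon) pair."""
    labeled = (label_trip(r) for r in rows)
    return Counter(
        (r["lat"], r["lon"]) for r in labeled if r["status"] == "not_served"
    )


# Made-up sample rows; coordinates are illustrative only.
rows = [
    {"served": False, "lat": 40.75, "lon": -73.98},
    {"served": False, "lat": 40.75, "lon": -73.98},
    {"served": True, "lat": 40.75, "lon": -73.98},
    {"served": False, "lat": 40.71, "lon": -74.01},
]
counts = not_served_by_location(rows)
```

Aggregating before plotting keeps the map readable, since each location is drawn once with a size or color that reflects its count.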

Q5

Carto map link

For the fifth question, we used the vehicle_paths dataset. Using Jupyter Notebook, we filtered the data to records with 4 or fewer passengers and created a geometry column from the latitude and longitude values. We then uploaded the dataframe to CartoDB, as in question 2. We applied a very similar SQL query and were able to visualize vehicles by passenger count and time of day. Because the filtered records only extend to 2pm, we can only see a temporal pattern between 5am and 2pm. As you can see, the city shows a uniform distribution in terms of both passenger numbers and time of day.
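The filtering and geometry steps can be sketched without any geospatial library. The example below writes each point as a WKT string, which is a stand-in for the geopandas geometry column used in the notebook; the field names `passengers`, `lat`, and `lon` and the helper names are assumptions.

```python
import csv
import io


def to_point_wkt(lat, lon):
    """Format a coordinate pair as a WKT point (x = longitude, y = latitude)."""
    return f"POINT ({lon} {lat})"


def filter_and_add_geometry(rows, max_passengers=4):
    """Keep rows with at most `max_passengers` and attach a geometry field."""
    return [
        {**r, "geometry": to_point_wkt(r["lat"], r["lon"])}
        for r in rows
        if r["passengers"] <= max_passengers
    ]


def to_csv(rows):
    """Serialize the rows to CSV text, ready to be saved and uploaded."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample rows; the second one exceeds the passenger limit.
paths = [
    {"passengers": 2, "lat": 40.75, "lon": -73.98},
    {"passengers": 6, "lat": 40.76, "lon": -73.97},
    {"passengers": 4, "lat": 40.71, "lon": -74.01},
]
filtered = filter_and_add_geometry(paths)
csv_text = to_csv(filtered)
```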

Q6

Carto map link
Jupyter Notebook here

For the sixth question, we calculated the speed of every vehicle. We used pandas to analyze each vehicle individually. To make the algorithm run faster, we subsampled the data, keeping every 5th record. Speed is distance divided by time. The plot shows the speeds that are 4.5 standard deviations below the mean speed. This includes not only vehicles with a speed of zero (not moving at all) but also some vehicles moving at very low speed. This approach allows us to account for measurement error in the system that recorded the locations: a threshold of 4.5 standard deviations below the mean covers many cases where the vehicle might not be moving but the signal fluctuates by a few meters, causing the computed speed to be larger than zero. We also provide a GIF to better visualize the temporal change per hour and the spatial change using the maps.
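The speed calculation can be sketched as below. This is an illustration under stated assumptions, not the notebook's pandas code: it uses the haversine great-circle formula for distance (the source does not say how distance was measured), a track layout of `(seconds, lat, lon)` tuples, and hypothetical function names. The cutoff multiplier `k` defaults to the 4.5 mentioned above.

```python
import math
from statistics import mean, pstdev

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds(track, step=5):
    """Speeds in m/s between every `step`-th record of one vehicle.

    `track` is a time-sorted list of (seconds, lat, lon) tuples.
    """
    points = track[::step]  # keep every 5th record by default
    result = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(points, points[1:]):
        dt = t1 - t0
        if dt > 0:  # skip duplicate timestamps
            result.append(haversine_m(lat0, lon0, lat1, lon1) / dt)
    return result


def slow_threshold(all_speeds, k=4.5):
    """Cutoff located k standard deviations below the mean speed."""
    return mean(all_speeds) - k * pstdev(all_speeds)


# Made-up tracks: one vehicle heading north, one parked.
moving = [(10 * i, 40.0 + 0.001 * i, -74.0) for i in range(11)]
parked = [(10 * i, 40.0, -74.0) for i in range(11)]
moving_speeds = speeds(moving)
parked_speeds = speeds(parked)
```

Computing the cutoff over the pooled speeds of all vehicles, then flagging segments at or below it, gives the set of slow or stationary points to map.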

Q8

The most important pieces of information are the paths, the pickups, and the drop-offs. This map allows us to visualize them all. Some of the points shown are not close to the vehicle's path. These points may have been failed requests. We also provide a GIF to better visualize the temporal change per hour and the spatial change using the maps.

Jupyter Notebook here
