Research Goal: Analysis and forecasting of trip duration for High Volume For-hire Services in NYC
Timeline: The research covers 02/2019 to 07/2019, inclusive.
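One way to read the inclusive range is as a half-open interval on timestamps, so that trips late on 31 July 2019 are not dropped by a date-only comparison. The sketch below is only an illustration: the name in_study_period is hypothetical, and filtering on the pickup time is an assumption, not something this README states.

```python
from datetime import datetime

# Assumed reading of "02/2019 - 07/2019 inclusive": every trip from
# 2019-02-01 00:00 up to, but not including, 2019-08-01 00:00.
PERIOD_START = datetime(2019, 2, 1)
PERIOD_END = datetime(2019, 8, 1)  # exclusive upper bound


def in_study_period(pickup_time):
    """Return True if a timestamp falls inside the study period.

    Using the pickup time as the reference is an assumption.
    """
    return PERIOD_START <= pickup_time < PERIOD_END
```

With this reading, a trip at 23:59 on 31 July 2019 is kept, while one at 00:00 on 1 August 2019 is excluded.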
To run the pipeline, go to the notebooks directory and run the files in order:
download.ipynb: This notebook downloads the raw data into the data/raw directory.
preprocess.ipynb: This notebook details all preprocessing steps and writes the result to the data/curated directory.
analysis.ipynb: This notebook is used to conduct analysis and visualisations on the curated data.
modelling.ipynb: This notebook is used to conduct hyperparameter tuning and run the model.
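The run order above could also be scripted rather than followed by hand. The following is a minimal sketch, assuming Jupyter with nbconvert is installed and the script is started from the repository root; the helper names build_commands and run_pipeline are hypothetical and not part of this repository.

```python
import subprocess
from pathlib import Path

# Notebook names and their order are taken from the list above.
NOTEBOOKS = [
    "download.ipynb",
    "preprocess.ipynb",
    "analysis.ipynb",
    "modelling.ipynb",
]


def build_commands(notebook_dir="notebooks"):
    """Return one `jupyter nbconvert` command per notebook, in run order."""
    return [
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",   # run every cell
            "--inplace",   # write outputs back into the same file
            str(Path(notebook_dir) / name),
        ]
        for name in NOTEBOOKS
    ]


def run_pipeline(notebook_dir="notebooks"):
    """Execute each notebook in turn; stop at the first failure."""
    for command in build_commands(notebook_dir):
        # check=True raises CalledProcessError if a notebook fails,
        # so later steps never run on missing or partial data.
        subprocess.run(command, check=True)
```

Calling run_pipeline() would then execute the four notebooks in sequence. Stopping on the first error is a deliberate choice here, since each step depends on the output of the one before it.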
External Dataset: The external data is downloaded from Weather Underground (https://www.wunderground.com/history/monthly/us/ny/new-york-city) and stored in the data/raw/weather_data directory.