Final Project Introduction to Econometrics, Spring 2023
- Bing-Chen Chiu (邱秉辰)
  - Double major in Information Management & Economics at NTU.
  - Data preprocessing (using Python), visualization, and discussion of regression models.
- Kai-Jyun Wang (王愷均)
  - Major in Economics at NTU.
  - Establishment and implementation (using R) of regression models.
Renting and returning YouBike 2.0 bikes on the NTU campus can be frustrating, and the bike shortage at NTU worsened in 2023 (Chin-Sum Shui, 2023). In this project, we try to understand the spatial patterns of YouBike ridership at NTU and look for the factors that determine ridership within and near NTU.
- 103 stations within and nearby NTU are manually selected.
- Data manipulation using Pandas 2.0.14
- Subset the rental records that occurred on weekdays and whose renting and returning stations both belong to the list of 103 selected stations.
- Obtain geographic information via the Google Maps API
- latitude & longitude information
- bicycling-distance between two given places
- Nearest MRT station
- Nearest dorm
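The subsetting step above can be sketched in Pandas as follows. This is only a minimal illustration: the column names (`rent_time`, `rent_station`, `return_station`), the station names, and the sample rows are hypothetical and do not come from the project's data.

```python
import pandas as pd

# Hypothetical rental records; the real column names may differ.
records = pd.DataFrame(
    {
        "rent_time": pd.to_datetime(
            ["2023-03-06 08:15", "2023-03-11 09:00", "2023-03-07 12:30"]
        ),
        "rent_station": ["Station A", "Station A", "Station B"],
        "return_station": ["Station B", "Station B", "Station Z"],
    }
)

# Stands in for the list of 103 manually selected stations.
selected = {"Station A", "Station B"}

# dayofweek uses Monday=0 ... Sunday=6, so weekdays are 0 through 4.
is_weekday = records["rent_time"].dt.dayofweek < 5

# Both ends of the trip must be in the selected station list.
both_selected = records["rent_station"].isin(selected) & records[
    "return_station"
].isin(selected)

subset = records[is_weekday & both_selected]
print(subset)
```

In this toy data only the first record survives: the second falls on a Saturday, and the third is returned to a station outside the selected list.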
We split a single day into 4 time segments:
- A: Rent time between 07:00 - 10:59
- B: Rent time between 11:00 - 14:59
- C: Rent time between 15:00 - 18:59
- D: Rent time between 19:00 - 23:59
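As a small sketch of the segmentation above, the function below maps a rent time to its segment label. The function name `time_segment` is hypothetical, and returning `None` for rents between 00:00 and 06:59 is an assumption, since those hours are not covered by any listed segment.

```python
from datetime import datetime
from typing import Optional


def time_segment(rent_time: datetime) -> Optional[str]:
    """Return the segment label (A-D) for a rent time, or None if uncovered."""
    hour = rent_time.hour
    if 7 <= hour <= 10:
        return "A"  # 07:00 - 10:59
    if 11 <= hour <= 14:
        return "B"  # 11:00 - 14:59
    if 15 <= hour <= 18:
        return "C"  # 15:00 - 18:59
    if 19 <= hour <= 23:
        return "D"  # 19:00 - 23:59
    # Assumption: rents between 00:00 and 06:59 belong to no segment.
    return None


print(time_segment(datetime(2023, 3, 6, 8, 15)))
```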
For each station, we have:
- The official name of this YouBike 2.0 station and its capacity
- The latitude & longitude of the station
- The total flow of each time segment
- The bicycling-distances to our selected landmarks
We also construct a table of bicycling-distances between each pair of stations.
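The shape of such a pairwise table can be sketched as below. Note the substitution: the project obtains bicycling distances from the Google Maps API, whereas this sketch uses the straight-line haversine distance as a stand-in so that it runs without network access. The station names and coordinates are hypothetical. Real bicycling distances may also differ by direction, which this symmetric stand-in ignores.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * asin(sqrt(h))


# Hypothetical station coordinates (latitude, longitude).
stations = {
    "S1": (25.0170, 121.5400),
    "S2": (25.0200, 121.5350),
    "S3": (25.0150, 121.5330),
}

# distance[a][b] holds the distance from station a to station b.
distance = {name: {name: 0.0} for name in stations}
for a, b in combinations(stations, 2):
    d = haversine_km(stations[a], stations[b])
    distance[a][b] = d
    distance[b][a] = d  # symmetric only because of the haversine stand-in

for name, row in distance.items():
    print(name, {other: round(km, 3) for other, km in row.items()})
```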
Remark: The definition of flow
The flow of station $i$ at time segment $t$ is:
$$FR_i^t = \frac{RE_i^t - RT_i^t}{C_i}$$
- $RE_i^t$ denotes the total number of **rents** from station $i$ at time segment $t$.
- $RT_i^t$ denotes the total number of **returns** to station $i$ at time segment $t$.
- $C_i$ denotes the capacity of station $i$.
- $i \in \{0,...,103\}, \ t \in \{A,B,C,D\}$.
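To make the definition concrete, here is a minimal sketch that computes $FR_i^t$ from a list of trips. The trips, station names, and capacities are hypothetical toy values, and the helper `flow` is an illustrative name, not part of the project's code.

```python
from collections import Counter

# Hypothetical trips: (rent_station, return_station, time_segment).
trips = [
    ("S1", "S2", "A"),
    ("S1", "S2", "A"),
    ("S2", "S1", "A"),
    ("S1", "S2", "C"),
]

# Hypothetical station capacities C_i.
capacity = {"S1": 20, "S2": 10}

# RE_i^t: number of rents from station i in segment t.
rents = Counter((src, seg) for src, _, seg in trips)
# RT_i^t: number of returns to station i in segment t.
returns = Counter((dst, seg) for _, dst, seg in trips)


def flow(station: str, segment: str) -> float:
    """FR_i^t = (RE_i^t - RT_i^t) / C_i; positive means net outflow."""
    key = (station, segment)
    return (rents[key] - returns[key]) / capacity[station]


for station in capacity:
    print(station, {seg: flow(station, seg) for seg in "ABCD"})
```

With these toy numbers, S1 has two rents and one return in segment A, so its flow is $(2 - 1) / 20 = 0.05$, while S2 has one rent and two returns, giving $(1 - 2) / 10 = -0.1$.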
We use Foursquare Studio as our main visualization tool:
- A red dot means the station tends to "flow out"
- A blue dot means the station tends to "flow in"
- The thickness of a line represents the ridership
Consider an OLS model
where
Consider an SLM.
- When there is good reason to include geographic factors in the model, the SLM can capture spatial autocorrelation.
- Our model shows the following patterns:
  - Bikes flow into the campus in the morning and out in the afternoon (the pattern is reversed at MRT stations).
  - The estimates for flow into the dorm area are statistically insignificant.
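Assuming SLM here denotes the spatial lag model, its usual form is $y = \rho W y + X\beta + \varepsilon$, where $W$ is a spatial weights matrix and $Wy$ is the spatially lagged outcome. The sketch below shows one common way to build $W$ and $Wy$. The inverse-distance, row-standardized weighting is an assumption made for illustration, not necessarily the weighting used in this project, and the distances and flow values are hypothetical. Estimating $\rho$ and $\beta$ is not shown.

```python
# Hypothetical pairwise distances (km) between three stations.
dist = [
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.5, 0.0],
]

# Hypothetical flow values y_i for the three stations.
y = [0.2, -0.1, 0.3]

n = len(dist)

# Inverse-distance weights with a zero diagonal (a station is not its own neighbor).
raw = [
    [0.0 if i == j else 1.0 / dist[i][j] for j in range(n)]
    for i in range(n)
]

# Row-standardize so that each row of W sums to 1.
W = [[w / sum(row) for w in row] for row in raw]

# Spatial lag: for each station, the weighted average of the other stations' flows.
Wy = [sum(W[i][j] * y[j] for j in range(n)) for i in range(n)]

print(W)
print(Wy)
```

A nearer station receives a larger weight, so each entry of `Wy` is dominated by the flows of that station's closest neighbors.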