|
| 1 | +The dataset comes from the public repository hosted at https://archive.ics.uci.edu/ml/datasets/bike+sharing+dataset |
| 2 | + |
| 3 | +========================================== |
| 4 | +Bike Sharing Dataset |
| 5 | +========================================== |
| 6 | + |
| 7 | +Hadi Fanaee-T |
| 8 | + |
| 9 | +Laboratory of Artificial Intelligence and Decision Support (LIAAD), University of Porto |
| 10 | +INESC Porto, Campus da FEUP |
| 11 | +Rua Dr. Roberto Frias, 378 |
| 12 | +4200 - 465 Porto, Portugal |
| 13 | + |
| 14 | + |
| 15 | +========================================= |
| 16 | +Background |
| 17 | +========================================= |
| 18 | + |
| 19 | +Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to rental |
| 20 | +and return, has become automatic. Through these systems, a user can easily rent a bike at one location and return it |
| 21 | +at another. Currently, there are over 500 bike-sharing programs around the world, comprising more than |
| 22 | +500 thousand bicycles. Today, there is great interest in these systems due to their important role in traffic, |
| 23 | +environmental, and health issues. |
| 24 | + |
| 25 | +Apart from the interesting real-world applications of bike sharing systems, the characteristics of the data generated by |
| 26 | +these systems make them attractive for research. Unlike other transport services such as bus or subway, the duration |
| 27 | +of travel and the departure and arrival positions are explicitly recorded in these systems. This feature turns a bike sharing system into |
| 28 | +a virtual sensor network that can be used for sensing mobility in the city. Hence, it is expected that most of the important |
| 29 | +events in the city could be detected by monitoring these data. |
| 30 | + |
| 31 | +========================================= |
| 32 | +Data Set |
| 33 | +========================================= |
| 34 | +The bike-sharing rental process is highly correlated with environmental and seasonal settings. For instance, weather conditions, |
| 35 | +precipitation, day of week, season, hour of the day, etc. can affect rental behavior. The core data set is |
| 36 | +the two-year historical log for the years 2011 and 2012 from the Capital Bikeshare system, Washington D.C., USA, which is |
| 37 | +publicly available at http://capitalbikeshare.com/system-data. We aggregated the data on both an hourly and a daily basis and then |
| 38 | +extracted and added the corresponding weather and seasonal information. Weather information was extracted from http://www.freemeteo.com. |
| 39 | + |
| 40 | +========================================= |
| 41 | +Associated tasks |
| 42 | +========================================= |
| 43 | + |
| 44 | + - Regression: |
| 45 | + Prediction of the hourly or daily bike rental count based on the environmental and seasonal settings. |
| 46 | + |
| 47 | + - Event and Anomaly Detection: |
| 48 | + The count of rented bikes is also correlated with some events in the town, which are easily traceable via search engines. |
| 49 | + For instance, a query like "2012-10-30 washington d.c." in Google returns results related to Hurricane Sandy. Some of the important events are |
| 50 | + identified in [1]. Therefore, the data can also be used to validate anomaly or event detection algorithms. |
| 51 | + |
| 52 | + |
| 53 | +========================================= |
| 54 | +Files |
| 55 | +========================================= |
| 56 | + |
| 57 | + - Readme.txt |
| 58 | + - hour.csv : bike sharing counts aggregated on an hourly basis. Records: 17379 hours |
| 59 | + - day.csv : bike sharing counts aggregated on a daily basis. Records: 731 days |
| 60 | + |
| 61 | + |
| 62 | +========================================= |
| 63 | +Dataset characteristics |
| 64 | +========================================= |
| 65 | +Both hour.csv and day.csv have the following fields, except hr, which is not available in day.csv. |
| 66 | + |
| 67 | + - instant: record index |
| 68 | + - dteday : date |
| 69 | + - season : season (1: spring, 2: summer, 3: fall, 4: winter) |
| 70 | + - yr : year (0: 2011, 1: 2012) |
| 71 | + - mnth : month (1 to 12) |
| 72 | + - hr : hour (0 to 23) |
| 73 | + - holiday : whether the day is a holiday or not (extracted from http://dchr.dc.gov/page/holiday-schedule) |
| 74 | + - weekday : day of the week |
| 75 | + - workingday : 1 if the day is neither a weekend nor a holiday, otherwise 0. |
| 76 | + + weathersit : |
| 77 | + - 1: Clear, Few clouds, Partly cloudy |
| 78 | + - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist |
| 79 | + - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds |
| 80 | + - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog |
| 81 | + - temp : Normalized temperature in Celsius. The values are divided by 41 (max) |
| 82 | + - atemp: Normalized feeling temperature in Celsius. The values are divided by 50 (max) |
| 83 | + - hum: Normalized humidity. The values are divided by 100 (max) |
| 84 | + - windspeed: Normalized wind speed. The values are divided by 67 (max) |
| 85 | + - casual: count of casual users |
| 86 | + - registered: count of registered users |
| 87 | + - cnt: count of total rental bikes including both casual and registered |
| 88 | + |
| 89 | +========================================= |
| 90 | +License |
| 91 | +========================================= |
| 92 | +Publications that use this dataset must cite the following publication: |
| 93 | + |
| 94 | +[1] Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge", Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg, doi:10.1007/s13748-013-0040-3. |
| 95 | + |
| 96 | +@article{ |
| 97 | + year={2013}, |
| 98 | + issn={2192-6352}, |
| 99 | + journal={Progress in Artificial Intelligence}, |
| 100 | + doi={10.1007/s13748-013-0040-3}, |
| 101 | + title={Event labeling combining ensemble detectors and background knowledge}, |
| 102 | + url={http://dx.doi.org/10.1007/s13748-013-0040-3}, |
| 103 | + publisher={Springer Berlin Heidelberg}, |
| 104 | + keywords={Event labeling; Event detection; Ensemble learning; Background knowledge}, |
| 105 | + author={Fanaee-T, Hadi and Gama, Joao}, |
| 106 | + pages={1-15} |
| 107 | +} |
| 108 | + |
| 109 | +========================================= |
| 110 | +Contact |
| 111 | +========================================= |
| 112 | + |
| 113 | +For further information about this dataset please contact Hadi Fanaee-T (hadi.fanaee@fe.up.pt) |
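
The field descriptions under "Dataset characteristics" can be turned into a few simple checks. The sketch below is illustrative only and is not part of the original distribution: it takes the stated divisors (41, 50, 100, 67) at face value to undo the normalization, verifies that cnt equals casual + registered, and sums hourly counts per date in the way the daily file is described. The function names and the sample rows are hypothetical; the sample values are made up and are not taken from hour.csv.

```python
import csv
import io
from collections import defaultdict

# Divisors quoted in the field list (the stated maxima).
SCALE = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}

# Hypothetical rows in the hour.csv column layout. The values are made up
# for illustration and are NOT copied from the dataset.
SAMPLE = """instant,dteday,season,yr,mnth,hr,holiday,weekday,workingday,weathersit,temp,atemp,hum,windspeed,casual,registered,cnt
1,2011-01-01,1,0,1,0,0,6,0,1,0.5,0.5,0.75,0.0,3,13,16
2,2011-01-01,1,0,1,1,0,6,0,1,0.25,0.25,0.5,0.5,8,32,40
3,2011-01-02,1,0,1,0,0,0,0,2,1.0,1.0,1.0,1.0,5,12,17
"""


def restore_units(row):
    """Multiply each normalized field by its stated maximum."""
    return {name: float(row[name]) * factor for name, factor in SCALE.items()}


def counts_consistent(row):
    """Check that cnt equals casual + registered, as the field list says."""
    return int(row["cnt"]) == int(row["casual"]) + int(row["registered"])


def daily_totals(rows):
    """Sum the hourly cnt values for each dteday."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["dteday"]] += int(row["cnt"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
restored = [restore_units(r) for r in rows]
totals = daily_totals(rows)
print(restored[0])
print(totals)
```

To run the same checks on the real file, replace io.StringIO(SAMPLE) with an open handle to hour.csv. If the restored temperatures look implausible, the normalization may differ from the plain division described here, so treat the divisors as an assumption to verify.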