Commit

update some links
linhsolar committed Jan 9, 2025
1 parent b4df8fd commit 1efd774
Showing 1 changed file with 13 additions and 1 deletion.
14 changes: 13 additions & 1 deletion data/README.md
@@ -15,26 +15,36 @@ There are many open datasets that you can download for practicing activities in
- [NYC Taxi Dataset](https://data.cityofnewyork.us/Transportation/2018-Yellow-Taxi-Trip-Data/t29m-gskq)
- For multiple monthly datasets: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
- [Chicago Taxi Dataset](https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew)

* Animals and Wildlife monitoring
- [Tortoise monitoring in Korkeasaari Zoo](https://iot.fvh.fi/downloads/tortoise/)
- [Avian Vocalizations from CA & NV, USA](https://www.kaggle.com/samhiatt/xenocanto-avian-vocalizations-canv-usa)

* Social networks/Multimedia
- [Reddit comments](https://www.kaggle.com/datasets/kaggle/reddit-comments-may-2015)

* Marketplace
- [Airbnb Dataset](http://insideairbnb.com/get-the-data.html)
- [Amazon Customer Review Dataset](https://www.kaggle.com/cynthiarempel/amazon-us-customer-reviews-dataset)

* Environment Monitoring/Smart City
- [Open data about air quality monitoring from Germany](https://github.com/opendata-stuttgart/meta/wiki/EN-APIs)
- [UK Open Government Water Quality](https://environment.data.gov.uk/water-quality/view/landing)

* System/Datacenter operation traces/logs:
- [Microsoft Azure traces dataset](https://github.com/Azure/AzurePublicDataset)
- [Alibaba Cluster Traces](https://github.com/alibaba/clusterdata)

* Civil engineering/architecture/construction data:
- [2015 LiDAR Flight 150326_131440 for Dublin City](https://archive.nyu.edu/handle/2451/38660)

* Cybersecurity:
- [UNSW-NB15 Dataset](https://research.unsw.edu.au/projects/unsw-nb15-dataset)
- [CSE-CIC-IDS2018 Dataset](https://www.unb.ca/cic/datasets/ids-2018.html)


* Industry/Manufacturing:
- [MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection](https://zenodo.org/records/3384388)
- [Dataset of Leak Simulations in Experimental Testbed Water Distribution System](https://data.mendeley.com/datasets/tbrnp6vrnj/1)
* [Open datasets in Google BigQuery](https://cloud.google.com/bigquery/public-data)

When using these datasets, you need to comply with their corresponding licenses.
@@ -49,3 +59,5 @@ The datasets are stored within this directory and provided by [Linh Truong](http
## Your own datasets

You can also propose your own dataset for your assignment, but you must discuss it with the lecturer first.

