Analysis-ready dataset of population, employment, and travel times in Toronto #702

Open
paezha opened this issue Jun 23, 2024 · 5 comments

@paezha

paezha commented Jun 23, 2024

Dataset name: TTS2016R
Dataset download URL: https://soukhova.github.io/TTS2016R/
Article that demonstrates the dataset: https://doi.org/10.1177/23998083241242844
Cleaning script: The data are analysis-ready.

Data dictionary: All variables are documented in the package.

@paezha paezha added the dataset label Jun 23, 2024
@jonthegeek
Collaborator

@paezha The DOI listed above is for the article from #701. From the package, it looks like it should be https://doi.org/10.1177/23998083221146781 instead. Thanks!

@lgibson7
Member

lgibson7 commented Aug 10, 2024

  • I can download the dataset from the link provided.
  • The dataset will (probably) be less than 50MB when saved as a tidy CSV.
  • There is a link to an article that has something to do with the dataset.
  • I can imagine a data visualization related to this dataset.
  • This dataset has not already been used in TidyTuesday.
  • ALT text is provided for all (both) images.
  • There is a data dictionary describing the columns of the dataset.
  • The TidyTuesday maintainers are unlikely to get sued for using the dataset.

@lgibson7
Member

Hi @paezha, thanks for submitting this issue. Would you be willing to submit the dataset through a PR? You can find instructions for doing so here.

@paezha
Author

paezha commented Aug 14, 2024

Hi @lgibson7 - Happy to submit the dataset. It is already an R package, though, so I am unsure how many, if any, of the steps outlined here are needed. For example, the data files are already clean and saved in native R format.

@jonthegeek
Collaborator

@paezha Regardless of the source, we share the datasets as one or more CSVs. When the data comes from a package, the cleaning script will likely be very short, along the lines of this:

# Clean data from pkgname (https://pkgurl)
toronto_population <- pkgname::toronto_population
toronto_employment <- pkgname::toronto_employment
toronto_travel <- pkgname::toronto_travel

It's very similar to situations where the data is cleanly available as CSVs, such as the recent American Idol dataset.
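
From there, the extracted objects just get written out as CSVs for the repository. A minimal sketch of that step, with file names that are only placeholders matching the sketch above:

# Write each object out as a CSV for the TidyTuesday repository.
# File names here are placeholders matching the sketch above.
readr::write_csv(toronto_population, "toronto_population.csv")
readr::write_csv(toronto_employment, "toronto_employment.csv")
readr::write_csv(toronto_travel, "toronto_travel.csv")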

The cleaning might also be more complicated, to take a subset of the data or otherwise make it more CSV-friendly, such as what I did to share our own data from our ttmeta package.
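
For instance, here is a hypothetical sketch of that kind of extra step, assuming one of the package's objects were an sf spatial data frame (the object and column names below are invented, not taken from TTS2016R):

# Hypothetical: an sf object would need its geometry dropped (or converted
# to a text format such as WKT) before it could be shared as a plain CSV.
toronto_zones_csv <- pkgname::toronto_zones |>   # invented object name
  sf::st_drop_geometry() |>                      # keep only the attribute table
  dplyr::select(zone_id, population, employment) # invented column names

readr::write_csv(toronto_zones_csv, "toronto_zones.csv")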

I hope that helps explain the process!
