American Idol data

This week we're exploring American Idol data! This is a comprehensive dataset put together by kkakey.

There's so much data! What do you want to know about American Idol? Song choices, TV ratings, characteristics of winners?

The data in this dataset comes from Wikipedia and covers seasons 1-18 of American Idol.

The Datasets

songs.csv - songs that contestants sang and competed with on American Idol from seasons 1-18

auditions.csv - audition cities, dates, and venues

eliminations.csv - eliminations by week (named elimination_chart.csv in the source repository). Data availability varies from season to season, depending on season length and the number of finalists competing

finalists.csv - information on top contestants, including birthday, hometown, and description

ratings.csv - episode ratings and viewership

seasons.csv - season-level information, including season winner, runner-up, release dates, and judges

The Data

```r
# Option 1: tidytuesdayR package
## install.packages("tidytuesdayR")

tuesdata <- tidytuesdayR::tt_load('2024-07-23')
## OR
tuesdata <- tidytuesdayR::tt_load(2024, week = 30)

auditions <- tuesdata$auditions
eliminations <- tuesdata$eliminations
finalists <- tuesdata$finalists
ratings <- tuesdata$ratings
seasons <- tuesdata$seasons
songs <- tuesdata$songs

# Option 2: Read directly from GitHub

auditions <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-07-23/auditions.csv')
eliminations <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-07-23/eliminations.csv')
finalists <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-07-23/finalists.csv')
ratings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-07-23/ratings.csv')
seasons <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-07-23/seasons.csv')
songs <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-07-23/songs.csv')
```

How to Participate

Explore the data, watching out for interesting relationships. We would like to emphasize that you should not draw conclusions about causation in the data. There are various moderating variables that affect all data, many of which might not have been captured in these datasets. As such, our suggestion is to use the data provided to practice your data tidying and plotting techniques, and to consider for yourself what nuances might underlie these relationships.
Create a visualization, a model, a shiny app, or some other piece of data-science-related output, using R or another programming language.
Share your output and the code used to generate it on social media with the #TidyTuesday hashtag.

Data Dictionary

`auditions.csv`

variable	class	description
season	double	Season
audition_date_start	double	Start date of audition
audition_date_end	double	End date of audition
audition_city	character	City where audition took place
audition_venue	character	Preliminary location where auditions took place
episodes	character	Episode numbers at this audition location
episode_air_date	character	Date episode aired
callback_venue	character	Filming and callback location where auditions took place
callback_date_start	double	Start date of callback audition
callback_date_end	double	End date of callback audition
tickets_to_hollywood	double	Number of contestants selected from audition to go to Hollywood week
guest_judge	character	Name of guest judge at audition

`eliminations.csv`

variable	class	description
season	double	Season Number
place	character	Place (or place range) contestant finished in competition
gender	character	Gender of contestant
contestant	character	Competitor name
top_36	character	Top 36 eliminations
top_36_2	character	Top 36 eliminations (week 2)
top_36_3	character	Top 36 eliminations (week 3)
top_36_4	character	Top 36 eliminations (week 4)
top_32	character	Top 32 eliminations
top_32_2	character	Top 32 eliminations (week 2)
top_32_3	character	Top 32 eliminations (week 3)
top_32_4	character	Top 32 eliminations (week 4)
top_30	character	Top 30 eliminations
top_30_2	character	Top 30 eliminations (week 2)
top_30_3	character	Top 30 eliminations (week 3)
top_25	character	Top 25 eliminations
top_25_2	character	Top 25 eliminations (week 2)
top_25_3	character	Top 25 eliminations (week 3)
top_24	character	Top 24 eliminations
top_24_2	character	Top 24 eliminations (week 2)
top_24_3	character	Top 24 eliminations (week 3)
top_20	character	Top 20 eliminations
top_20_2	character	Top 20 eliminations (week 2)
top_16	character	Top 16 eliminations
top_14	character	Top 14 eliminations
top_13	character	Top 13 eliminations
top_12	character	Top 12 eliminations
top_11	character	Top 11 eliminations
top_11_2	character	Top 11 eliminations (week 2)
wildcard	character	Wildcard week eliminations
comeback	logical	Comeback week eliminations
top_10	character	Top 10 eliminations
top_9	character	Top 9 eliminations
top_9_2	character	Top 9 eliminations (week 2)
top_8	character	Top 8 eliminations
top_8_2	character	Top 8 eliminations (week 2)
top_7	character	Top 7 eliminations
top_7_2	character	Top 7 eliminations (week 2)
top_6	character	Top 6 eliminations
top_6_2	character	Top 6 eliminations (week 2)
top_5	character	Top 5 eliminations
top_5_2	character	Top 5 eliminations (week 2)
top_4	character	Top 4 eliminations
top_4_2	character	Top 4 eliminations (week 2)
top_3	character	Top 3 eliminations
finale	character	Finale eliminations
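
The elimination chart is stored wide, with one column per round, which can be awkward for plotting or counting. The following is a minimal sketch of reshaping it to long format with `tidyr::pivot_longer()`. It runs on a tiny made-up tibble that mimics a few of the columns above; the contestant names and cell values are placeholders, not rows from the real file. Because `comeback` is read as logical while the other round columns are character, `values_transform` converts every round to character as a precaution before the columns are stacked.

```r
library(tidyr)
library(tibble)

# Toy stand-in for `eliminations`: names and cell values are invented
# placeholders, not real data.
toy_eliminations <- tibble(
  season     = c(1, 1),
  contestant = c("Contestant A", "Contestant B"),
  top_10     = c("Safe", "Eliminated"),
  comeback   = c(NA, NA),   # logical, as listed in the data dictionary
  top_9      = c("Safe", NA)
)

# One row per contestant per round; rounds with no entry are dropped.
eliminations_long <- pivot_longer(
  toy_eliminations,
  cols = -c(season, contestant),
  names_to = "round",
  values_to = "status",
  values_transform = list(status = as.character),
  values_drop_na = TRUE
)
```

With the real data, the same call should work after extending `cols` to exclude `place` and `gender` as well, though that is an assumption worth checking against the loaded column names.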

`finalists.csv`

variable	class	description
Contestant	character	Name of contestant
Birthday	character	Contestant's birthday
Birthplace	character	Contestant's city of birth
Hometown	character	Contestant's hometown
Description	character	Description of contestant
Season	double	Season

`ratings.csv`

variable	class	description
season	double	Season
show_number	double	Episode number in season
episode	character	Episode name
airdate	character	Date episode aired
18_49_rating_share	character	Percentage of adults aged 18-49 estimated to have watched the episode (Nielsen TV ratings).
viewers_in_millions	double	Number (in millions) that watched the episode
timeslot_et	character	Episode timeslot in Eastern Time
dvr_18_49	character	Percentage of adults aged 18-49 estimated to have watched the episode on DVR
dvr_viewers_millions	character	Number (in millions) that watched the episode on DVR
total_18_49	character	Total percentage of adults aged 18-49 estimated to have watched the episode
total_viewers_millions	character	Total number of viewers (in millions).
weekrank	character	Ranking of episode performance by season
ref	logical	Reference
share	character	Share (unused)
nightlyrank	double	Nightly ranking
rating_share_households	character	Rating and share among households.
rating_share	character	Ratings share.
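
Several of the ratings columns are read as character even though they hold numbers, so they need converting before any arithmetic or plotting. The sketch below shows one possible approach with `readr::parse_number()`. The strings are made up for illustration, and the `"N/A"` missing-value marker is an assumption; inspect the real columns first to see which formats and markers actually occur. Note that `18_49_rating_share` starts with a digit, so it has to be wrapped in backticks when referenced in R.

```r
library(readr)
library(tibble)

# Made-up strings standing in for two character columns of `ratings`;
# the real file may use different formats and missing-value markers.
toy_ratings <- tibble(
  `18_49_rating_share` = c("4.5", "3.9", "N/A"),
  dvr_viewers_millions = c("1.25", "0.98", "N/A")
)

# Strings treated as missing (assumed, adjust after inspecting the data).
na_markers <- c("", "NA", "N/A")

# Column names starting with a digit need backticks.
rating_num <- parse_number(toy_ratings$`18_49_rating_share`, na = na_markers)
dvr_num    <- parse_number(toy_ratings$dvr_viewers_millions, na = na_markers)
```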

`seasons.csv`

variable	class	description
season	double	Season
winner	character	Name of winner
runner_up	character	Name of runner-up
original_release	character	Original air dates
original_network	character	Network aired on
hosted_by	character	Host's name
judges	character	Name of judges
no_of_episodes	double	Number of episodes
finals_venue	character	Venue of finale
mentor	character	Name of season mentor

`songs.csv`

variable	class	description
season	character	Season Number
week	character	Week date and week description
order	double	Order contestants sang in
contestant	character	Competitor name
song	character	Name of song sung
artist	character	Name of song's artist (imputed if not explicitly listed)
song_theme	character	Week theme for songs sung
result	character	Contestant's elimination status for the week

Cleaning Script

```r
# Clean data provided by <https://github.com/kkakey/American_Idol>. No cleaning was necessary.
auditions <- readr::read_csv("https://raw.githubusercontent.com/kkakey/American_Idol/main/metadata/auditions.csv")
eliminations <- readr::read_csv("https://raw.githubusercontent.com/kkakey/American_Idol/main/metadata/elimination_chart.csv")
finalists <- readr::read_csv("https://raw.githubusercontent.com/kkakey/American_Idol/main/metadata/finalists.csv")
ratings <- readr::read_csv("https://raw.githubusercontent.com/kkakey/American_Idol/main/metadata/ratings.csv")
seasons <- readr::read_csv("https://raw.githubusercontent.com/kkakey/American_Idol/main/metadata/seasons.csv")
songs <- readr::read_csv("https://raw.githubusercontent.com/kkakey/American_Idol/main/Songs/songs_all.csv")
```
