This week's data comes from the Board Game Geek database. The site's database has more than 90,000 games, with crowd-sourced ratings. There is also an R package with the full dataset (bggAnalysis), but it hasn't been updated in ~2 years.
To follow along with a FiveThirtyEight article, I limited the data to games with at least 50 ratings that were published between 1950 and 2016. This still leaves us with 10,532 games!
board_games <- readr::read_csv("")
variable | class | description |
:---|:---|:---|
game_id | character | Unique game identifier |
description | character | A paragraph of text describing the game |
image | character | URL of the game's image |
max_players | integer | Maximum recommended players |
max_playtime | integer | Maximum recommended playtime (min) |
min_age | integer | Minimum recommended age |
min_players | integer | Minimum recommended players |
min_playtime | integer | Minimum recommended playtime (min) |
name | character | Name of the game |
playing_time | integer | Average playtime |
thumbnail | character | URL of the game's thumbnail image |
year_published | integer | Year game was published |
artist | character | Artist for game art |
category | character | Categories for the game (separated by commas) |
compilation | character | If part of a multi-compilation - name of compilation |
designer | character | Game designer |
expansion | character | If there is an expansion pack - name of expansion |
family | character | Family of game - equivalent to a publisher |
mechanic | character | Game mechanic - how game is played, separated by comma |
publisher | character | Company/person who published the game, separated by comma |
average_rating | double | Average rating on Board Game Geek (1-10) |
users_rated | double | Number of users that rated the game |
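
Because `category`, `mechanic`, and `publisher` pack several values into one comma-separated string, counting them usually means splitting each string into one row per value first. Here is a minimal sketch of that idea, assuming the `tibble`, `tidyr`, and `dplyr` packages are installed. The rows and the name `toy_games` are made up for illustration and are not taken from the dataset.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical toy rows that mimic the comma-separated `category` column
toy_games <- tribble(
  ~name,    ~category,
  "Game A", "Card Game,Fantasy",
  "Game B", "Fantasy",
  "Game C", "Economic,Card Game"
)

# Split each comma-separated string into one row per category, then count
category_counts <- toy_games %>%
  separate_rows(category, sep = ",") %>%
  count(category, sort = TRUE)

print(category_counts)
```

If the real values turn out to have a space after each comma, a separator such as `sep = ",\\s*"` should handle both cases.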
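library(tidyverse)
library(bggAnalysis) # provides the BoardGames data frame

# Clean the column names and strip the prefixes left over from the raw data
named_games <- BoardGames %>%
  janitor::clean_names() %>%
  set_names(~str_replace(.x, "details_", "")) %>%
  set_names(~str_replace(.x, "attributes_boardgame", "")) %>%
  set_names(~str_replace(.x, "stats_", "")) %>%
  select(game_id:average, usersrated, averageweight) %>%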
  filter(!is.na(yearpublished)) %>% # assumed condition: drop games with no publication year
  filter(yearpublished <= 2016 & yearpublished >= 1950) %>%
  filter(usersrated >= 50, game_type == "boardgame")
# Count the games published in each year
named_games %>%
  group_by(yearpublished) %>%
  summarize(count = n()) %>%
  ggplot(aes(x = yearpublished, y = count)) +
tidy_names <- c("game_id",
# Apply the tidy column names and drop the columns that are not needed
tidy_games <- named_games %>%
  set_names(nm = tidy_names) %>%
  select(-attributes_total, -game_type, -implementation, -integration, -average_weight)
tidy_games %>%
tidy_games %>% write_csv("board_games.csv")
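
The repeated `set_names(~str_replace(...))` calls in the script strip prefixes from the column names one at a time. To see how that step behaves in isolation, here is a small sketch on a made-up tibble. The column names and the object `toy` are hypothetical and only imitate the style of the raw names; it assumes `tibble`, `purrr`, and `stringr` are installed.

```r
library(tibble)
library(purrr)
library(stringr)

# Hypothetical columns named in the style the script strips prefixes from
toy <- tibble(
  game_id = "1",
  details_name = "Game A",
  stats_usersrated = 120
)

# set_names() accepts a function (here a purrr-style lambda) that is applied
# to the existing names; str_replace() removes the first match of each prefix
cleaned <- toy %>%
  set_names(~ str_replace(.x, "details_", "")) %>%
  set_names(~ str_replace(.x, "stats_", ""))

print(names(cleaned))
```

Names that do not contain a given prefix pass through unchanged, which is why the calls can be chained safely in any order.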