In this project, we explore how IMDb ratings for horror TV series change as shows progress through their seasons. Our analyses examine rating trends over time and compare different TV series within the horror genre.
We also examine how the number of votes fluctuates from season to season, which gives insight into each series' popularity and viewer engagement. This analysis is valuable for fans, critics, and industry professionals who want to understand how the reception of horror TV series develops over time. By highlighting these trends, we aim to inform producers' decisions about TV series production and viewer engagement strategies.
Do Horror TV series episodes tend to receive higher or lower IMDb ratings as a show progresses through its seasons?
The research method consists of several steps. First, we compute descriptive statistics. These give a quick overview of how ratings change across the seasons of a TV series and reveal any obvious patterns or trends. Specifically, we calculate the average rating per season.
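As a rough illustration of the per-season averages, the sketch below uses base R's `aggregate()` on a small made-up data frame. The column names follow the variable table in this README, but the values and the `episodes` object are hypothetical, and the repository's own scripts may compute this differently (for example with dplyr).

```r
# Hypothetical toy data standing in for the merged episode-level dataset.
episodes <- data.frame(
  parentTconst  = c("tt001", "tt001", "tt001", "tt001", "tt002", "tt002"),
  seasonNumber  = c(1, 1, 2, 2, 1, 2),
  averageRating = c(8.0, 7.0, 6.5, 7.5, 9.0, 8.0)
)

# Mean episode rating for every series-season combination.
season_means <- aggregate(
  averageRating ~ parentTconst + seasonNumber,
  data = episodes,
  FUN  = mean
)

# Sort by series, then by season, so each series reads top to bottom.
season_means <- season_means[
  order(season_means$parentTconst, season_means$seasonNumber),
]
print(season_means)
```

Averaging within each series before comparing across series keeps long-running shows from dominating the overall picture.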
Second, we combine regression analysis, which models the relationship between ratings and seasons, with time series analysis, which tracks trends and patterns over time. Together, these methods give a fuller view of how the reception of horror TV series develops, which can guide decisions about production and viewer engagement strategies.
- Regression Analysis: We use regression techniques to model and predict how ratings vary with each new season of horror TV series.
- Time Series Analysis: We apply time series methods to examine trends and patterns in the ratings over time, highlighting how viewer engagement changes as seasons progress.
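The regression step can be sketched with a simple linear model fitted by `lm()`. This is a minimal example under stated assumptions: the `season_ratings` data frame and its values are invented for illustration, and a single linear slope is only one possible specification, not necessarily the model used in this repository.

```r
# Hypothetical season-level averages for one made-up series (illustrative only).
season_ratings <- data.frame(
  seasonNumber  = c(1, 2, 3, 4),
  averageRating = c(8.2, 8.0, 7.7, 7.5)
)

# Ordinary least squares: rating modelled as a linear function of season number.
fit <- lm(averageRating ~ seasonNumber, data = season_ratings)

# The slope estimates the average change in rating per additional season.
season_slope <- unname(coef(fit)["seasonNumber"])
print(summary(fit)$coefficients)

# Extrapolated rating for a hypothetical fifth season.
next_season <- unname(predict(fit, newdata = data.frame(seasonNumber = 5)))
print(next_season)
```

A negative slope would suggest that ratings tend to decline as a show progresses, and a positive slope the opposite. With real data, the standard error and p-value in the coefficient table indicate how much weight the estimate can bear.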
The table below lists the variables used in our analysis, together with their source datasets and descriptions.
| Dataset | Variable Name | Description |
|---|---|---|
| title.episode.tsv.gz | tconst | Unique identifier for the episode |
| title.episode.tsv.gz | parentTconst | Identifier for the parent TV series |
| title.episode.tsv.gz | seasonNumber | Season number of the episode |
| title.episode.tsv.gz | episodeNumber | Episode number within the season |
| title.ratings.tsv.gz | tconst | Unique identifier for the title |
| title.ratings.tsv.gz | averageRating | Weighted average of all the individual user ratings |
| title.ratings.tsv.gz | numVotes | Number of votes the title has received |
| title.basics.tsv.gz | tconst | Unique identifier for the title |
| title.basics.tsv.gz | titleType | Type of title (e.g., movie, short, tvseries) |
| title.basics.tsv.gz | primaryTitle | Primary title used for promotional materials |
| title.basics.tsv.gz | originalTitle | Original title in the original language |
| title.basics.tsv.gz | startYear | Release year of the title or TV series start year |
| title.basics.tsv.gz | endYear | End year of TV series or '\N' for non-series titles |
| title.basics.tsv.gz | runtimeMinutes | Primary runtime of the title, in minutes |
| title.basics.tsv.gz | genres | Array of genres associated with the title |
The datasets can be downloaded from the following links:
- url_episodes <- "https://datasets.imdbws.com/title.episode.tsv.gz"
- url_ratings <- "https://datasets.imdbws.com/title.ratings.tsv.gz"
- url_basics <- "https://datasets.imdbws.com/title.basics.tsv.gz"
The workflow of the repository consists of four parts.
1. Loading the data: The data is downloaded, loaded, and merged into one file. The exploration, preparation, and analysis steps are conducted on this file.
2. Data exploration: After the data is collected and merged into one file, exploratory data analysis is conducted to get a first grasp of the data at hand. The next steps are formulated based on the findings from this step.
3. Data preparation: Given the insights from the data exploration step, the dataset is prepared for analysis. NAs are handled, values are formatted appropriately, and the data is sorted and filtered.
4. Data analysis: Multiple analyses are conducted on the data. The statistical calculations are described, and regression plots are created to visualize the results. Generated plots are saved in the '/gen' folder.
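The merging and preparation steps above can be sketched as follows, using base R's `merge()` on three small made-up tables that mirror the IMDb files. All values here are invented. The filter on `titleType == "tvSeries"` and on a "Horror" substring in `genres` (a comma-separated string in the raw files) is an assumption about how horror series are selected, and the repository's scripts may differ.

```r
# Hypothetical toy versions of the three IMDb tables (values are made up).
episodes <- data.frame(
  tconst        = c("ep1", "ep2", "ep3"),
  parentTconst  = c("s1", "s1", "s2"),
  seasonNumber  = c(1, 2, 1),
  episodeNumber = c(1, 1, 1)
)
ratings <- data.frame(
  tconst        = c("ep1", "ep2", "ep3"),
  averageRating = c(8.1, 7.4, 6.9),
  numVotes      = c(120, 95, 40)
)
basics <- data.frame(
  tconst       = c("s1", "s2"),
  titleType    = c("tvSeries", "tvSeries"),
  primaryTitle = c("Toy Horror Show", "Toy Comedy Show"),
  genres       = c("Drama,Horror", "Comedy")
)

# Keep only TV series whose genre string mentions Horror.
is_horror <- basics$titleType == "tvSeries" &
  grepl("Horror", basics$genres, fixed = TRUE)
horror_series <- basics[is_horror, c("tconst", "primaryTitle")]
names(horror_series)[1] <- "parentTconst"  # series id matches episodes' parent id

# Attach ratings to episodes, then keep episodes of horror series only.
rated_episodes  <- merge(episodes, ratings, by = "tconst")
horror_episodes <- merge(rated_episodes, horror_series, by = "parentTconst")

# Drop rows with a missing season or rating, then sort for analysis.
keep <- !is.na(horror_episodes$seasonNumber) &
  !is.na(horror_episodes$averageRating)
horror_episodes <- horror_episodes[keep, ]
horror_episodes <- horror_episodes[
  order(horror_episodes$parentTconst,
        horror_episodes$seasonNumber,
        horror_episodes$episodeNumber),
]
print(horror_episodes)
```

Ratings are joined on the episode identifier, while series metadata is joined on `parentTconst`, because `title.basics.tsv.gz` holds one row per title and the genre of interest belongs to the parent series.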
- R. Installation guide.
- Make. Installation guide.
- To knit RMarkdown documents, make sure you have installed Pandoc using the installation guide on their website.
install.packages("dplyr")
install.packages("tidyr")
install.packages("readr")
install.packages("ggplot2")
install.packages("lubridate")
install.packages("rmarkdown")
install.packages("knitr")
install.packages("stringr")
install.packages("purrr")
install.packages("broom")
install.packages("vroom")
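As an optional alternative to running every line above, the following sketch checks which of the listed packages are not yet installed. The variable names `required` and `to_install` are hypothetical, and the install call is left commented out so that nothing is installed unless you choose to run it.

```r
# Packages listed in this README.
required <- c(
  "dplyr", "tidyr", "readr", "ggplot2", "lubridate", "rmarkdown",
  "knitr", "stringr", "purrr", "broom", "vroom"
)

# Packages from the list that are not found in the current library paths.
to_install <- setdiff(required, rownames(installed.packages()))
print(to_install)

# Uncomment to install only the missing packages:
# if (length(to_install) > 0) install.packages(to_install)
```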
Team 6:
- Maria Yolovska
- Nicole Nikolova
- Noah Bouwhuis