"All models are wrong, but some are useful." - George E.P. Box
This research intends to:
- Introduce Dean Oliver's "Four Factors of Basketball Success in the NBA" and test whether the four factors are statistically significant predictors of success in the NBA
- Use the four factors and historical data to predict an NBA team's number of wins and average margin of victory with multiple linear regression, random forest, and gradient boosting models
- Compare the observed weightings of the different factors in our models to those proposed by Dean Oliver and Ed Küpfer
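
For readers new to the four factors, the sketch below computes them from season box-score totals. It uses the commonly published definitions (effective field goal percentage, turnover percentage, offensive rebound percentage, and free throws per field goal attempt). The class name, field names, and sample totals are illustrative assumptions and are not taken from the notebooks in this repository, which may use slightly different variants.

```python
from dataclasses import dataclass


@dataclass
class TeamTotals:
    """Season box-score totals for one team (field names are hypothetical)."""
    fg: int        # field goals made
    fg3: int       # three-pointers made
    fga: int       # field goal attempts
    ft: int        # free throws made
    fta: int       # free throw attempts
    tov: int       # turnovers
    orb: int       # offensive rebounds
    opp_drb: int   # opponent defensive rebounds


def four_factors(t: TeamTotals) -> dict:
    """Return the four offensive factors as fractions (not percentages)."""
    return {
        # Shooting: a made three counts as 1.5 made field goals.
        "efg_pct": (t.fg + 0.5 * t.fg3) / t.fga,
        # Turnovers per estimated play; 0.44 is the usual free-throw adjustment.
        "tov_pct": t.tov / (t.fga + 0.44 * t.fta + t.tov),
        # Share of available offensive rebounds that the team collected.
        "orb_pct": t.orb / (t.orb + t.opp_drb),
        # Free throws made per field goal attempt.
        "ft_rate": t.ft / t.fga,
    }


# Made-up totals, for illustration only (not real NBA data).
example = TeamTotals(fg=3400, fg3=1000, fga=7200, ft=1400, fta=1800,
                     tov=1100, orb=850, opp_drb=2800)
factors = four_factors(example)
```

The same function applied to an opponent's totals gives the defensive versions of each factor, so a team's differential on a factor is simply its own value minus the opponent's.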
- thesis_draft.pdf
- thesis_draft.zip
- main.tex
- images
- Modeling Oliver’s Four Factors.pdf
- R_four_factors.Rmd
- basketball_reference_scraping.ipynb
- historical_changes_visualizations.ipynb
- modeling.ipynb
- per_100_posessions_historical.xlsx
- four_factors_20xx_to_20xx.xlsx
- four_factors_all_seasons.xlsx
Multiple Linear Regression
Random Forest
Gradient Boosting
Neural Network (future work)
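
As a rough illustration of the multiple linear regression approach and of how fitted coefficients can be turned into relative factor weights, here is a minimal, dependency-free sketch. It is not the code in `modeling.ipynb`. The data are synthetic and standardized, generated from coefficients chosen to mirror Oliver's approximate 40/25/20/15 weighting (shooting, turnovers, rebounding, free throws), so the recovered weights hold by construction and say nothing about real NBA results. The helper names `solve_linear_system` and `fit_ols` are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit with an intercept, via the normal equations."""
    xs = [[1.0] + list(r) for r in rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(xs, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Oliver's approximate weights: shooting, turnovers, rebounding, free throws.
OLIVER_WEIGHTS = [0.40, 0.25, 0.20, 0.15]

# Synthetic, noise-free data (assumption): standardized factor differentials
# and a margin of victory built from coefficients that mirror those weights.
random.seed(0)
true_coefs = [4.0, -2.5, 2.0, 1.5]  # turnovers hurt, hence the negative sign
rows = [[random.gauss(0, 1) for _ in range(4)] for _ in range(60)]
margin = [sum(c * v for c, v in zip(true_coefs, r)) for r in rows]

beta = fit_ols(rows, margin)          # beta[0] is the intercept
slopes = beta[1:]
total = sum(abs(s) for s in slopes)
weights = [abs(s) / total for s in slopes]  # relative importance of each factor
```

Normalizing absolute coefficients this way is only meaningful when the predictors share a common scale, which is why the synthetic features are standardized. On real data the factors would need to be standardized first, and the resulting weights could then be compared against `OLIVER_WEIGHTS`.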
corrplot
formattable
GGally
ggplot2
kableExtra
knitr
plotly
scales
stargazer
viridis
An Introduction to Statistical Learning: with Applications in R
- Resource for understanding statistical learning models
The Elements of Statistical Learning
- Resource for understanding statistical learning models in greater depth
- Resource for obtaining NBA data