This GitHub repository provides materials (slides, scripts, datasets, etc.) for part of the Statistical Learning course of the UPC-UB MSc in Statistics and Operations Research (MESIO).
This part of the course has an introduction and two blocks, each with two parts.
- Introduction
- Tree-based methods
  - 1.1 Decision trees
  - 1.2 Ensemble methods
- Artificial neural networks
  - 2.1 Artificial neural networks
  - 2.2 Introduction to deep learning
All class materials are available from the course site: https://aspteaching.github.io/Introduction2StatisticalLearning/.
On that page you will find links to the HTML/PDF versions of the slides and other documents, as well as to the datasets and to the references and resources listed below.
- 1.1 Introduction: Statistics, Machine Learning, Statistical Learning and Data Science
  - Regression with KNN
  - Classification with KNN (a short KNN sketch in R follows this list)
- Complements
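As a pointer to the two KNN items above, here is a minimal sketch of KNN regression and KNN classification in R. The `class`, `caret`, and `MASS` packages and the `Boston`/`iris` datasets are illustrative choices made here, not necessarily the ones used in the course scripts.

```r
# A minimal KNN sketch, assuming the 'class', 'caret' and 'MASS' packages are installed.
library(class)   # knn(): KNN classification
library(caret)   # knnreg(): KNN regression
library(MASS)    # Boston housing data (illustrative choice)

set.seed(123)

## Regression with KNN: predict median house value (medv) from the other variables.
## Predictors are left unscaled to keep the sketch short; scaling is usually advisable.
train_id <- sample(nrow(Boston), 400)
knn_fit  <- knnreg(medv ~ ., data = Boston[train_id, ], k = 5)
knn_pred <- predict(knn_fit, newdata = Boston[-train_id, ])
mean((knn_pred - Boston$medv[-train_id])^2)   # test mean squared error

## Classification with KNN: predict iris species from the four measurements
train_id  <- sample(nrow(iris), 100)
knn_class <- knn(train = iris[train_id, 1:4], test = iris[-train_id, 1:4],
                 cl = iris$Species[train_id], k = 3)
mean(knn_class == iris$Species[-train_id])    # test accuracy
```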
Decision trees are a type of non-parametric classifier that has been very successful because of its interpretability, flexibility, and reasonably good accuracy.
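For a quick illustration, here is a minimal sketch of fitting a single classification tree. The `rpart` package and the built-in `iris` data are illustrative choices, not necessarily those used in the course materials.

```r
# A minimal decision-tree sketch, assuming the 'rpart' package.
library(rpart)

# Fit a classification tree: Species as response, the four measurements as predictors
iris_tree <- rpart(Species ~ ., data = iris, method = "class")

print(iris_tree)                   # text description of the splits
plot(iris_tree); text(iris_tree)   # quick base-graphics plot of the tree

# Confusion matrix on the training data (an optimistic estimate of accuracy)
table(predicted = predict(iris_tree, type = "class"),
      observed  = iris$Species)
```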
The term "Ensemble" (together in french) refers to distinct approaches to build predictiors by combining multiple models.
They have proved to addres well some limitations of trees therefore improving accuracy and robustness as well as being able to reduce overfitting and capture complex relationships.
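A minimal sketch of one such ensemble, a random forest, assuming the `randomForest` package (an illustrative choice among the many ensemble tools available):

```r
# A minimal ensemble sketch with a random forest, assuming the 'randomForest' package.
library(randomForest)

set.seed(1)
# 500 trees, each grown on a bootstrap sample with a random subset of predictors
# considered at every split; class predictions are obtained by majority vote.
iris_rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)

print(iris_rf)        # includes the out-of-bag (OOB) error estimate
importance(iris_rf)   # variable importance measures
```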
These are traditional ML models, inspired by the brain, that simulate the behavior of neurons: they receive an input, process it, and produce an output prediction.
For a long time their applicability was restricted to a few fields and problems, mainly because of their "black box" behavior, which makes them hard to interpret.
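A minimal sketch of a single-hidden-layer network, assuming the `nnet` package (an illustrative choice; the course materials may use other tools):

```r
# A minimal single-hidden-layer ANN sketch, assuming the 'nnet' package.
library(nnet)

set.seed(1)
# One hidden layer with 5 units; a softmax output layer is used automatically
# because the response (Species) is a factor with three levels.
iris_net <- nnet(Species ~ ., data = iris, size = 5, decay = 1e-3, maxit = 200)

# Confusion matrix on the training data
table(predicted = predict(iris_net, newdata = iris, type = "class"),
      observed  = iris$Species)
```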
This scenario has completely changed with the advent of deep neural networks, which are at the basis of many powerful applications of artificial intelligence.
Essentially, these are ANNs with multiple hidden layers, which allows them to overcome many of the limitations above. They can be tuned in a much more automatic way and have been applied to many complex tasks, such as computer vision, natural language processing, and recommender systems.
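A minimal sketch of stacking several hidden layers with the `keras` R interface (this assumes keras and TensorFlow are installed; the dataset, layer sizes, and number of epochs are illustrative, not a tuned model):

```r
# A minimal deep-learning sketch with the 'keras' R interface.
library(keras)

mnist   <- dataset_mnist()                                   # handwritten digits
x_train <- array_reshape(mnist$train$x, c(nrow(mnist$train$x), 784)) / 255
y_train <- to_categorical(mnist$train$y, 10)

# A feed-forward network with two hidden layers
model <- keras_model_sequential() %>%
  layer_dense(units = 256, activation = "relu", input_shape = 784) %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 10,  activation = "softmax")

model %>% compile(optimizer = "rmsprop",
                  loss      = "categorical_crossentropy",
                  metrics   = "accuracy")

model %>% fit(x_train, y_train,
              epochs = 5, batch_size = 128, validation_split = 0.2)
```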
References and resources:

- Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and regression trees. CRC Press.
- Greenwell, B. M. (2022). Tree-based methods for statistical learning in R (1st ed.). Chapman and Hall/CRC. https://doi.org/10.1201/9781003089032
- Efron, B., & Hastie, T. (2016). Computer age statistical inference. Cambridge University Press.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer.
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112). Springer.
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning (Vol. 1). MIT Press.
- LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
- Chollet, F. (2018). Deep learning with Python. Manning Publications.
- Chollet, F. (2023). Deep learning with R (2nd ed.). Manning Publications.