We select the final model from the decision tree, random forest, and XGBoost candidates based on their AUC scores. After that we prepare `df_full_train` and `df_test` to train and evaluate the final model. If there is not much difference between the model's AUC scores on the full train and test data, then the model has generalized the patterns well enough.
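As a minimal sketch of that final step, assuming XGBoost won on validation AUC: the snippet below retrains on `df_full_train` and scores `df_test`. The `default` target column, the feature encoding via `DictVectorizer`, and the specific hyperparameter values are illustrative assumptions, not the course's exact setup.

```python
import xgboost as xgb
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics import roc_auc_score

# Assumed: df_full_train / df_test exist with a binary 'default' target.
dv = DictVectorizer(sparse=False)

full_train_dicts = df_full_train.drop(columns=['default']).to_dict(orient='records')
test_dicts = df_test.drop(columns=['default']).to_dict(orient='records')

X_full_train = dv.fit_transform(full_train_dicts)
X_test = dv.transform(test_dicts)

y_full_train = df_full_train['default'].values
y_test = df_test['default'].values

features = list(dv.get_feature_names_out())
dfulltrain = xgb.DMatrix(X_full_train, label=y_full_train, feature_names=features)
dtest = xgb.DMatrix(X_test, feature_names=features)

# Hypothetical parameter values chosen for illustration.
xgb_params = {
    'eta': 0.1,
    'max_depth': 3,
    'min_child_weight': 1,
    'objective': 'binary:logistic',
    'eval_metric': 'auc',
    'seed': 1,
    'verbosity': 0,
}

model = xgb.train(xgb_params, dfulltrain, num_boost_round=175)

y_pred = model.predict(dtest)  # probabilities under binary:logistic
print('test auc:', roc_auc_score(y_test, y_pred))
```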
Generally, XGBoost models perform better on tabular data than other machine learning models, but the downside is that they are easy to overfit because of their large number of hyperparameters. Therefore, XGBoost models require much more attention to hyperparameter tuning to get the best out of them.
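One common way to do that tuning, in the spirit of the course, is to vary one parameter at a time and track validation AUC with a watchlist. The sketch below assumes `dtrain` and `dval` are `DMatrix` objects built from a separate train/validation split (not shown); the candidate `max_depth` values are arbitrary examples.

```python
import xgboost as xgb

# Assumed: dtrain / dval are DMatrix objects from a train/validation split.
watchlist = [(dtrain, 'train'), (dval, 'val')]

for max_depth in [3, 4, 6, 10]:
    params = {
        'eta': 0.1,
        'max_depth': max_depth,
        'min_child_weight': 1,
        'objective': 'binary:logistic',
        'eval_metric': 'auc',
        'seed': 1,
        'verbosity': 0,
    }
    # Collect per-round AUC for both splits instead of printing every round.
    evals_result = {}
    xgb.train(params, dtrain, num_boost_round=200,
              evals=watchlist, evals_result=evals_result,
              verbose_eval=False)
    best_val_auc = max(evals_result['val']['auc'])
    print(f'max_depth={max_depth}: best val auc={best_val_auc:.3f}')
```

The same loop can then be repeated for `eta` and `min_child_weight`, keeping the values that gave the best validation AUC.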
Add notes from the video (PRs are welcome)
The notes are written by the community. If you see an error here, please create a PR with a fix.