ic nested resampling (#1160)
* created ic_nested_resampling with pdf

* update to include date

* update recap on nested resampling

manuelhelmerichs authored Nov 14, 2023
1 parent a3e4dc8 commit fe203ae
Showing 3 changed files with 39 additions and 0 deletions.
Binary file added exercises-pdf/ic_nested_resampling.pdf
Binary file not shown.
18 changes: 18 additions & 0 deletions exercises/nested-resampling/ex_rnw/ex_recap_nested_resampling.Rnw
@@ -0,0 +1,18 @@
Assume we have a dataset $\D = \Dset$ with $n$ observations of a continuous target variable $y$ and $p$ features $x_1, \ldots, x_p$. We want to build a prediction model that can be deployed, and we want to estimate its generalization error. For this, we build a graph learner with a neural network in one arm and a random forest in the other. The neural network has one hyperparameter, the number of hidden layers; assume the number of nodes per hidden layer and all other hyperparameters are fixed. The random forest has two hyperparameters, the maximum depth and the number of trees; assume all other hyperparameters are fixed. In total, we pursue three goals (not necessarily in this order):
\begin{itemize}
\item[A)] Train a final model $\hat{f}$ that can be deployed.
\item[B)] Tune the graph learner.
\item[C)] Estimate the generalization error.
\end{itemize}

Answer the following questions:
\begin{itemize}
\item[1)] For each goal:
\begin{itemize}
\item[a)] Do we need resampling, nested resampling, or no resampling?
\item[b)] Which fraction of the available dataset can be used?
\end{itemize}
\item[2)] In which order (e.g., ``A-B-C'') can the three goals be tackled?
\item[3)] Write down a pseudo-algorithm for carrying out all three goals (in a sensible order as derived in 2)).
\item[4)] Assume the number of hidden layers takes values in $\{1,2,3,4,5\}$, the number of trees in $\{10,50,100,200\}$, and the maximum depth in $\{2,3,4,5\}$. Use 3-fold cross-validation as outer resampling and 4-fold cross-validation as inner resampling. Compute the total number of model trainings carried out in 3).
\end{itemize}
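
One way to sanity-check the count asked for in 4) is to mirror the nested resampling loops and count every fit. The sketch below is not part of the commit and rests on assumptions the exercise does not state: the tuner is an exhaustive grid search, each candidate configuration selects exactly one arm of the graph learner (so the two grids are added, not multiplied), the best candidate is refit once on each outer training set, and the final model is obtained by tuning once more on the full dataset and then fitting once. The names `CONFIGS`, `tune` and `count_trainings` are hypothetical.

```python
from itertools import product

# Values taken from question 4).
HIDDEN_LAYERS = [1, 2, 3, 4, 5]
NUM_TREES = [10, 50, 100, 200]
MAX_DEPTH = [2, 3, 4, 5]

# Assumption: a configuration selects exactly one arm of the graph learner,
# so the candidates are the union of both arms' grids, not their product.
CONFIGS = [("nn", h) for h in HIDDEN_LAYERS] + [
    ("rf", t, d) for t, d in product(NUM_TREES, MAX_DEPTH)
]


def tune(n_configs, inner_folds):
    """Count the fits of one exhaustive grid search with inner CV."""
    fits = 0
    for _ in range(n_configs):
        for _ in range(inner_folds):
            fits += 1  # train one candidate on the inner training folds
    return fits


def count_trainings(n_configs, outer_folds=3, inner_folds=4):
    """Count all fits: nested resampling first, then tuning and final fit."""
    fits = 0
    # Goal C: nested resampling to estimate the generalization error.
    for _ in range(outer_folds):
        fits += tune(n_configs, inner_folds)  # tune on the outer training set
        fits += 1  # refit the best candidate on the whole outer training set
    # Goals B and A: tune on the full dataset, then fit the final model.
    fits += tune(n_configs, inner_folds)
    fits += 1
    return fits


total = count_trainings(len(CONFIGS))
print(len(CONFIGS), total)  # prints "21 340"
```

Under these assumptions there are $5 + 4 \cdot 4 = 21$ candidates, so the count is $3 \cdot (21 \cdot 4 + 1) + (21 \cdot 4 + 1) = 340$. A different tuner (e.g., random search with a fixed budget) or a different treatment of the two arms would change the number.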
21 changes: 21 additions & 0 deletions exercises/nested-resampling/ic_nested_resampling.Rnw
@@ -0,0 +1,21 @@
% !Rnw weave = knitr

<<setup-child, include = FALSE>>=
library('knitr')
knitr::set_parent("../../style/preamble_ueb.Rnw")
@

\input{../../latex-math/basic-math.tex}
\input{../../latex-math/basic-ml.tex}
\input{../../latex-math/ml-ensembles.tex}
\input{../../latex-math/ml-hpo.tex}
\input{../../latex-math/ml-eval.tex}


\kopfic{}{Nested Resampling}


\aufgabe{Recap Nested Resampling}{
<<child="ex_rnw/ex_recap_nested_resampling.Rnw">>=
@
}
