add in missing chunk on WD and adjust chunk counter #321

Merged (2 commits) on Nov 26, 2024
content/chapters/15_regularization/15-09-wd.md (15 additions, 0 deletions)
@@ -0,0 +1,15 @@
+---
+title: "Chapter 15.09: Weight Decay and L2"
+weight: 15009
+---
+In this section, we show that L2 regularization with gradient descent is equivalent to weight decay and see how weight decay changes the optimization trajectory.
+
+<!--more-->
+
+### Lecture video
+
+{{< video id="xASHDEAWP0U" >}}
+
+### Lecture slides
+
+{{< pdfjs file="https://github.com/slds-lmu/lecture_sl/blob/main/slides-pdf/slides-regu-wd-vs-l2.pdf" >}}
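The equivalence stated in the new chapter holds for plain gradient descent (it breaks for adaptive optimizers such as Adam). A minimal numerical sketch, with illustrative data and variable names that are not taken from the lecture:

```python
import numpy as np

# Gradient descent on the L2-regularized risk R(w) + (lam/2)*||w||^2 uses
#   w <- w - eta * (grad R(w) + lam * w),
# which rearranges to the weight-decay update
#   w <- (1 - eta*lam) * w - eta * grad R(w).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

def grad_risk(w):
    # gradient of the mean-squared-error empirical risk
    return X.T @ (X @ w - y) / len(y)

eta, lam = 0.1, 0.01
w_l2 = np.zeros(3)
w_wd = np.zeros(3)
for _ in range(100):
    w_l2 = w_l2 - eta * (grad_risk(w_l2) + lam * w_l2)     # GD on penalized risk
    w_wd = (1 - eta * lam) * w_wd - eta * grad_risk(w_wd)  # weight decay

assert np.allclose(w_l2, w_wd)  # identical optimization trajectories
```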
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.09: Geometry of L2 Regularization"
-weight: 15009
+title: "Chapter 15.10: Geometry of L2 Regularization"
+weight: 15010
 ---
 In this section, we provide a geometric understanding of \\(L2\\) regularization, showing how parameters are shrunk according to the eigenvalues of the Hessian of empirical risk, and discuss its correspondence to weight decay.

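The eigenvalue-wise shrinkage described in this chapter can be checked numerically for a quadratic (squared-error) risk: each component of the unregularized minimizer, expressed in the Hessian's eigenbasis, is scaled by \\(s_i / (s_i + \lambda)\\). The data below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))
y = rng.normal(size=100)
lam = 0.5

H = X.T @ X / len(y)  # Hessian of the squared-error empirical risk
g = X.T @ y / len(y)
w_hat = np.linalg.solve(H, g)                        # unregularized minimizer
w_ridge = np.linalg.solve(H + lam * np.eye(4), g)    # L2-regularized minimizer

# Same ridge solution via the eigendecomposition of H:
s, V = np.linalg.eigh(H)
shrink = s / (s + lam)          # per-eigendirection shrinkage factors in (0, 1)
w_eig = V @ (shrink * (V.T @ w_hat))

assert np.allclose(w_ridge, w_eig)
```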
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.10: Geometry of L1 Regularization"
-weight: 15010
+title: "Chapter 15.11: Geometry of L1 Regularization"
+weight: 15011
 ---
 In this section, we provide a geometric understanding of \\(L1\\) regularization and show that it encourages sparsity in the parameter vector.

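The sparsity effect can be illustrated with the soft-thresholding operator that L1 regularization induces coordinate-wise (assuming an orthonormal design, a simplification; the coefficients below are made up):

```python
import numpy as np

def soft_threshold(w, lam):
    # L1 proximal operator: shifts coefficients toward zero by lam, clipping at zero
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

w_ls = np.array([3.0, -0.4, 0.05, -2.0])  # hypothetical least-squares coefficients
lam = 0.5

w_l1 = soft_threshold(w_ls, lam)  # small coefficients become exactly zero
w_l2 = w_ls / (1 + lam)           # ridge analogue: uniform rescaling, no zeros

assert np.count_nonzero(w_l1) == 2   # sparse
assert np.all(w_l2 != 0)             # dense
```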
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.11: Early Stopping"
-weight: 15011
+title: "Chapter 15.12: Early Stopping"
+weight: 15012
 ---
 In this section, we introduce early stopping and show how it can act as a regularizer.

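A minimal early-stopping loop, sketched under an assumed squared-error risk with a hypothetical patience rule (the lecture may formalize the stopping criterion differently):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 5))
w_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
y = X @ w_true + 0.5 * rng.normal(size=80)
X_tr, y_tr, X_va, y_va = X[:60], y[:60], X[60:], y[60:]

def risk(Xm, ym, w):
    return np.mean((Xm @ w - ym) ** 2)

w = np.zeros(5)
best_w, best_risk, patience, bad = w.copy(), np.inf, 10, 0
for step in range(5000):
    w = w - 0.01 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)  # gradient step
    r = risk(X_va, y_va, w)
    if r < best_risk:
        best_w, best_risk, bad = w.copy(), r, 0   # keep best iterate so far
    else:
        bad += 1
        if bad >= patience:  # no validation improvement for a while: stop early
            break

# the returned iterate is at least as good on validation data as the final one
assert best_risk <= risk(X_va, y_va, w)
```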
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.12: Details on Ridge Regression: Deep Dive"
-weight: 15012
+title: "Chapter 15.13: Details on Ridge Regression: Deep Dive"
+weight: 15013
 ---
 In this section, we consider Ridge regression as row-augmentation and as minimizing risk under feature noise. We also discuss the bias-variance tradeoff.

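The row-augmentation view of Ridge regression can be verified numerically: appending sqrt(lambda)*I rows to the design matrix and zeros to the response turns the ridge problem into ordinary least squares on the augmented data. A sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(40, 3))
y = rng.normal(size=40)
lam = 2.0

# Closed-form ridge solution: (X'X + lam*I)^{-1} X'y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# OLS on the row-augmented problem
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(3)])   # p extra pseudo-observations
y_aug = np.concatenate([y, np.zeros(3)])           # with zero responses
w_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

assert np.allclose(w_ridge, w_aug)
```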
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.13: Soft-thresholding and L1 regularization: Deep Dive"
-weight: 15013
+title: "Chapter 15.14: Soft-thresholding and L1 regularization: Deep Dive"
+weight: 15014
 ---
 In this section, we prove the previously stated proposition regarding soft-thresholding and L1 regularization.

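In the scalar case, the proposition in question states that soft-thresholding solves the one-dimensional L1-penalized least-squares problem. A sketch of the subgradient argument (notation assumed here, since the slides themselves are not reproduced in this diff):

```latex
\hat\theta(z) = \operatorname*{arg\,min}_{\theta}\ \tfrac{1}{2}(\theta - z)^2 + \lambda|\theta|,
\qquad \text{optimality: } 0 \in \hat\theta - z + \lambda\,\partial|\hat\theta|.
% Case \hat\theta > 0: \partial|\theta| = \{1\},    so \hat\theta = z - \lambda (valid iff z > \lambda).
% Case \hat\theta < 0: \partial|\theta| = \{-1\},   so \hat\theta = z + \lambda (valid iff z < -\lambda).
% Case \hat\theta = 0: \partial|\theta| = [-1, 1],  feasible iff |z| \le \lambda.
\hat\theta(z) = \operatorname{sign}(z)\,\max\bigl(|z| - \lambda,\ 0\bigr).
```

Combining the three mutually exclusive cases gives exactly the soft-thresholding operator, which is why small coefficients are set to exactly zero.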