Commit

Merge pull request #321 from slds-lmu/add-regu-wd-chunk
add in missing chunk on WD and adjust chunk counter
chriskolb authored Nov 26, 2024
2 parents 0a7f9cb + 4c8f4cb commit 68f8f75
Showing 6 changed files with 25 additions and 10 deletions.
15 changes: 15 additions & 0 deletions content/chapters/15_regularization/15-09-wd.md
@@ -0,0 +1,15 @@
+---
+title: "Chapter 15.09: Weight decay and L2"
+weight: 15009
+---
+In this section, we show that L2 regularization with gradient descent is equivalent to weight decay and see how weight decay changes the optimization trajectory.
+
+<!--more-->
+
+### Lecture video
+
+{{< video id="xASHDEAWP0U" >}}
+
+### Lecture slides
+
+{{< pdfjs file="https://github.com/slds-lmu/lecture_sl/blob/main/slides-pdf/slides-regu-wd-vs-l2.pdf" >}}
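The equivalence stated in the new chapter summary follows from a one-line computation. As a sketch in standard notation (step size alpha, regularization strength lambda; not copied from the slides):

```latex
% One gradient-descent step on the L2-regularized empirical risk
% R_reg(theta) = R_emp(theta) + (lambda / 2) * ||theta||_2^2
\theta_{t+1}
  = \theta_t - \alpha \nabla_\theta \left( \mathcal{R}_{\mathrm{emp}}(\theta_t)
    + \tfrac{\lambda}{2} \lVert \theta_t \rVert_2^2 \right)
  = (1 - \alpha\lambda)\,\theta_t - \alpha \nabla_\theta \mathcal{R}_{\mathrm{emp}}(\theta_t)
```

The factor \\((1 - \alpha\lambda)\\) multiplicatively decays the weights before the ordinary gradient step is applied, which is exactly the weight-decay update.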
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.09: Geometry of L2 Regularization"
-weight: 15009
+title: "Chapter 15.10: Geometry of L2 Regularization"
+weight: 15010
 ---
 In this section, we provide a geometric understanding of \\(L2\\) regularization, showing how parameters are shrunk according to the eigenvalues of the Hessian of empirical risk, and discuss its correspondence to weight decay.
 
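The shrinkage mentioned in this chapter summary has a compact closed form. Under a quadratic approximation of the empirical risk around its minimizer, with Hessian eigendecomposition \\(H = Q \operatorname{diag}(\sigma_i) Q^\top\\), a standard textbook result (stated here as a sketch, not the lecture's exact notation) is:

```latex
\hat\theta_{\mathrm{ridge}}
  = (H + \lambda I)^{-1} H \, \hat\theta
  = Q \,\operatorname{diag}\!\left( \frac{\sigma_i}{\sigma_i + \lambda} \right) Q^\top \hat\theta
```

Along each eigenvector \\(q_i\\), the corresponding component of \\(\hat\theta\\) is scaled by \\(\sigma_i / (\sigma_i + \lambda)\\): high-curvature directions are barely affected, while low-curvature directions are shrunk strongly toward zero.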
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.10: Geometry of L1 Regularization"
-weight: 15010
+title: "Chapter 15.11: Geometry of L1 Regularization"
+weight: 15011
 ---
 In this section, we provide a geometric understanding of \\(L1\\) regularization and show that it encourages sparsity in the parameter vector.
 
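The sparsity claim is easy to observe numerically. A minimal illustration in Python (hypothetical toy data and scikit-learn estimators; not part of the lecture materials):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Toy data: 10 features, of which only the first 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X @ true_coef + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty
ridge = Ridge(alpha=0.1).fit(X, y)   # L2 penalty

# L1 tends to set irrelevant coefficients exactly to zero;
# L2 only shrinks them, so they stay small but nonzero.
print("nonzero coefficients, lasso:", int(np.sum(lasso.coef_ != 0)))
print("nonzero coefficients, ridge:", int(np.sum(ridge.coef_ != 0)))
```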
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.11: Early Stopping"
-weight: 15011
+title: "Chapter 15.12: Early Stopping"
+weight: 15012
 ---
 In this section, we introduce early stopping and show how it can act as a regularizer.
 
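For reference, the mechanism itself is a short loop. A generic sketch in Python (the helpers `train_step` and `val_loss` are hypothetical placeholders, not the lecture's code):

```python
import copy

def train_with_early_stopping(model, train_step, val_loss,
                              max_epochs=100, patience=5):
    """Train until validation loss stops improving for `patience` epochs."""
    best_loss = float("inf")
    best_model = copy.deepcopy(model)
    epochs_without_improvement = 0
    for _ in range(max_epochs):
        train_step(model)          # one epoch of training, in place
        loss = val_loss(model)     # loss on a held-out validation set
        if loss < best_loss:
            best_loss = loss
            best_model = copy.deepcopy(model)
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break              # stop before overfitting sets in
    return best_model              # parameters from the best epoch
```

Returning the best checkpoint rather than the final one is what gives early stopping its regularizing effect: training halts while the parameters are still relatively close to their initialization.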
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.12: Details on Ridge Regression: Deep Dive"
-weight: 15012
+title: "Chapter 15.13: Details on Ridge Regression: Deep Dive"
+weight: 15013
 ---
 In this section, we consider Ridge regression as row-augmentation and as minimizing risk under feature noise. We also discuss the bias-variance tradeoff.
 
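The row-augmentation view can be checked numerically in a few lines. A sketch in Python with NumPy, on random toy data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, lam = 50, 4, 2.0
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# Closed-form ridge estimator: (X'X + lam * I)^{-1} X'y
theta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Row-augmentation: stack sqrt(lam) * I below X and p zeros below y,
# then run ordinary least squares on the augmented system.
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
theta_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(theta_ridge, theta_aug))  # True: the two solutions agree
```

The augmented squared error equals \\(\lVert y - X\theta \rVert^2 + \lambda \lVert \theta \rVert^2\\), so OLS on the augmented data is exactly ridge regression.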
@@ -1,6 +1,6 @@
 ---
-title: "Chapter 15.13: Soft-thresholding and L1 regularization: Deep Dive"
-weight: 15013
+title: "Chapter 15.14: Soft-thresholding and L1 regularization: Deep Dive"
+weight: 15014
 ---
 In this section, we prove the previously stated proposition regarding soft-thresholding and L1 regularization.
 
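The operator at the heart of that proposition is small enough to state inline. A sketch in Python (standard definition of soft-thresholding; the function name is ours, not the lecture's):

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: S_t(z) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Entries with magnitude below t are set exactly to zero;
# larger entries are shrunk toward zero by t.
print(soft_threshold(np.array([-3.0, -0.5, 0.2, 2.0]), t=1.0))
```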
