Unit 6.3 #91
Unanswered
regina-grg asked this question in Lecture videos and quizzes
Replies: 0 comments
1. Unit 6.3 - SGD with Momentum in Part One:
I'm struggling to understand the algorithm presented for SGD with momentum. From what I've gathered, the gradients appear to be overwritten by the previous update at each step rather than accumulated. If that's the case, why do we need to recompute the gradients at each iteration, since they are being replaced anyway?
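For what it's worth, here is a minimal sketch of the usual momentum formulation (hyperparameter values are my own assumptions, not taken from the lecture). The gradient is recomputed fresh each step, but it is folded into the velocity term rather than discarded, so the velocity — not the gradient — is what accumulates:

```python
def sgd_momentum_step(w, v, grad, lr=0.01, beta=0.9):
    """One momentum update: v <- beta*v + grad; w <- w - lr*v."""
    v = beta * v + grad  # velocity accumulates past gradients (not overwritten)
    w = w - lr * v       # the step uses the accumulated velocity
    return w, v

# Toy example: a constant gradient of 1.0 on a scalar weight.
w, v = 0.0, 0.0
for _ in range(3):
    w, v = sgd_momentum_step(w, v, grad=1.0)
# v grows across steps (1.0, 1.9, 2.71), showing the accumulation.
```

So the fresh gradient at each iteration is still needed: it is the new information being blended into the running velocity.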
2. Unit 6.3 - Part Two on RMSprop and Adam:
I noticed that the update equations for both RMSprop and Adam are given as $w_t \leftarrow w_t - \alpha(\ldots)$. Shouldn't it be $w_t \leftarrow w_{t-1} - \alpha(\ldots)$, since we are updating based on the previous weights?
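One way to see why the subscript question is mostly notational: in code, the $t$ vs. $t-1$ distinction disappears, because `w` on the right-hand side of the assignment holds the previous value ($w_{t-1}$) and the assignment produces the new one ($w_t$). A hedged sketch of the standard Adam update (hyperparameter defaults assumed, not taken from the lecture):

```python
import math

def adam_step(w, m, v, grad, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on a scalar weight; t is the 1-based step count."""
    m = b1 * m + (1 - b1) * grad       # first-moment estimate
    v = b2 * v + (1 - b2) * grad**2    # second-moment estimate
    m_hat = m / (1 - b1**t)            # bias correction
    v_hat = v / (1 - b2**t)
    # Here `w` on the right is w_{t-1}; the assignment yields w_t.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = 1.0, 0.0, 0.0
w, m, v = adam_step(w, m, v, grad=2.0, t=1)
```

So the written equation is arguably cleaner as $w_t \leftarrow w_{t-1} - \alpha(\ldots)$, but an in-place assignment implements both readings identically.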