Unit 6.3 #91
Unanswered
regina-grg asked this question in Lecture videos and quizzes
Replies: 0 comments
1. Unit 6.3 - SGD with Momentum in Part One:
I'm struggling to understand the algorithm presented for SGD with momentum. From what I've gathered, the gradients appear to be overwritten by the previous update at each step rather than accumulated. If that's the case, why do we need to recompute the gradients at each iteration, since they are being replaced anyway?
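For what it's worth, here is a minimal sketch of the usual momentum formulation (hyperparameter values are my own assumptions, not taken from the lecture). The gradient is recomputed fresh each step, but it is folded into the velocity term rather than discarded, so the velocity — not the gradient — is what accumulates:

```python
def sgd_momentum_step(w, v, grad, lr=0.01, beta=0.9):
    """One momentum update: v <- beta*v + grad; w <- w - lr*v."""
    v = beta * v + grad  # velocity accumulates past gradients (not overwritten)
    w = w - lr * v       # the step uses the accumulated velocity
    return w, v

# Toy example: a constant gradient of 1.0 on a scalar weight.
w, v = 0.0, 0.0
for _ in range(3):
    w, v = sgd_momentum_step(w, v, grad=1.0)
# v grows across steps (1.0, 1.9, 2.71), showing the accumulation.
```

So the fresh gradient at each iteration is still needed: it is the new information being blended into the running velocity.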
2. Unit 6.3 - Part Two on RMSprop and Adam:
I noticed that the update equations for both RMSprop and Adam are given as $w_t \leftarrow w_t - \alpha(\ldots)$. Shouldn't it be $w_t \leftarrow w_{t-1} - \alpha(\ldots)$, since we are updating based on the previous weights?
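One way to see why the subscript question is mostly notational: in code, the $t$ vs. $t-1$ distinction disappears, because `w` on the right-hand side of the assignment holds the previous value ($w_{t-1}$) and the assignment produces the new one ($w_t$). A hedged sketch of the standard Adam update (hyperparameter defaults assumed, not taken from the lecture):

```python
import math

def adam_step(w, m, v, grad, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on a scalar weight; t is the 1-based step count."""
    m = b1 * m + (1 - b1) * grad       # first-moment estimate
    v = b2 * v + (1 - b2) * grad**2    # second-moment estimate
    m_hat = m / (1 - b1**t)            # bias correction
    v_hat = v / (1 - b2**t)
    # Here `w` on the right is w_{t-1}; the assignment yields w_t.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = 1.0, 0.0, 0.0
w, m, v = adam_step(w, m, v, grad=2.0, t=1)
```

So the written equation is arguably cleaner as $w_t \leftarrow w_{t-1} - \alpha(\ldots)$, but an in-place assignment implements both readings identically.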