During backpropagation, at t=2, when he calculates the gradient of the hidden weights, he writes this:
h_weight_grad += hiddens[1,:][:,np.newaxis] @ h2_grad
which takes the hidden state at t=1 into account. But later, when he backprops at t=1, he writes:
h_weight_grad += hiddens[1,:][:,np.newaxis] @ h1_grad
It's not clear to me why he uses the t=1 hidden state again: shouldn't it be hiddens[0,:] since that gradient is sensitive to the previous hidden state?
Thanks in advance to whoever may help me.