In your paper, the rewarder network is modeled as a simple feed-forward neural network. When I tried to understand it through this code, I found that it is modeled as an LSTM. The reward value comes from the LSTM network's prediction each time. Why?