Bug reports and problems #10
Comments
Yes, you are totally right. I should have corrected my code, but I have been too busy lately. Thank you for pointing out the bugs.
@Liu0329 According to Eq. (7) in the original paper, shouldn't the logits be computed as `W^T concat(weighted_context, word_emb, h)`?
@RutgersHan Have you trained a usable model?
@Liu0329 I did not use the attention model for image captioning; I used it for another task. For me the result is not very good: it is only slightly better than the same model without attention. I am not sure whether that is due to my implementation or something else, which is why I asked the previous question: do you think there is a bug in the logits computation in this implementation?
@RutgersHan Yes, according to the paper the logits have 3 parts, as you showed. Now I will try training again. This repo seems unfinished; many parts are simplified compared with the original code. We can help to improve it.
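To make the three-part logits concrete, here is a minimal NumPy sketch of the computation described above. The dimensions and names (`hid_dim`, `emb_dim`, `ctx_dim`, `W`) are illustrative assumptions, not the repo's actual variables:

```python
import numpy as np

rng = np.random.default_rng(0)
hid_dim, emb_dim, ctx_dim, vocab = 4, 3, 5, 7  # illustrative sizes

h = rng.standard_normal(hid_dim)                 # current LSTM hidden state
word_emb = rng.standard_normal(emb_dim)          # embedding of previous word
weighted_context = rng.standard_normal(ctx_dim)  # attention-weighted features

# One projection over the concatenation of all three parts,
# i.e. logits = W^T concat(weighted_context, word_emb, h):
x = np.concatenate([weighted_context, word_emb, h])
W = rng.standard_normal((ctx_dim + emb_dim + hid_dim, vocab))
logits = W.T @ x  # shape (vocab,)
```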
@Liu0329 Hello, I am currently reading the code; thanks for the post! For the alpha calculation, aren't we supposed to use the context (`context_encode`) and the previous hidden state? But in the implementation, it seems to be summing up all the previous hidden states: `context_encode = context_encode +` Thanks!
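For comparison, the paper's soft attention computes a score for each annotation vector from that vector and only the previous hidden state, then normalizes with a softmax. A hedged NumPy sketch (the names `Wa`, `Wh`, `v` and all sizes are illustrative assumptions, not the repo's code):

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(1)
L, D, H, A = 6, 5, 4, 3      # regions, feature dim, hidden dim, attention dim

a = rng.standard_normal((L, D))   # annotation vectors (the context)
h_prev = rng.standard_normal(H)   # previous hidden state only
Wa = rng.standard_normal((D, A))
Wh = rng.standard_normal((H, A))
v = rng.standard_normal(A)

# e_i = f_att(a_i, h_prev); alpha = softmax(e)
e = np.tanh(a @ Wa + h_prev @ Wh) @ v  # one score per image region
alpha = softmax(e)                     # attention weights, sum to 1
```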
@Liu0329
@jazzsaxmafia
Have you trained a good model with your code?
There might be a bug:
In the functions `build_model` and `build_generator` in `model_tensorflow.py`,
`h = o * tf.nn.tanh(new_c)`
should be replaced by `h = o * tf.nn.tanh(c)`.
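For reference, in the textbook LSTM formulation the hidden state is computed from the *updated* cell state, so whether this line is a bug depends on what `c` and `new_c` name in the repo's code. A minimal NumPy sketch of a standard LSTM step (all weights and sizes are made up for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One textbook LSTM step; W maps [x; h_prev] to the four gates."""
    z = np.concatenate([x, h_prev]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    new_c = f * c_prev + i * g       # updated cell state
    h = o * np.tanh(new_c)           # h uses the *updated* cell state
    return h, new_c

rng = np.random.default_rng(2)
dim = 3
W = rng.standard_normal((2 * dim, 4 * dim))
b = np.zeros(4 * dim)
h, c = lstm_step(rng.standard_normal(dim), np.zeros(dim), np.zeros(dim), W, b)
```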
Another problem concerns `context_encode`. Is it the same as in the original code?
Moreover, I think the data should be shuffled at each epoch; the code seems to shuffle it only once.
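Per-epoch shuffling only needs a fresh permutation inside the epoch loop. A hedged sketch (`data` and `num_epochs` are placeholders, not the repo's variables):

```python
import numpy as np

data = np.arange(10)        # stands in for the training examples
num_epochs = 3
rng = np.random.default_rng(3)

for epoch in range(num_epochs):
    order = rng.permutation(len(data))  # fresh permutation each epoch
    for idx in order:
        batch = data[idx]               # train on this example/batch
```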