
Bugs report and problems #10

Open

Liu0329 opened this issue Jun 29, 2016 · 7 comments

Comments

@Liu0329

Liu0329 commented Jun 29, 2016

@jazzsaxmafia
Have you trained a good model with your code?

There might be a bug:
In the build_model and build_generator functions in model_tensorflow.py,

h = o * tf.nn.tanh(new_c)

should be replaced by

h = o * tf.nn.tanh(c)

Another question is about context_encode: is it the same as in the original code?
Moreover, I think the data should be shuffled every epoch; the code seems to shuffle it only once.
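
For reference, here is a minimal sketch of the LSTM step in question, assuming the pre-activation is split into i, f, o, new_c as in model_tensorflow.py; the shapes and the modern tf.split argument order are illustrative, not the repo's verbatim code.

# lstm_preactive: [batch, 4*dim_hidden]; c: previous cell state (illustrative names)
i, f, o, new_c = tf.split(lstm_preactive, 4, axis=1)
i = tf.nn.sigmoid(i)       # input gate
f = tf.nn.sigmoid(f)       # forget gate
o = tf.nn.sigmoid(o)       # output gate
new_c = tf.nn.tanh(new_c)  # candidate cell value, not the updated cell state
c = f * c + i * new_c      # updated cell state
h = o * tf.nn.tanh(c)      # hidden state comes from the updated cell c, not the candidate new_c

And a small sketch of per-epoch shuffling (np.random.permutation is standard NumPy; the array names are illustrative):

for epoch in range(n_epochs):
    perm = np.random.permutation(len(captions))
    captions, feats = captions[perm], feats[perm]
    # ... then iterate over minibatches of the reshuffled data ...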

@jazzsaxmafia
Owner

Yes, you are totally right. I should have corrected my code, but I have been too busy lately.
I will work on it as soon as possible.

Thank you for pointing the bugs out.
-Taeksoo

@RutgersHan

RutgersHan commented Aug 11, 2016

@Liu0329
@jazzsaxmafia
Agree with Liu0329.
Isn't there another bug on the next line?

logits = tf.matmul(h, self.decode_lstm_W) + self.decode_lstm_b

should be replaced by

logits = tf.matmul(h, self.decode_lstm_W) + self.decode_lstm_b \
         + tf.matmul(weighted_context, self.decode_lstm_image_W) \
         + tf.matmul(word_emb, self.decode_lstm_word_W)

According to Eq. (7) in the original paper, shouldn't the logits be computed as W^T concat(weighted_context, word_emb, h)?
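
For what it's worth, the sum of the three matmuls above is equivalent to a single projection of the concatenated vector, which is how Eq. (7) writes it. A hedged sketch (L_o and b_o are illustrative names, standing for the stacked decode weights and bias):

deep_out = tf.concat([h, weighted_context, word_emb], axis=1)  # [batch, dim_h + dim_ctx + dim_emb]
logits = tf.matmul(deep_out, L_o) + b_o                        # L_o stacks decode_lstm_W, decode_lstm_image_W, decode_lstm_word_W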

@Liu0329
Author

Liu0329 commented Aug 12, 2016

@RutgersHan Have you trained a model that works well?

@RutgersHan

@Liu0329 I did not use the attention model for image captioning; I used it for another task. For me, the results were not very good, only slightly better than the same model without attention. I am not sure whether that is due to my implementation or something else, which is why I asked the question above: do you think there is a bug in the logits computation in this implementation?

@Liu0329
Author

Liu0329 commented Aug 20, 2016

@RutgersHan Yes, according to the paper the logits have the three parts you showed. I will now try training again. This repo seems unfinished, with many parts simplified compared with the original code. We can help improve it.

@2g-XzenG

@Liu0329
@jazzsaxmafia
@RutgersHan

Hello, I am currently reading the code, and thanks for the post! For the 'alpha' calculation, aren't we supposed to use the context ('context_encode') and the previous hidden state? But in the implementation, it seems that it is summing up all the previous hidden states:

context_encode = context_encode + \
    tf.expand_dims(tf.matmul(h, self.hidden_att_W), 1) + \
    self.pre_att_b
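
For comparison, here is a hedged sketch of one soft-attention step following the paper; context_encode, hidden_att_W, and pre_att_b are the names from the snippet above, while att_W, att_b, context, L, and dim are illustrative (e.g. L annotation locations, dim hidden size). It recomputes the score from the original projected context at every step rather than accumulating into context_encode.

att = tf.nn.tanh(context_encode                                  # [batch, L, dim]
                 + tf.expand_dims(tf.matmul(h, self.hidden_att_W), 1)
                 + self.pre_att_b)
e = tf.reshape(tf.matmul(tf.reshape(att, [-1, dim]), self.att_W) + self.att_b,
               [-1, L])                                          # unnormalized scores, [batch, L]
alpha = tf.nn.softmax(e)                                         # attention weights over the L locations
weighted_context = tf.reduce_sum(context * tf.expand_dims(alpha, 2), axis=1)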

Thanks!

@sjksong

sjksong commented Mar 3, 2019

@Liu0329
@jazzsaxmafia
@RutgersHan
@1230pitchanqw
Hello, I'm an undergraduate from Communication University of China, and I'm currently working on this area for my graduation thesis. Could you please tell me whether you have obtained a satisfactory result based on this repo? If so, how many epochs did you need to train a good model, and what else did you change from the original code?
I would appreciate it if you could give me a reply. Thank you very much.
