The performance during training is always the same #95
How long did you train to get 64% accuracy?
Sir, I have been training the model for more than 24 hours and the performance did not change. It started at 64% and remained there.
After 6 to 7 hours of training, my correction rate was 0%. It started at 0 and remained 0 after all that training. I'm training on a GTX 1060 GPU. Any suggestions?
I have the same issue. Could you fix it?
Decrease your learning rate.
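For reference, the rate is set where the optimizer is built (around line 175 of train.py, quoted later in this thread). A minimal sketch of lowering it, assuming 1e-4 as an illustrative smaller value rather than the project's default:

```python
# train.py, around line 175: a smaller step size can help when the
# loss barely moves and accuracy sits at a constant value.
learn_rate = 1e-4  # illustrative value, not the repo's default
train_step = tf.train.AdamOptimizer(learn_rate).minimize(loss)
```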
How did you change the batch_size? It only takes the first 50 images.
@Abduoit Around line 265 of train.py there is a parameter to the train method called batch_size.
Thanks @WaGjUb, I found this line in train.py.
I changed it to this:
But I don't think this is correct. Any suggestions, please? Should I leave it as it is?
Yes! I think so, because training tries to minimize the loss, as you can see around line 175: "train_step = tf.train.AdamOptimizer(learn_rate).minimize(loss)"
I don't think you must do it, but it will work as well. You just made your test batch size the same as the training batch size.
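For anyone else hunting for the spot, a minimal sketch of what the call around line 265 of train.py looks like (the learn_rate and report_steps values here are assumptions for illustration; batch_size=50 matches the "first 50 images" mentioned above):

```python
# train.py, around line 265: the call that kicks off training.
# Raising batch_size feeds more images per step; values are illustrative.
train(learn_rate=0.001,   # assumed default; lower this if the loss plateaus
      report_steps=20,    # assumed; how often the accuracy line is printed
      batch_size=50)      # the "first 50 images" noted above; other args omitted
```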
I think there is an error in the get_loss function: def get_loss(y, y_). If I understand right, "y" is the predictions and "y_" the labels, so the call should be tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y). So get_loss has a bug and the argument order should be reversed, if I am not wrong. Let me know if there is a mistake in my reasoning.
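A sketch of the keyword-argument form of that call, which makes the intended order explicit regardless of the positional signature of the installed TF 1.x version (the reduce_sum aggregation below is an assumption for illustration, not the repo's exact code):

```python
import tensorflow as tf

def get_loss(y, y_):
    # y: raw network outputs (logits); y_: one-hot ground-truth labels.
    # Keyword arguments leave no doubt about which tensor is which.
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        labels=y_, logits=y)
    # Collapse per-example losses to a scalar (assumed aggregation).
    return tf.reduce_sum(cross_entropy)
```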
Dear all,
I am trying to train the model on Windows 10 (CPU). The problem I am finding is that the performance doesn't change at all, even though the cost changes a little. If I rerun the training, the performance values change, but then remain constant again. Here is a snippet:
```
B7860 64.00% 64.00% loss: 10794.2373046875 (digits: 1051.9578857421875, presence: 9742.279296875) | X X XX X X XXX X X XX X X X XX |
time for 60 batches 324.8394412994385
PV73LEX 0.0 <-> QM69OTK 0.0
KZ48OUS 1.0 <-> QM69OTK 0.0
XF10UGX 0.0 <-> QM69OTK 0.0
HP51SYY 0.0 <-> QM69OTK 0.0
MQ82HOD 0.0 <-> QM69OTK 0.0
YF62RYQ 0.0 <-> QM69OTK 0.0
LE19HIO 0.0 <-> QM69OTK 0.0
XG44DHU 1.0 <-> QM69OTK 0.0
WM08RYQ 0.0 <-> QM69OTK 0.0
TZ23KIA 0.0 <-> QM69OTK 0.0
FB39LOJ 1.0 <-> QM69OTW 0.0
CP55DID 1.0 <-> QM69OTK 0.0
PN26VBI 0.0 <-> QM69OTK 0.0
FO65FUI 0.0 <-> QM69OTK 0.0
OP09YVZ 1.0 <-> QM69OTK 0.0
SK87TTT 0.0 <-> QM69OTK 0.0
EE78HSB 0.0 <-> QM69OTK 0.0
NM15DHP 1.0 <-> QM69OTK 0.0
WY52RKZ 0.0 <-> QM69OTK 0.0
AE21YYQ 0.0 <-> QM39OTK 0.0
AT37NOB 0.0 <-> QM69OTK 0.0
DD97XRW 0.0 <-> QM69OTK 0.0
DV44XSO 0.0 <-> QM69OTK 0.0
EX56ARF 1.0 <-> QM69OTK 0.0
RN63AOR 1.0 <-> QM69OTK 0.0
SQ19HKQ 1.0 <-> QM69OTK 0.0
QL68VPS 0.0 <-> QM69OTK 0.0
UJ87YEA 0.0 <-> QM69OTK 0.0
VN48ULX 1.0 <-> QM69OTK 0.0
DG23BSJ 0.0 <-> QM69OTK 0.0
GD77UFQ 0.0 <-> QM69OTK 0.0
RN27AOA 0.0 <-> QM69OTK 0.0
QX18QPV 0.0 <-> QM69OTK 0.0
KQ35RDE 1.0 <-> QM69OTK 0.0
IF80QMX 0.0 <-> QM69OTK 0.0
CE21AVV 1.0 <-> QM69OTK 0.0
UB26TQZ 1.0 <-> QM69OTK 0.0
EI30JGL 0.0 <-> QM69OTK 0.0
OU28NEY 1.0 <-> QM69OTK 0.0
MN01XZT 0.0 <-> QM69OTK 0.0
WK15APF 0.0 <-> QM69OTK 0.0
SS66HYB 1.0 <-> QM69OTK 0.0
NW44SQL 0.0 <-> QM69OTK 0.0
XI75LCF 0.0 <-> QM69OTK 0.0
IQ93XRG 0.0 <-> QM69OTK 0.0
NJ17XKK 1.0 <-> QM69OTK 0.0
MV55MGF 0.0 <-> QM69OTK 0.0
DK30EQB 1.0 <-> QM69OTK 0.0
WO74RMB 1.0 <-> QM69OTK 0.0
HV08HRX 0.0 <-> QM69OTK 0.0
B7880 64.00% 64.00% loss: 10789.783203125 (digits: 1051.4071044921875, presence: 9738.3759765625) | X X XX X X XXX X X XX X X X XX |
time for 60 batches 319.3657536506653
```
Has anyone had a similar issue?
Thank you in advance,
Best