loss output: is this normal? #16

Open
askerlee opened this issue Oct 19, 2017 · 5 comments

askerlee commented Oct 19, 2017

Hi, I'm using ResNet101_BN_SCALE_Merged_OHEM on my own dataset. Some of the output losses (loss_bbox and loss_cls) are always 0.

Update: it seems there is something wrong with OHEM. When I turn off OHEM, everything is normal.

I1019 22:34:34.436921 14581 solver.cpp:229] Iteration 760, loss = 0.0427504
I1019 22:34:34.436954 14581 solver.cpp:245]     Train net output #0: loss_bbox = 0 (* 1 = 0 loss)
I1019 22:34:34.436959 14581 solver.cpp:245]     Train net output #1: loss_cls = 0 (* 1 = 0 loss)
I1019 22:34:34.436962 14581 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.0208707 (* 1 = 0.0208707 loss)
I1019 22:34:34.436965 14581 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.00372629 (* 1 = 0.00372629 loss)

The output with OHEM turned off:

I1020 14:29:00.407395 19371 solver.cpp:245]     Train net output #0: loss_bbox = 0.652186 (* 1 = 0.652186 loss)
I1020 14:29:00.407400 19371 solver.cpp:245]     Train net output #1: loss_cls = 0.654309 (* 1 = 0.654309 loss)
I1020 14:29:00.407404 19371 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.113032 (* 1 = 0.113032 loss)
I1020 14:29:00.407408 19371 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.0568502 (* 1 = 0.0568502 loss)

whmin commented Nov 8, 2017

@askerlee @Eniac-Xie Could you please release the "ResNet101_BN_SCALE_Merged_OHEM" model files, including test.prototxt? The files I downloaded do not contain it, and I have no time to write it myself because of an emergency. Thank you very much!


askerlee commented Nov 9, 2017

Just copy the test.prototxt from the ResNet101_BN_SCALE_Merged folder. They are the same (hard example mining only happens in training, so the test model is the same).
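
To illustrate why hard example mining affects only training, here is a minimal sketch of the selection step, assuming per-RoI class probabilities and box losses are already available. The function name `select_hard_rois` and its inputs are hypothetical and not taken from this repository; a full OHEM implementation may also deduplicate overlapping RoIs before ranking them.

```python
import numpy as np

def select_hard_rois(cls_prob, labels, bbox_loss, batch_size):
    """Return indices of the batch_size RoIs with the highest combined loss.

    Hypothetical helper for illustration only, not the repository's code.
    cls_prob:  (N, C) softmax probabilities per RoI
    labels:    (N,) ground-truth class per RoI (may arrive as floats)
    bbox_loss: (N,) box regression loss per RoI
    """
    labels = labels.astype(np.int64)            # class indices must be integers
    rows = np.arange(cls_prob.shape[0])
    # Per-RoI cross-entropy; the small constant avoids log(0).
    cls_loss = -np.log(cls_prob[rows, labels] + 1e-12)
    total = cls_loss + bbox_loss
    order = np.argsort(-total)                  # sort by descending loss
    return order[:batch_size]
```

Only the selected RoIs would contribute gradients during training. At test time there are no labels and no loss to rank, so no selection takes place and the network definition can be shared.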


whmin commented Nov 9, 2017

OK, thank you! Now I get an error when running ./experiments/scripts/faster_rcnn_end2end.sh 0 ResNet-50 pascal_voc, like this:
screenshot from 2017-11-08 21-08-43
I did not change the original ResNet-50 OHEM code, except for replacing "num_classes" and "num_output". I cannot solve it; could you help me?


askerlee commented Nov 9, 2017

Change cls_prob[i,label] to cls_prob[i,int(label)] in lib/roi_data_layer/layer.py:242.
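
The screenshot is not reproduced here, so the exact error is unknown. One plausible cause, consistent with this fix, is that the labels arrive as floats (Caffe blobs store floats by default) and recent NumPy versions reject float indices. The sketch below uses made-up arrays as stand-ins for the ones in layer.py; the helper names are hypothetical.

```python
import numpy as np

# Made-up stand-ins for the arrays in layer.py (values are illustrative).
cls_prob = np.array([[0.1, 0.7, 0.2],
                     [0.6, 0.3, 0.1]], dtype=np.float32)
labels = np.array([1.0, 0.0], dtype=np.float32)  # float-typed class labels

def prob_of_label(cls_prob, labels, i):
    """Probability assigned to the ground-truth class of RoI i."""
    label = labels[i]
    return cls_prob[i, int(label)]  # the int() cast makes the index valid

def float_index_fails(cls_prob, labels, i):
    """Return True if indexing with the raw float label raises IndexError."""
    try:
        cls_prob[i, labels[i]]
    except IndexError:
        return True
    return False
```

On a NumPy version that rejects float indices, `float_index_fails(cls_prob, labels, 0)` returns True, while `prob_of_label` works because of the cast.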

@oysz2016

Hi, I have encountered the same problem. Have you solved it? I think it may be a problem in the code. When I used my own modified OHEM code to train with the author's prototxt file, the loss was not 0, but training was difficult to converge (a problem that did not occur with VGG16).
