Base training accuracy(phase 1) #25

Open
amajee11us opened this issue Dec 21, 2020 · 6 comments

Comments

@amajee11us

I was trying to generate the base-class accuracy numbers after phase 1 training on the VOC dataset.
Unfortunately, I get the error below. Is there a way to get the accuracy numbers from phase 1 training?
lib/model/faster_rcnn/faster_rcnn.py:210: UserWarning: Implicit dimension choice for softmax has been deprecated. Change the call to include dim=X as an argument.
cls_prob = F.softmax(cls_score)
Traceback (most recent call last):
File "test.py", line 344, in
inds = torch.nonzero(scores[:, j] > thresh).view(-1)
IndexError: index 16 is out of bounds for dimension 1 with size 16

@YoungXIAO13
Owner

Hi @amajee11us

This happens because only 15 classes are used in the base-class training stage, while num_class is set to 20 in the dataset.

To work around this issue, you can simply set num_cls to 15 when testing the accuracy of phase 1.
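
To illustrate the mismatch, here is a minimal sketch in plain Python (no PyTorch) of why the per-class loop runs off the end of the score matrix. The names `count_detections`, `scores`, and `NUM_DATASET_CLASSES` are hypothetical stand-ins, not the actual variables in test.py. The 16 score columns follow the size reported in the traceback above, and 21 (20 VOC classes plus background) is an assumed value for the dataset-side class count.

```python
# Hypothetical stand-in for the per-class loop in test.py.
# Each row holds one proposal's class scores; the width is whatever the model outputs.

def count_detections(scores, num_cls, thresh=0.05):
    """Count, per foreground class, the proposals scoring above thresh.

    Column 0 is treated as background and skipped, mirroring range(1, num_cls).
    """
    counts = {}
    for j in range(1, num_cls):
        counts[j] = sum(1 for row in scores if row[j] > thresh)
    return counts


# Toy scores with 16 columns, the size reported in the traceback.
scores = [[0.0] * 16 for _ in range(3)]
scores[0][5] = 0.9
scores[2][15] = 0.4

NUM_DATASET_CLASSES = 21  # assumption: background + 20 VOC classes

try:
    count_detections(scores, NUM_DATASET_CLASSES)
    failed = False
except IndexError:
    # Column 16 does not exist, which mirrors the error in the traceback.
    failed = True

# Bounding the loop by the actual score width avoids the out-of-range column.
safe_counts = count_detections(scores, num_cls=len(scores[0]))
```

Deriving the loop bound from the score width, rather than from a constant, is one way to avoid editing the class count in several places, though whether it fits test.py depends on how the rest of the script indexes `all_boxes`.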

@amajee11us
Author

Thanks, this suggestion works, although I had to update "num_cls" in multiple places within the test script.

image_scores = np.hstack([all_boxes[j][i][:, -1] for j in range(1, num_cls)])

for j in range(1, num_cls):

However, when I inspect the base-trained model, I see that the final classification weights have dimensions [21, 4096], which should only be the case in phase 2. This can be checked by setting the num_cls value to 15 here.
When I do this, I get the following error message when loading the model (save_models/VOC_first/pascal_voc_0712_metarcnn_200_15.pth):

RuntimeError: Error(s) in loading state_dict for resnet:
While copying the parameter named "RCNN_cls_score.weight", whose dimensions in the model are torch.Size([15, 4096]) and whose dimensions in the checkpoint are torch.Size([21, 4096]).
While copying the parameter named "RCNN_cls_score.bias", whose dimensions in the model are torch.Size([15]) and whose dimensions in the checkpoint are torch.Size([21]).
While copying the parameter named "RCNN_bbox_pred.weight", whose dimensions in the model are torch.Size([60, 4096]) and whose dimensions in the checkpoint are torch.Size([84, 4096]).
While copying the parameter named "RCNN_bbox_pred.bias", whose dimensions in the model are torch.Size([60]) and whose dimensions in the checkpoint are torch.Size([84]).
While copying the parameter named "Meta_cls_score.weight", whose dimensions in the model are torch.Size([15, 2048]) and whose dimensions in the checkpoint are torch.Size([21, 2048]).
While copying the parameter named "Meta_cls_score.bias", whose dimensions in the model are torch.Size([15]) and whose dimensions in the checkpoint are torch.Size([21]).

I suspect this is an issue with how the model is constructed in phase 1.
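
The shapes in the error message can be cross-checked with a little arithmetic. The sketch below uses plain Python with the shapes copied from the message above. The helper names `infer_num_classes` and `find_mismatches` are hypothetical, and the 4-outputs-per-class rule is an assumption read off the numbers in the message (84 = 4 × 21, 60 = 4 × 15).

```python
# Checkpoint-side shapes, copied from the error message above.
checkpoint_shapes = {
    "RCNN_cls_score.weight": (21, 4096),
    "RCNN_cls_score.bias": (21,),
    "RCNN_bbox_pred.weight": (84, 4096),
    "RCNN_bbox_pred.bias": (84,),
    "Meta_cls_score.weight": (21, 2048),
    "Meta_cls_score.bias": (21,),
}

# Model-side shapes when the network is built with num_cls = 15.
model_shapes = {
    "RCNN_cls_score.weight": (15, 4096),
    "RCNN_cls_score.bias": (15,),
    "RCNN_bbox_pred.weight": (60, 4096),
    "RCNN_bbox_pred.bias": (60,),
    "Meta_cls_score.weight": (15, 2048),
    "Meta_cls_score.bias": (15,),
}


def infer_num_classes(shapes):
    """Return the class count implied by the head shapes.

    Assumes class-specific box regression (4 box outputs per class) and
    raises ValueError if the classifier and regressor disagree.
    """
    n_cls = shapes["RCNN_cls_score.weight"][0]
    n_box = shapes["RCNN_bbox_pred.weight"][0]
    if n_box != 4 * n_cls:
        raise ValueError(f"bbox head has {n_box} rows, expected {4 * n_cls}")
    return n_cls


def find_mismatches(model, checkpoint):
    """Return {name: (model_shape, checkpoint_shape)} for differing entries."""
    return {
        name: (shape, checkpoint[name])
        for name, shape in model.items()
        if name in checkpoint and checkpoint[name] != shape
    }


ckpt_classes = infer_num_classes(checkpoint_shapes)
model_classes = infer_num_classes(model_shapes)
mismatches = find_mismatches(model_shapes, checkpoint_shapes)
```

Under these assumptions the checkpoint heads are internally consistent with 21 outputs, so the six mismatches all come from the single class-count setting rather than from six separate problems.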

@pengyinxw

Hi, I am also wondering why the "RCNN_cls_score.weight" of the base-trained model has dimensions [21, 4096]. There are actually 16 classes in the base-class training stage (15 base classes + 1 background class), so why not [16, 4096]? Please correct me if I have misunderstood something.

@amajee11us
Author

@pengyinxw the weights of "RCNN_cls_score" (the box classifier) have dimensions [16, 2048] after base training. I have confirmed that through my experiments.
The problem, however, is with the testing script "test.py", which is hardcoded in multiple places to pick up the complete list of classes, including the novel ones, so it expects a weight matrix of dimension [21, 2048]. If you want inference results for just the base classes, please follow the discussion above.
Hope this helps!

@pengyinxw

@amajee11us thanks for your reply! But I am still confused about the "box classifier" dimensions after base training: the output dimension of "RCNN_cls_score" is imdb.num_classes, and imdb.num_classes is set to 21 here even in the base training phase.

In addition, instead of using "test.py" to load the model, I loaded the base-trained model with the simple script attached. However, I can only load the base-trained model successfully when I set num_cls=21.

If possible, could you please tell me how you confirmed that the dimension is 16? Thank you in advance.
[Image: Screen Shot 2021-11-07 at 11 51 20 AM]
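
One way to settle the question of the saved dimensions is to read the shapes straight from the checkpoint, without building the network. The sketch below is a small helper with the hypothetical name `head_shapes`; it only needs objects that expose a `.shape` attribute. The PyTorch usage is shown in comments, and the `"model"` key is an assumption about how the checkpoint was saved, not something confirmed in this thread.

```python
def head_shapes(state_dict, keywords=("cls_score", "bbox_pred")):
    """Map parameter name -> shape tuple for entries whose name contains a keyword."""
    return {
        name: tuple(tensor.shape)
        for name, tensor in state_dict.items()
        if any(keyword in name for keyword in keywords)
    }


# Usage with a real checkpoint (requires PyTorch). The path is the one quoted
# earlier in this thread; the "model" key is an assumption about the file layout.
#
#   import torch
#   ckpt = torch.load(
#       "save_models/VOC_first/pascal_voc_0712_metarcnn_200_15.pth",
#       map_location="cpu",
#   )
#   state_dict = ckpt["model"] if "model" in ckpt else ckpt
#   for name, shape in head_shapes(state_dict).items():
#       print(name, shape)
```

The first entry of each printed shape for the `cls_score` parameters is the number of classes the checkpoint was trained with, so it shows directly whether the saved head has 16 or 21 outputs.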

@xiexijun

@pengyinxw Hi! Have you reproduced this code? If so, could you tell me how you configured the environment? Looking forward to your reply!
