
Trying to replicate sp_v6 and descriptor loss #288

Closed
martinarroyo opened this issue Jan 20, 2023 · 9 comments

@martinarroyo

[This is somewhat related to #287, but I'll open a separate issue so as not to pollute that one.]

Hi @rpautrat, thanks for this work and for providing support for the repo! I am trying to reproduce the results you report in the README on HPatches as a first step towards making some changes to the model. To save some time, I labeled the COCO dataset using the pretrained model listed in the README (MagicPoint (COCO)) and launched a training with the superpoint_coco.yaml config in its current state at HEAD. I had to make minor modifications to the codebase to get it to work in my infrastructure (mostly I/O), but there should be no changes that affect training. I noticed that the negative and positive distances reported in TensorBoard oscillate within a very small range of values (~1e-7 to 1e-5), which got me worried. This seems strange given the values reported in #277 (comment). For reference, here is how it looks in my current training (my machine restarted, so the graphs look a bit funny, apologies for that):

[TensorBoard screenshot: positive and negative distance curves]

Precision and recall are also much lower than those in the sp_v6 log, where recall goes up to 0.6; I can only get it to ~0.37.

I looked into the yaml file in the sp_v6 tarfile and noticed that the loss weights seemed to be adapted for the 'unnormalized' descriptors, so I reverted the changes introduced in 95d1cfd. This helps with the distances (the values are now ~0.03 and 0.02 for positive and negative, respectively):

[TensorBoard screenshot: positive and negative distances after reverting 95d1cfd]

But recall is still quite low (~0.38).
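
For context on why those weights need adapting: with L2-normalized descriptors the dot products are bounded in [-1, 1], whereas raw descriptors can produce arbitrarily large values, so margins and `lambda_d` tuned for one setting act on a very different scale in the other. A minimal illustration (not the repository's code, names are made up):

```python
import torch
import torch.nn.functional as F

# Minimal illustration (not the repository's code): unit-normalized descriptors
# have dot products bounded in [-1, 1], while raw descriptors do not, so loss
# margins and weights tuned for one case do not transfer to the other.
desc_a = torch.randn(16, 256)  # hypothetical raw descriptors
desc_b = torch.randn(16, 256)

raw_dot = (desc_a * desc_b).sum(dim=1)  # unbounded magnitude
unit_dot = (F.normalize(desc_a, dim=1) * F.normalize(desc_b, dim=1)).sum(dim=1)  # in [-1, 1]

print(raw_dot.abs().max().item(), unit_dot.abs().max().item())
```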

I also evaluated the model with normalization on HPatches and the results look reasonable. For comparison, I also loaded the sp_v6 checkpoint and ran the same evaluation:

|                      | Mine  | sp_v6 ckpt | Claimed |
|----------------------|-------|------------|---------|
| Viewpoint changes    | 0.613 | 0.645      | 0.674   |
| Illumination changes | 0.630 | 0.655      | 0.662   |

|     | Illumination (Mine / sp_v6 ckpt / Claimed) | Viewpoint (Mine / sp_v6 ckpt / Claimed) | All (Mine / sp_v6 ckpt / Claimed) |
|-----|--------------------------------------------|-----------------------------------------|-----------------------------------|
| e=1 | –                                          | –                                       | 0.460 / 0.477 / 0.483             |
| e=3 | 0.940 / 0.936 / 0.965                      | 0.650 / 0.654 / 0.712                   | 0.793 / 0.793 / 0.836             |
| e=5 | –                                          | –                                       | 0.894 / 0.881 / 0.91              |

The change in quality is not too bad, but I am still concerned that the numbers are always slightly below the ones reported in the README of this repository, so I would like to ask the following:

  • What version of the code was used to train the sp_v6 model? I am assuming it was trained on 2 GPUs using COCO data labeled with this checkpoint, but please correct me if I am wrong.
  • What model was used to compute the metric values claimed in the README?
  • Have you experienced the same behaviour with the positive and negative distances in the descriptor loss?

Thanks a lot in advance for your help, much appreciated!

@rpautrat
Owner

Hi, I can try to help you replicate the original results, but only up to a point. This work is indeed more than 4 years old, and I don't remember all the details of the experiments anymore. But regarding your three points:

  • The sp_v6 model was indeed trained on 2 GPUs on the COCO dataset, but with pseudo labels generated by this model.
  • I am not 100% sure, but I seem to remember that the metrics in the README were computed with the sp_v6 model. What is quite likely is that the evaluation metrics themselves changed over the last 4 years, which would explain the gap between the claimed values and the new ones you computed. If you also evaluate the original SuperPoint of Magic Leap, you might observe a similar difference.
  • I think the positive and negative distances are rather noisy and hard to interpret. I don't remember the trend during my trainings, but you can check the tensorboard log in the zip file of the pre-trained models to compare with your own values; a rough sketch of how to read those scalars programmatically is below.
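
If it helps, here is a rough, untested sketch of how to read those scalar curves from a TensorBoard event file (the file path is a placeholder, and the tag names are an assumption based on the summary calls in the repo):

```python
# Rough, untested sketch: load scalar curves from a TensorBoard event file so
# they can be compared against the curves of your own run.
from tensorboard.backend.event_processing import event_accumulator

ea = event_accumulator.EventAccumulator('sp_v6/events.out.tfevents.<...>')  # placeholder path
ea.Reload()

print(ea.Tags()['scalars'])  # list the scalar tags present in the log
positive = [(e.step, e.value) for e in ea.Scalars('positive_dist')]
negative = [(e.step, e.value) for e in ea.Scalars('negative_dist')]
```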

I hope this can be of some help to you.

@martinarroyo
Author

Hi, thanks for the quick reply. I mixed up the links in my message and indeed meant the one pointing to mp_synth-v11_ha1_trained, sorry for the confusion.

I compared the logs for the distances and the rest of the metrics. They look similar; however, I had not noticed before that the detector loss is actually much higher than for sp_v6 (green is my experiment, orange is sp_v6). Could this mean that I need to tune $\lambda$?

[TensorBoard screenshot: detector loss, my run (green) vs. sp_v6 (orange)]

I'll try to also evaluate the Magic Leap SP implementation to see if the discrepancy is similar.

I think this more or less answers my questions. If you could comment on the discrepancy in the detector loss, that would be great. I will close the issue once I run the Magic Leap model.

@rpautrat
Owner

Indeed, you may want to tune $\lambda$ to better balance the descriptor and detector losses. The latter seems a bit too high in comparison.
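
For reference, my recollection is that the total loss here is roughly of the form

$$\mathcal{L} = \mathcal{L}_{\text{det}} + \mathcal{L}_{\text{det}}^{\text{warped}} + \lambda \, \mathcal{L}_{\text{desc}},$$

where $\lambda$ weights the descriptor term, so adjusting it changes the relative contribution of the two losses (the exact name of this weight in the config may differ).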

@ericzzj1989

Since this issue is somewhat related to my issue #287 (comment), would you mind, @martinarroyo, explaining what changes you made in the code to get your results?

@martinarroyo
Author

Apologies for the belated response. My changes were minimal: I only modified the I/O logic so that it would work in my infrastructure, and fixed some imports that were not working in my setup. The training logic was unaltered.

@shreyasr-upenn

Hi @rpautrat, I have been getting negative distances of zero at every step. Is this normal? What might have gone wrong? I used the same hyperparameter values as you.

@rpautrat
Owner

rpautrat commented Mar 5, 2024

Hi, having a zero negative loss is not impossible, but surprising. If you look at its definition here:

```python
negative_dist = tf.maximum(0., dot_product_desc - config['negative_margin'])
```

it is obtained as a hinge loss max(0, desc_distance - m). So if desc_distance is lower than the margin m (set to 0.2 by default), the negative loss becomes 0, which means that the model was perfectly able to distinguish different descriptors.
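
If it helps for a PyTorch port, here is a minimal sketch of the two hinge terms (names are illustrative and the positive-margin default is an assumption; double-check against the config):

```python
import torch

# Minimal sketch of the hinge terms discussed above (illustrative names; the
# positive margin default here is an assumption, the negative margin follows
# the 0.2 default mentioned above).
def hinge_distances(dot_product_desc, positive_margin=1.0, negative_margin=0.2):
    # Positive term: penalizes corresponding descriptors whose similarity is
    # still below the positive margin.
    positive_dist = torch.clamp(positive_margin - dot_product_desc, min=0.0)
    # Negative term: max(0, desc_distance - m); it is zero once the similarity
    # of non-corresponding descriptors falls below the margin.
    negative_dist = torch.clamp(dot_product_desc - negative_margin, min=0.0)
    return positive_dist, negative_dist
```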

However, getting 0 at every step seems a bit fishy and too good to be true. I would expect it to be positive for at least a few samples. Maybe you can try to print a few values at the line linked above to understand what is happening. Checking the positive loss would also be interesting.

@shreyasr-upenn

shreyasr-upenn commented Mar 6, 2024

[TensorBoard screenshots]
I have done the equivalent of this code block in PyTorch:

```python
# (truncated preview of the linked TF summary line)
tf.summary.scalar('positive_dist', tf.reduce_sum(valid_mask * config['lambda_d'] * ...
```

```python
positive_sum = torch.sum(valid_mask * lambda_d * s * positive_dist) / valid_mask_norm
negative_sum = torch.sum(valid_mask * (1 - s) * negative_dist) / valid_mask_norm
```

@rpautrat
Owner

rpautrat commented Mar 6, 2024

I would suggest printing a few values to debug your code and understand why the negative loss becomes zero in your case. This sounds too good to be true.
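
Something along these lines might help (untested sketch; variable names follow the PyTorch snippet above):

```python
# Untested debugging sketch: check where dot_product_desc sits relative to the
# negative margin and how often the negative hinge term is actually non-zero.
with torch.no_grad():
    active = (negative_dist > 0).float().mean().item()
    print(f"dot_product_desc: min={dot_product_desc.min():.4f}, "
          f"mean={dot_product_desc.mean():.4f}, max={dot_product_desc.max():.4f}")
    print(f"fraction of non-zero negative terms: {active:.4f}")
```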
