
Online BootsTAPIR Weights and Config File #101

Open
gorkaydemir opened this issue Jul 6, 2024 · 9 comments


@gorkaydemir

Hi,
Thanks for your great work, BootsTAP.
It seems that the shared PyTorch model weights for Online BootsTAPIR are not compatible with the TAPIR model, in terms of the extra_convs hidden sizes.
Also, could you share the config file for Online BootsTAPIR? The Online and Offline BootsTAPIR config files are identical at the moment.

Thanks in advance.

@cdoersch
Collaborator

cdoersch commented Jul 8, 2024

We now have a colab which demonstrates how to use it:

https://github.com/google-deepmind/tapnet/blob/main/colabs/torch_causal_tapir_demo.ipynb

@bhack

bhack commented Jul 9, 2024

@cdoersch What is the main rationale behind this comment? Do you mean integrating over the pyramids? Do you have an example?

  # Take only the predictions for the final resolution.
  # For running on higher resolution, it's typically better to average across
  # resolutions.
  tracks = trajectories['tracks'][-1]
  occlusions = trajectories['occlusion'][-1]
  uncertainty = trajectories['expected_dist'][-1]

@bhack

bhack commented Jul 9, 2024

I have also another question.
In the fairly common case where we want to start from an arbitrary frame, and therefore need to process the sequence both forward and backward to cover it, do we need to call both online_model_init and model.construct_initial_causal_state every time we change direction?

@gorkaydemir
Author

Hi,
Thank you for the demonstration and for sharing the notebook!
I have a couple of questions: Did you use this approach while evaluating the online models on the DAVIS or Kinetics datasets, as referenced in Table 7 of the paper? Unfortunately, I wasn't able to reproduce the results using the provided torch model and checkpoint. Could you offer any guidance on this?

Thank you

@cdoersch
Collaborator

@bhack I don't see what this comment has to do with the current thread, but I'll answer anyway. During training, we apply the loss to the prediction at every layer. Therefore the model returns the final prediction as the output, and "unrefined" predictions for every iteration at every resolution. At test time, however, we find the best accuracy by taking the final refinement prediction averaged across resolutions, so that's what we return by default.
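To make the averaging concrete, here is a minimal sketch of what "average across resolutions" could look like, assuming (hypothetically) that the per-resolution predictions are a list of arrays with matching shapes; the dictionary keys mirror the demo snippet above, but the data here is synthetic:

```python
import numpy as np

# Synthetic stand-ins for per-resolution predictions: a list where each
# entry has shape (num_points, num_frames, 2) for tracks, and
# (num_points, num_frames) for occlusion/uncertainty.
trajectories = {
    'tracks': [np.full((4, 8, 2), r) for r in (1.0, 2.0, 3.0)],
    'occlusion': [np.zeros((4, 8)) for _ in range(3)],
    'expected_dist': [np.zeros((4, 8)) for _ in range(3)],
}

# Instead of taking only the final-resolution prediction ([-1]),
# stack the per-resolution predictions and average them.
tracks = np.mean(np.stack(trajectories['tracks']), axis=0)
occlusions = np.mean(np.stack(trajectories['occlusion']), axis=0)
uncertainty = np.mean(np.stack(trajectories['expected_dist']), axis=0)
```

This is only an illustration of the averaging step, not the model's actual output layout; check the notebook for the real structure of `trajectories`.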

@bhack you can extract query features from later frames and then track them starting from the beginning of the video. However, it may be slightly more accurate to do it forward and backward in time, as you suggest, in which case you would need to call model.construct_initial_causal_state twice. In the current model, online_model_init can be re-used across both forward/backward runs since it only depends on the query frame.
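The forward/backward recipe above can be sketched as follows. The model here is a stand-in stub (the real API is in the torch_causal_tapir_demo notebook, and `track_frame` is a hypothetical method name); the point is the call pattern: query features computed once, a fresh causal state per direction.

```python
class StubCausalModel:
    """Stand-in for the causal TAPIR model; only the call pattern matters."""

    def construct_initial_causal_state(self, num_points, num_blocks):
        return {'state': 0}

    def track_frame(self, query_features, frame, state):
        # Pretend "tracking": just echo the frame back as the prediction.
        return frame, state


def online_model_init(frames, query_points):
    # Query features depend only on the query frame, so they can be
    # shared between the forward and backward passes.
    return {'query': query_points}


model = StubCausalModel()
frames = list(range(10))          # toy "video"
query_frame = 5                   # start tracking from a mid-video frame
query_features = online_model_init(frames, (query_frame, 0.5, 0.5))

results = {}

# Forward pass: fresh causal state.
state = model.construct_initial_causal_state(num_points=1, num_blocks=1)
for t in range(query_frame, len(frames)):
    pred, state = model.track_frame(query_features, frames[t], state)
    results[t] = pred

# Backward pass: a second fresh causal state, same query features.
state = model.construct_initial_causal_state(num_points=1, num_blocks=1)
for t in range(query_frame - 1, -1, -1):
    pred, state = model.track_frame(query_features, frames[t], state)
    results[t] = pred
```

The key design point, per the answer above: `online_model_init` is called once, while `construct_initial_causal_state` is called once per direction.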

@gorkaydemir could you provide some more information on how you're running the model, preferably the code that you're using? We use jax internally, so the pytorch port is not as well tested. You'll have to provide a way to reproduce the issue.

@cdoersch
Collaborator

@sgjheywa FYI

@bhack

bhack commented Jul 11, 2024

@cdoersch I posted here because that comment appears in the just-released PyTorch online notebook, and there is already more than one ticket asking for the PyTorch online version and checkpoint.
I simply didn't want to open a new issue for the new online notebook.

@gorkaydemir
Author

Hi @cdoersch,
Using the evaluator and data from CoTracker, which incorporates much of your code for the TAPVid classes, I evaluated both Causal BootsTAPIR and Default BootsTAPIR using the queried-first approach. You can view the code here: Colab Notebook.

I minimized custom code and primarily utilized the scripts you provided in the notebooks. The results are reported in the last cell as a comment. While the offline BootsTAPIR performance is close to the reported values, there is a notable discrepancy between the expected and reproduced online model performances.

Thank you in advance for your assistance.

@yangyi02
Collaborator

We are investigating this issue now. Thanks for your patience, and apologies for the long wait.
