VQA: Limitations in questions and answers #25

Open
fizahkhalid opened this issue Feb 18, 2023 · 2 comments

@fizahkhalid

I want my fine-tuned VQA model to be able to answer questions it was not trained on, and likewise to produce answers that do not exist in the original answer list (the list of answers in the test JSON file).

Is there a limitation on the kind of questions I can ask the model? If yes, how can I tweak the code to meet my needs?

@zengyan-97
Owner

Hi,

You need to modify the inference process of the VQA model.

Do not use this to rank the candidate answers: https://github.com/zengyan-97/X-VLM/blob/master/models/model_vqa.py#L144

Instead, make it true open-ended generation. For example, you can refer to: https://github.com/zengyan-97/X-VLM/blob/master/models/model_captioning.py#L75
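To illustrate the difference between the two inference modes, here is a minimal, self-contained sketch. The toy `next_token_logits` stands in for the model's decoder forward pass (in X-VLM it would be conditioned on the image and question); none of these function names come from the X-VLM codebase, and real decoding would use the captioning-style generation linked above.

```python
def next_token_logits(prefix):
    # Toy stand-in for a decoder step: model(image, question, prefix) -> logits.
    # Deterministically prefers "a", then "red", then "bus", then "<eos>".
    vocab = ["a", "red", "bus", "<eos>"]
    want = vocab[min(len(prefix), len(vocab) - 1)]
    return {tok: (1.0 if tok == want else 0.0) for tok in vocab}

def rank_candidates(candidates):
    """Closed-set inference (model_vqa.py style): the model can only ever
    return one of the pre-defined candidate answers, no matter the question."""
    def score(ans):
        toks = ans.split() + ["<eos>"]
        total, prefix = 0.0, []
        for t in toks:
            total += next_token_logits(prefix).get(t, -1.0)
            prefix.append(t)
        return total
    return max(candidates, key=score)

def generate(max_len=10):
    """Open-ended inference (model_captioning.py style): decode greedily,
    token by token, until <eos>; the answer need not be in any fixed list."""
    prefix = []
    for _ in range(max_len):
        logits = next_token_logits(prefix)
        tok = max(logits, key=logits.get)
        if tok == "<eos>":
            break
        prefix.append(tok)
    return " ".join(prefix)
```

With this toy decoder, `generate()` produces "a red bus" even if that string never appeared in the candidate list, whereas `rank_candidates` is limited to whatever list it is given.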

@chilljudaoren

Traceback (most recent call last):
  File "/home/czh/.pycharm_helpers/pydev/pydevd.py", line 1534, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/czh/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/czh/BLIP-main/Eval_VQA.py", line 261, in <module>
    main(args, config)
  File "/home/czh/BLIP-main/Eval_VQA.py", line 180, in main
    model.load_pretrained(ckpt, config, is_eval=True)
  File "/home/czh/BLIP-main/models/XVLM/model_vqa.py", line 89, in load_pretrained
    msg = self.load_state_dict(state_dict, strict=False)
  File "/home/czh/.conda/envs/czh/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for XVLM:
    size mismatch for vision_encoder.layers.0.blocks.0.attn.relative_position_bias_table: copying a param with shape torch.Size([841, 4]) from checkpoint, the shape in current model is torch.Size([529, 4]).
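The shapes in this error follow the Swin Transformer's relative position bias layout: the table has (2 · window_size − 1)² rows, so 841 and 529 imply different window sizes (and hence different image resolutions / configs) for the checkpoint versus the current model. A small sanity check that decodes the shapes (pure arithmetic, no torch; the function name is mine, not X-VLM's):

```python
import math

def window_size_from_table_rows(rows):
    """Invert rows = (2 * window_size - 1) ** 2 for a Swin
    relative_position_bias_table; returns None if rows has no such form."""
    side = math.isqrt(rows)
    if side * side != rows or side % 2 == 0:
        return None
    return (side + 1) // 2

# Shapes taken from the error message above.
ckpt_window = window_size_from_table_rows(841)   # 29 x 29 table -> window 15
model_window = window_size_from_table_rows(529)  # 23 x 23 table -> window 12
```

The usual fixes are either to set the evaluation `image_res` (and the matching Swin config) to the resolution the checkpoint was trained at, or to interpolate the checkpoint's bias tables to the new size when loading; which option X-VLM's `load_pretrained` supports depends on the config, so check that code path rather than taking this sketch as authoritative.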
