[soft prompt] llama generation quality decrease when using soft prompt #661

Closed
handoku opened this issue Jun 9, 2023 · 1 comment

handoku commented Jun 9, 2023

Hi. I've recently been testing the LLaMA implementation (cpp, pytorch) with the blip2_vicuna_instruct model. It uses the ViT + Q-Former embedding as a prefix_soft_embedding, which is fed into vicuna together with the prompt's token_ids.

From my tests I found the following:
When running vicuna-13b alone, FT produces text of the same quality as Hugging Face's.
However, when the token_ids are fed together with the prefix_soft_embedding, there is a noticeable drop in quality.

For example,
image: ref (attached image)
prompt: Describe the environment in which the product in the middle of the image is located

pytorch output:

. The product in the middle of this image is located within a refrigerator, surrounded by various fruits and vegetables on both sides as well

FT output:

. The refrigerator is open and filled with food.
The refrigerator is open and filled with food.

Does anyone have experience with FasterTransformer's prefix soft prompt feature? What might be causing this issue? Could it be a usage mistake on my side? I need some hints to debug it. I have already checked that the output of InputIdsEmbeddingLookupPosEncodingSoftPrompt is correct.
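
For reference, this is roughly what I expect that step to produce on the PyTorch side. A minimal sketch under my own naming (build_decoder_inputs, prefix_embeds, etc. are illustrative, not FT's actual API):

```python
import torch

def build_decoder_inputs(input_ids, prefix_embeds, embed_tokens):
    # input_ids:     [batch, prompt_len]          token ids of the text prompt
    # prefix_embeds: [batch, prefix_len, hidden]  ViT + Q-Former output (the soft prompt)
    # embed_tokens:  the LLM's token embedding module
    token_embeds = embed_tokens(input_ids)        # [batch, prompt_len, hidden]
    # Splice the soft prompt in front of the prompt embeddings.
    inputs_embeds = torch.cat([prefix_embeds, token_embeds], dim=1)
    # The decoder now sees prefix_len + prompt_len positions, so the position
    # ids and the attention mask must be built over the combined length.
    return inputs_embeds
```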

Thanks in advance!


handoku commented Jun 11, 2023

Issue solved. The attention mask initialization had a problem when a prefix soft embedding is present; details can also be found in this discussion.
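
For anyone hitting the same symptom: the causal mask has to be built over the combined length (prefix_len + prompt_len) so that later tokens can attend to the soft-prompt positions. A minimal PyTorch sketch of the expected shape (illustrative only, not FasterTransformer's actual mask code):

```python
import torch

def build_causal_mask(prefix_len, prompt_len, device="cpu"):
    # Soft-prompt positions are ordinary prefix positions: every later token
    # must be allowed to attend to them, so the mask covers the total length.
    total = prefix_len + prompt_len
    mask = torch.tril(torch.ones(total, total, dtype=torch.bool, device=device))
    return mask  # [total, total], True = may attend
```

If the mask is initialized for the prompt tokens only, the prefix positions end up misaligned, which is consistent with the degraded, repetitive output above.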

Thus, I'm closing this issue.

handoku closed this as completed Jun 11, 2023