Greetings,

I am currently using Python 3.11 and transformers version 4.44.0.
While experimenting with the OPT models (125M, 350M, and 1.3B), I have noticed that the outputs often consist of repetitive and unrelated sentences. I am not sure whether I am using the models correctly.
Here is the code I used with the `pipeline()` function:
```python
from transformers import OPTForCausalLM
from transformers import GPT2TokenizerFast
from transformers import pipeline

model_name = 'facebook/opt-1.3b'
cache_dir = './models'

pretrained_model: OPTForCausalLM = OPTForCausalLM.from_pretrained(model_name, cache_dir=cache_dir)
tokenizer: GPT2TokenizerFast = GPT2TokenizerFast.from_pretrained(model_name, cache_dir=cache_dir)

generator = pipeline(task='text-generation', model=pretrained_model, tokenizer=tokenizer, device=0)

prompt = "Paris is the capital"
output = generator(prompt, truncation=True, max_length=100)
print(output[0]['generated_text'])
```
The output I received was:
```
Paris is the capital of France.
I'm not sure if you're being sarcastic or not, but I'm not sure if you're being serious.
I'm being serious. Paris is the capital of France. It's not the capital of the world.
I'm not sure if you're being sarcastic or not, but I'm not sure if you're being serious.
I'm not sure if you're being sarcastic or not, but I'm not sure if you
```
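If I understand correctly, the pipeline decodes greedily by default (`do_sample=False` in the generation config), and I gather greedy decoding is prone to exactly these loops. For reference, a sampling variant of the same call would look roughly like this; the `top_p` and `temperature` values are just numbers I picked, not anything from the model card:

```python
# Sketch: the same pipeline call with sampling enabled, to see whether the
# loops come from greedy decoding. The specific values are arbitrary guesses.
output = generator(
    prompt,
    truncation=True,
    max_length=100,
    do_sample=True,   # sample from the distribution instead of taking argmax
    top_p=0.9,        # nucleus sampling
    temperature=0.8,  # soften the logits
)
print(output[0]['generated_text'])
```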
I also tried manual generation by feeding embeddings directly (not token ids), since I plan to experiment with prefix-tuning later (a sketch of what I have in mind follows the outputs below). However, I still got a lot of repetition:
```python
import torch
from transformers import OPTForCausalLM
from transformers import GPT2TokenizerFast

model_name = 'facebook/opt-1.3b'
cache_dir = './models'

pretrained_model: OPTForCausalLM = OPTForCausalLM.from_pretrained(model_name, cache_dir=cache_dir)
tokenizer: GPT2TokenizerFast = GPT2TokenizerFast.from_pretrained(model_name, cache_dir=cache_dir)

prompt = "Paris is the capital"
max_length = 100
prompt_ids = torch.LongTensor(tokenizer.encode(prompt))

pretrained_model.eval()

end_id = 50260  # The 'endoftext' token
out_token_ids = []

with torch.no_grad():
    # Embed the prompt tokens manually instead of passing input_ids
    prompt_embedding = pretrained_model.get_decoder().embed_tokens(prompt_ids)
    input_embedding = prompt_embedding.unsqueeze(0)

    # First forward pass without KV cache
    output = pretrained_model(inputs_embeds=input_embedding, use_cache=True)
    past_key_values = output.past_key_values
    next_token_id = output.logits[:, -1, :].argmax(dim=-1).unsqueeze(-1)
    out_token_ids.append(next_token_id.item())

    # Subsequent generation with KV cache
    for i in range(max_length - 1):
        output = pretrained_model(input_ids=next_token_id, use_cache=True, past_key_values=past_key_values)
        past_key_values = output.past_key_values
        next_token_id = output.logits[:, -1, :].argmax(dim=-1).unsqueeze(-1)
        if next_token_id == end_id:
            break
        out_token_ids.append(next_token_id.item())

text = tokenizer.decode(out_token_ids)
print(prompt + text)
```
The output is:
```
Paris is the capital of France, and the city is a cultural and historical treasure. It is also a city that is full of surprises.
The city is full of museums, galleries, and monuments. It is also full of surprises.
The city is full of surprises.
The city is full of surprises.
The city is full of surprises.
The city is full of surprises.
The city is full of surprises.
The city is full of surprises.
```
When I change `model_name` from 1.3b to 125m, the output resembles the pipeline version, again going on about 'sarcastic':
```
Paris is the capital of the French Republic.
I'm not sure if you're being sarcastic or not, but I'm not sure if you're being sarcastic.
I'm not sure if you're being sarcastic or not, but I'm not sure if you're being sarcastic.
I'm not sure if you're being sarcastic or not, but I'm not sure if you're being sarcastic.
I'm not sure if you're being sarcastic or not, but I'm not sure if you're being sarcastic
```
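For context, here is roughly what I have in mind for the prefix-tuning experiment, which is why I am feeding `inputs_embeds` rather than `input_ids`. The prefix length and the learnable parameter are hypothetical, just to illustrate the shape of the idea; it reuses `pretrained_model` and `prompt_ids` from the script above:

```python
import torch

# Hypothetical prefix-tuning setup: a learnable block of "virtual token"
# embeddings prepended to the real prompt embeddings.
prefix_length = 10
embed_dim = pretrained_model.get_decoder().embed_tokens.embedding_dim  # 2048 for opt-1.3b, I believe
prefix = torch.nn.Parameter(torch.randn(1, prefix_length, embed_dim) * 0.02)

prompt_embedding = pretrained_model.get_decoder().embed_tokens(prompt_ids).unsqueeze(0)
input_embedding = torch.cat([prefix, prompt_embedding], dim=1)  # (1, prefix_len + seq_len, dim)

# Then the same first forward pass as in the manual loop above:
output = pretrained_model(inputs_embeds=input_embedding, use_cache=True)
```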
I have a few questions:
1. Am I using the model correctly?
2. Why do the two scripts generate different outputs?
3. Is it normal to see this much repetition in the generated text?
4. Does the model ever output the `endoftext` token (id 50260)?
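On the last question: I assume I can ask the tokenizer which ids it actually treats as special, with something like the snippet below. If the EOS id turns out not to be 50260, the break condition in my loop above would never fire:

```python
# Check which ids the tokenizer actually uses for special tokens.
print(tokenizer.eos_token_id)                   # EOS id from the tokenizer config
print(tokenizer.bos_token_id)                   # BOS id
print(tokenizer.convert_tokens_to_ids('</s>'))  # OPT uses '</s>', I believe, not GPT-2's endoftext
```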
Thank you for your assistance!
Best regards