Pythia 160M is giving unreasonable logit values #177

danielmisrael · 2024-10-14T03:12:36Z

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m")
input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors="pt")

model.eval()
with torch.no_grad():
    # logits has shape (batch, sequence_length, vocab_size)
    logits = model(input_ids).logits
print(logits)
# Top 5 logits (values and token indices) at each position
print(torch.topk(logits, k=5))

This is my code. The output (not captured here) shows very large logit values.

For no other model do the logit values get this large; the 410m model, for comparison, has maximum values of ~10. Is there a bug in the way the logits are computed?
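One way to check whether large raw logits actually affect predictions is to compare log-probabilities instead. Softmax is unchanged when the same constant is added to every logit, so a large common offset does not by itself change the output distribution. The sketch below illustrates this with made-up values (the numbers are hypothetical, not taken from any Pythia model) and uses only the standard library. It does not explain why the 160m logits are large. With PyTorch, the equivalent normalization would be `torch.log_softmax(logits, dim=-1)`.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

# Hypothetical logit values, chosen only for illustration.
small = [2.0, 1.0, 0.5, -1.0]
# The same logits with a large constant added to every entry.
shifted = [x + 500.0 for x in small]

lp_small = log_softmax(small)
lp_shifted = log_softmax(shifted)

# The log-probabilities agree up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(lp_small, lp_shifted))
print(max_diff)
```

If the log-probabilities of the 160m and 410m models look comparable after this normalization, the raw magnitude alone may not indicate a bug; if they differ sharply, that points to something more than an offset.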


Tr1ple-F · 2024-12-07T21:54:27Z

I can confirm the same: both 70m and 160m seem to have strange predictions in the final steps, particularly around <|endoftext|>.
