This PR adds an example that calculates model perplexity; the approach computes it from `prompt_logprobs`.
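For context, here is a minimal sketch of what computing perplexity from vLLM's `prompt_logprobs` can look like. The model name and input text are illustrative assumptions, not the exact code this PR adds:

```python
import math

from vllm import LLM, SamplingParams

# Hypothetical model choice; the PR's example may use a different one.
llm = LLM(model="mistralai/Mistral-7B-v0.1")

params = SamplingParams(
    max_tokens=1,       # generation is irrelevant; we only need the prefill pass
    prompt_logprobs=0,  # attach the logprob of each actual prompt token
)

text = "The quick brown fox jumps over the lazy dog."  # stand-in for a real eval corpus
out = llm.generate([text], params)[0]

# prompt_logprobs[i] is None for the first token (it has no conditional
# probability); otherwise it is a dict mapping token_id -> Logprob.
logprobs = [
    pos[tok_id].logprob
    for tok_id, pos in zip(out.prompt_token_ids, out.prompt_logprobs)
    if pos is not None
]

nll = -sum(logprobs) / len(logprobs)  # mean negative log-likelihood, in nats
print(f"perplexity = {math.exp(nll):.4f}")
```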
The last step should technically be unneeded, but to match the perplexity results from llama.cpp we seem to need the division by two. The results still seem unreliable: Llama-2 7B shows a lower perplexity than Mistral 7B, and an FP8 KV cache doesn't change the perplexity at all.
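For reference, the standard definition of perplexity over $N$ tokens, which llama.cpp's perplexity example also reports (using the natural log), is:

$$\mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\right)$$

If the extra division by two is applied to the averaged log-probabilities, the result is $\sqrt{\mathrm{PPL}}$ rather than $\mathrm{PPL}$, which this definition does not call for; that is consistent with the step being theoretically unnecessary.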