
Bug: Why doesn't llamafile remove end tokens like <|eot_id|> or <end_of_turn>? #630

Open
jeezrick opened this issue Nov 15, 2024 · 2 comments

Comments

@jeezrick

Contact Details

[email protected]

What happened?

I'm using llamafile through the Python API. For both models I've tried, the end token is retained in the response string and I have to remove it manually. Is that my mistake?
Like this:

        if self.model_string == "LLaMA_CPP": # why llama_file don't remove end token?
            self.response_str = self.response_str.replace("<|eot_id|>", "")
        if self.model_string == "gemma-2b-it":
            self.response_str = self.response_str.replace("<end_of_turn>", "")

Version

llamafile v0.8.4

What operating system are you seeing the problem on?

Linux

Relevant log output

model_gemma("I have a head of broccoli, and a cabbage. How many fruits do I have?")

output:

'You have **zero** fruits! 🥦 🥬 \n\nBroccoli and cabbage are both vegetables, not fruits. \n<end_of_turn>'
@yusufsyaifudin

I also encountered this issue and ended up going back to Ollama (I wanted to use llamafile because it has a tokenize/detokenize API).

Here are the reproducible steps:

ollama pull llama3.2:3b-instruct-q5_K_M
./llamafile -m /Users/username/.ollama/models/blobs/sha256-05fc42664a9311c427413f9bf2077bd5ee7d59d6a5a034d54fc738f93976d065 --server --nobrowser

Then call the chat completions API:

curl --location 'http://127.0.0.1:8080/v1/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
  "stream": false,
  "messages": [
    {
        "role": "system",
        "content": "You'\''re a helpful assistant!"
    },
    {
        "role": "user",
        "content": "Why sky is blue?"
    }
  ],
  "temperature": 0.1,
  "cache_prompt": true
}'

Will return:

{
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "The sky appears blue to us because of a phenomenon called Rayleigh scattering. Here's a simplified explanation:\n\n1. **Sunlight and its components**: When sunlight enters Earth's atmosphere, it's made up of different colors, which are a result of the different wavelengths of light. These colors include red, orange, yellow, green, blue, indigo, and violet.\n2. **Scattering by tiny molecules**: The atmosphere is filled with tiny molecules of gases like nitrogen (N2) and oxygen (O2). When sunlight hits these molecules, it scatters in all directions.\n3. **Shorter wavelengths scatter more**: The smaller wavelengths of light, like blue and violet, are scattered more than the longer wavelengths, like red and orange. This is because the smaller molecules are more effective at scattering the shorter wavelengths.\n4. **Our eyes perceive the scattered light**: As the scattered light reaches our eyes, we see the sky as blue because our eyes are most sensitive to the blue and violet wavelengths. The scattered light is more intense in the blue and violet parts of the spectrum, making the sky appear blue to us.\n5. **The blue color we see is a result of the scattering**: The blue color we see is not actually the color of the light itself, but rather the result of the scattering of sunlight by the tiny molecules in the atmosphere.\n\nThis is why the sky appears blue during the daytime, especially in the direction of the sun. At sunrise and sunset, the light has to travel through more of the atmosphere, which scatters the shorter wavelengths even more, making the sky appear more red or orange.\n\nI hope that helps you understand why the sky is blue!<|eot_id|>",
                "role": "assistant"
            }
        }
    ],
    "created": 1732441733,
    "id": "chatcmpl-PeSIXi0WMsbggQ4INjtfqtkHXY1qq8cD",
    "model": "unknown",
    "object": "chat.completion",
    "usage": {
        "completion_tokens": 340,
        "prompt_tokens": 26,
        "total_tokens": 366
    }
}
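
For completeness, here is a minimal Python sketch of the same request (assuming the llamafile server is running at http://127.0.0.1:8080 as above, and that the requests package is installed) with the trailing stop token stripped client-side:

import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "stream": False,
        "messages": [
            {"role": "system", "content": "You're a helpful assistant!"},
            {"role": "user", "content": "Why sky is blue?"},
        ],
        "temperature": 0.1,
        "cache_prompt": True,
    },
)
content = resp.json()["choices"][0]["message"]["content"]
# Client-side workaround: drop a trailing stop token such as <|eot_id|>.
for token in ("<|eot_id|>", "<end_of_turn>"):
    if content.endswith(token):
        content = content[: -len(token)]
print(content)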

[PS] 05fc42664a9311c427413f9bf2077bd5ee7d59d6a5a034d54fc738f93976d065 is the digest of the llama3.2:3b-instruct-q5_K_M model.

I am using llamafile 0.8.16.

@pawel665j

pawel665j commented Nov 24, 2024 via email
