Update docs to explain how to use tokenizer field for chat prompt formats #1476

Open

horsten opened this issue Sep 18, 2024 · 2 comments

Labels: bug (Something isn't working), documentation (Improvements or additions to documentation)

Comments

horsten commented Sep 18, 2024

Bug description

In README.md, it's stated that the prompts used in production for HuggingChat can be found in PROMPTS.md.

However, PROMPTS.md has not been updated in 7 months, and prompts for several newer models are missing.

horsten added the bug label Sep 18, 2024
nsarrazin (Collaborator) commented

Hi! You can now use the tokenizer field to format your chat template. This is what we do in production (see here) and is why we haven't updated the prompts recently. I'll update the docs to mention this more clearly.
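
In the meantime, the tokenizer value in a model entry can, I believe, be either a hub model id or explicit file URLs. A rough sketch of the relevant part of a MODELS entry (illustrative only, not the full schema; the model name and tokenizer repo are the ones discussed in this thread):

    {
      "name": "accounts/fireworks/models/llama-v3p1-8b-instruct",
      "tokenizer": {
        "tokenizerUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/resolve/main/tokenizer.json",
        "tokenizerConfigUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/raw/main/tokenizer_config.json"
      }
    }

With this set, chat-ui loads the tokenizer and applies its chat template, instead of relying on a hand-written chatPromptTemplate.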

nsarrazin changed the title from "Please update PROMPTS.md with the production prompts added to HuggingFace" to "Update docs to explain how to use tokenizer field for chat prompt formats" Sep 20, 2024
nsarrazin added the documentation label Sep 20, 2024
horsten (Author) commented Sep 20, 2024

Thanks, I suspected you were doing something like that. My problem is that I'm using a somewhat hackish hybrid solution to get tools support working. I'm running inference on fireworks.ai, which only offers its own version of the OpenAI API endpoint plus a custom one, and neither gives me sufficient control over the template. So I had to use the completions API instead of chat completions and format the prompt in chat-ui with JavaScript (this is trivial in Python, where the standard template is easy to apply with Jinja2, but less so in JS). For now I've implemented a hardcoded template generator, sketched below.
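
Roughly, the generator looks like this (a simplified sketch with my own naming and no tool-call handling; the special tokens follow the published Llama 3.1 chat format):

    interface Message {
      role: "system" | "user" | "assistant";
      content: string;
    }

    // Hardcoded Llama 3.1 chat format: each turn is wrapped in
    // <|start_header_id|>role<|end_header_id|> ... <|eot_id|>, and the prompt
    // ends with an open assistant header so the model completes the reply.
    function buildLlama3Prompt(messages: Message[]): string {
      let prompt = "<|begin_of_text|>";
      for (const m of messages) {
        prompt += `<|start_header_id|>${m.role}<|end_header_id|>\n\n${m.content}<|eot_id|>`;
      }
      return prompt + "<|start_header_id|>assistant<|end_header_id|>\n\n";
    }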

I didn't think I could load the tokenizer at all unless I was using a TGI endpoint, but maybe I'm wrong? A quick attempt suggests I can't:

    err: {
      "type": "TypeError",
      "message": "this.added_tokens.toSorted is not a function",
      "stack":
          TypeError: this.added_tokens.toSorted is not a function
              at new PreTrainedTokenizer (file:///home/th/sec/src/llmweb/chat-ui/node_modules/@huggingface/transformers/dist/transformers.mjs:22814:18)
              at Module.getTokenizer (/home/th/sec/src/llmweb/chat-ui/src/lib/utils/getTokenizer.ts:12:12)
              at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
              at async Object.endpointOai [as openai] (/home/th/sec/src/llmweb/chat-ui/src/lib/server/endpoints/openai/endpointOai.ts:76:19)
              at async Object.getEndpoint (/home/th/sec/src/llmweb/chat-ui/src/lib/server/models.ts:270:20)
              at async Object.start (/home/th/sec/src/llmweb/chat-ui/src/routes/conversation/[id]/+server.ts:330:21)
    }
    [12:01:32.743] ERROR (2642393): Failed to load tokenizer for model accounts/fireworks/models/llama-v3p1-8b-instruct consider setting chatPromptTemplate manually or making sure the model is available on the hub.

(I put `"tokenizer": {"tokenizerUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/resolve/main/tokenizer.json", "tokenizerConfigUrl": "https://huggingface.co/nsarrazin/llama3.1-tokenizer/raw/main/tokenizer_config.json"}` in my model definition, checked that I can fetch both files from the server with curl, and tried to get the tokenizer with `tokenizer = await getTokenizer(m.tokenizer)`, the same way it's done in models.ts.)

EDIT: I now see it's not that, and it should work; it looks like some kind of dependency issue, which could hint that the dependencies in package.json need an update. EDIT 2: I found that `npm upgrade @huggingface/transformers` was enough, and I now have the tokenizer working, so I can scrap the ugly hack I'd made for template generation.
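
For anyone else who hits the toSorted error: Array.prototype.toSorted is ES2023, so it also needs a reasonably recent runtime (Node 20+), and newer @huggingface/transformers releases presumably avoid relying on it. After the upgrade, this is roughly how I verified the tokenizer and template outside chat-ui (a sketch, assuming the hub repo above loads directly via AutoTokenizer):

    import { AutoTokenizer } from "@huggingface/transformers";

    // Load tokenizer.json + tokenizer_config.json from the hub repo
    // (the same files referenced above via tokenizerUrl/tokenizerConfigUrl).
    const tokenizer = await AutoTokenizer.from_pretrained("nsarrazin/llama3.1-tokenizer");

    // Render the chat template to a plain string (tokenize: false) and append
    // the open assistant header (add_generation_prompt: true), so the result
    // is ready to send to a completions endpoint.
    const prompt = tokenizer.apply_chat_template(
      [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" },
      ],
      { tokenize: false, add_generation_prompt: true }
    );

    console.log(prompt);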

Can you provide any insight into how I should integrate tools support "cleanly" in my scenario? I'm currently relying on a bunch of hacks based on (outdated) documentation, guesswork, and experimentation, and it could work better...
