Capture number of tokens in a request and response when possible #373

Open
jwmatthews opened this issue Sep 17, 2024 · 4 comments

Comments

@jwmatthews
Member

We've run into a few situations where it would benefit us if we had a better view of the number of tokens consumed in a request and response.

Let's augment the data we are capturing for tracing and add in any extra info we may get back from the LLM via 'response_metadata'.

Current understanding is that for some models, the response includes metadata that breaks out the number of tokens used in the request and the response.
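As a rough illustration of the idea, here is a minimal sketch of pulling token counts out of a `response_metadata`-style dictionary for tracing. The function name `extract_token_usage` is hypothetical, and the key names (`token_usage`, `usage`, `prompt_tokens`, `input_tokens`, and so on) are assumptions about what some providers send back; they would need to be checked against each model actually in use.

```python
from collections.abc import Mapping
from typing import Any, Optional

# Assumed key names; providers differ, so this table is a guess to extend.
_USAGE_CONTAINERS = ("token_usage", "usage")
_INPUT_KEYS = ("prompt_tokens", "input_tokens")
_OUTPUT_KEYS = ("completion_tokens", "output_tokens")


def _first_int(data: Mapping, keys: tuple) -> Optional[int]:
    """Return the first integer value found under any of the given keys."""
    for key in keys:
        value = data.get(key)
        if isinstance(value, int):
            return value
    return None


def extract_token_usage(response_metadata: Optional[Mapping]) -> Optional[dict]:
    """Hypothetical helper: normalize token counts for the trace record.

    Returns None when the metadata carries no recognizable usage block,
    so callers can tell "no data" apart from "zero tokens".
    """
    if not response_metadata:
        return None
    for container in _USAGE_CONTAINERS:
        usage = response_metadata.get(container)
        if not isinstance(usage, Mapping):
            continue
        request_tokens = _first_int(usage, _INPUT_KEYS)
        response_tokens = _first_int(usage, _OUTPUT_KEYS)
        if request_tokens is None and response_tokens is None:
            continue
        return {
            "request_tokens": request_tokens or 0,
            "response_tokens": response_tokens or 0,
        }
    return None
```

Returning `None` instead of zeros for models that send nothing back keeps the trace honest about which providers actually report usage.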

@jwmatthews
Member Author

@devjpt23 has begun to work on this issue. I wasn't yet able to formally assign him to this issue.

Looks like I can only assign issues to folks in the Konveyor Org, so I formed a new team of 'Collaborators' and invited @devjpt23 to it so that he can be assigned future issues.

@jwmatthews
Member Author

#375 adds the ability to log token request/response usage on successful calls for some models that send back a 'token_usage' in response metadata.

We would like to extend the capability beyond what #375 offers.

  • Pre-compute an estimate of the tokens consumed by a prompt before sending it, and log that estimate.
  • On a failure response, check whether there is any response metadata on token usage we can find.
  • Explore other providers that do not show token usage with #375 (Add metadata to tracing). Amazon Bedrock is one provider that did not log token data.
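The first two items above could be sketched roughly as follows. Everything here is an assumption for illustration: the 4-characters-per-token ratio is only a common rule of thumb for English text (a real tokenizer for the target model would be more accurate), the `response_metadata` attribute on the exception is hypothetical, and the logger name and function names are made up.

```python
import logging
import math
from collections.abc import Mapping
from typing import Optional

logger = logging.getLogger("token_trace")  # hypothetical logger name

CHARS_PER_TOKEN = 4  # assumption: rough average for English text


def estimate_prompt_tokens(prompt: str) -> int:
    """Guess the token count of a prompt before it is sent."""
    return math.ceil(len(prompt) / CHARS_PER_TOKEN)


def usage_from_failure(error: BaseException) -> Optional[Mapping]:
    """Look for token usage attached to a failed call, if any.

    Assumes the exception may carry a 'response_metadata' mapping;
    this attribute is hypothetical and provider-specific.
    """
    metadata = getattr(error, "response_metadata", None)
    if isinstance(metadata, Mapping):
        for key in ("token_usage", "usage"):  # assumed key names
            usage = metadata.get(key)
            if isinstance(usage, Mapping):
                return usage
    return None


def log_prompt_estimate(prompt: str) -> int:
    """Log the pre-send estimate so it shows up next to the trace data."""
    estimate = estimate_prompt_tokens(prompt)
    logger.info("estimated prompt tokens: %d", estimate)
    return estimate
```

Logging the estimate next to whatever the provider later reports (when it reports anything) would also show how far off the heuristic is for a given model.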

@jwmatthews
Member Author

Let's leave this open. We need someone to dive back into this area and assess whether we lost this ability after the refactoring to agentic workflows.

We are tracking #568 to have someone go back over guidance for troubleshooting/trace information with the new agentic workflows. This is a good contender to add to that task.

@shawn-hurley
Contributor

Removing from next up then. I think that if it is a task of the other one, or highly related, it shouldn't be in next up. If you disagree, let me know.

@shawn-hurley shawn-hurley removed the priority/nextup Issues we want to address soon label Jan 22, 2025