LLM common metrics for Generative AI #955
A question/observation:
Should we go through and just use "Gen AI Model" instead of "LLM" throughout the content of this document? Naming attributes `gen_ai.*` but then having the descriptions talk about an LLM feels a little inconsistent to me now.
I think we're already at the point where the same model supports multi-modal inputs and outputs. Consider the following in Claude's API reference:
Starting with Claude 3 models, you can also send image content blocks:
{"role": "user", "content": [
  {
    "type": "image",
    "source": {
      "type": "base64",
      "media_type": "image/jpeg",
      "data": "/9j/4AAQSkZJRg..."
    }
  },
  {"type": "text", "text": "What is in this image?"}
]}
And with OpenAI it's a little less straightforward, but still possible:
An array of content parts with a defined type, each can be of type `text` or `image_url` when passing in images. You can pass multiple images by adding multiple `image_url` content parts. Image input is only supported when using the `gpt-4-vision-preview` model.
It's orthogonal to this PR, but maybe it's good to start here and not "limit" ourselves by using the LLM terminology, since it's usually associated with just text interpretation and generation?
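To make the comparison concrete, here is a minimal Python sketch that builds a multi-modal user message in each of the two shapes discussed above. The helper names `claude_image_message` and `openai_image_message` are hypothetical, and the nested `{"url": ...}` layout of the `image_url` part is an assumption that is not spelled out in the quoted text. No API calls are made; the functions only build plain dictionaries.

```python
import base64


def claude_image_message(image_bytes, question, media_type="image/jpeg"):
    """Build a user message using the image content-block shape quoted above."""
    data = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": media_type,
                    "data": data,
                },
            },
            {"type": "text", "text": question},
        ],
    }


def openai_image_message(image_url, question):
    """Build a user message from `text` and `image_url` content parts.

    Assumption: the image part nests the URL as {"url": ...}.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }
```

In both cases a single message mixes text and image parts, which is the reason "LLM" reads as too narrow for the attribute descriptions.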
I agree with both addressing this soon and doing it separately from this PR. I've updated anything I could that does not impact Spans yet (keeping this as the metrics PR). Let's create another PR to update the other references to LLM.
07917b4 to b351811
LGTM! Left a few minor-ish comments
Thanks! All resolved.
Fixes #811
Changes
This adds initial metric definitions to the current set of gen_ai semantic conventions. These two initial metrics (gen_ai.usage.tokens and gen_ai.request.duration) are a minimal set to get started; more can be added in future PRs.
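As a rough illustration of how an instrumentation might emit these two metrics, here is a standard-library-only Python sketch. Only the metric names come from this PR. The `InMemoryRecorder` class, the `record_request` helper, the attribute keys (`model`, `token_type`), and the choice of seconds as the duration unit are all assumptions made for the example, and none of this is the OpenTelemetry API.

```python
import time
from collections import defaultdict

# Metric names are taken from the PR description; everything else is illustrative.
TOKENS_METRIC = "gen_ai.usage.tokens"
DURATION_METRIC = "gen_ai.request.duration"


class InMemoryRecorder:
    """Hypothetical stand-in for a metrics backend: stores raw measurements."""

    def __init__(self):
        self.points = defaultdict(list)

    def record(self, name, value, attributes):
        # Store the value together with a copy of its attributes.
        self.points[name].append((value, dict(attributes)))


def record_request(recorder, model, call):
    """Time `call()` and record its duration and token usage.

    `call` is a hypothetical callable returning (input_tokens, output_tokens).
    """
    attrs = {"model": model}  # assumed attribute key, not defined by the PR
    start = time.monotonic()
    input_tokens, output_tokens = call()
    elapsed = time.monotonic() - start  # seconds (assumed unit)

    recorder.record(DURATION_METRIC, elapsed, attrs)
    recorder.record(TOKENS_METRIC, input_tokens, {**attrs, "token_type": "input"})
    recorder.record(TOKENS_METRIC, output_tokens, {**attrs, "token_type": "output"})
    return elapsed


recorder = InMemoryRecorder()
record_request(recorder, "example-model", lambda: (12, 34))
```

Each request yields one duration measurement and two token measurements, one per token type, recorded under a single metric name and distinguished by an attribute.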
Merge requirement checklist
[chore]