-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added new AWS Attributes for various services #1794
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please review existing semantic conventions:
- there are several existing attributes that should be used instead of AAWS-specific ones
- some attributes should be defined as generic ones
We avoid defining attributes alone without also defining spans, metrics, events, or resources that reference them, so please add corresponding signal definitions
model/aws/registry.yaml
Outdated
brief: > | ||
The unique identifier of the AWS Bedrock Guardrail. A [guardrail](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) helps safeguard and prevent unwanted behavior from model responses or user messages. | ||
examples: ["sgi5gkybzqak"] | ||
- id: aws.bedrock.knowledge_base.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'd need a generic attribute for this one, but it's not defined yet. I'd be fine with adding one, but it's likely to be deprecated in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the generic one is likely to be deprecated in the future, I'm in favor of keeping these as AWS specific ones. Does that work?
The name of the AWS Kinesis [stream](https://docs.aws.amazon.com/streams/latest/dev/introduction.html) the request refers to. Corresponds to the `--stream-name` parameter of the | ||
Kinesis [describe-stream](https://docs.aws.amazon.com/cli/latest/reference/kinesis/describe-stream.html) operation. | ||
examples: ["some-stream-name"] | ||
- id: registry.aws.stepfunctions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @open-telemetry/lambda-extension-approvers
brief: > | ||
This document defines attributes for AWS SQS. | ||
attributes: | ||
- id: aws.sqs.queue.url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we record server.address
(host), server.port
(port) and messaging.destination.name
(queue name) in generic messaging semantic conventions, would url be still useful in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can probably still use messaging.destination.name
in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking a look at our use case, we can end up having both the queue URL and the queue name. Using the same attribute for both will not work in this case :(
@@ -481,3 +488,96 @@ groups: | |||
and [upload-part-copy](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part-copy.html) operations. | |||
The `part_number` attribute corresponds to the `--part-number` parameter of the | |||
[upload-part operation within the S3 API](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part.html). | |||
- id: registry.aws.sqs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we need any aws.sqs attributes on top of generic messaging, please add a span definition for it and reference all applicable attributes, see kafka yaml and md conventions as an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add those as well.
model/aws/registry.yaml
Outdated
brief: > | ||
The ARN of the AWS Step Functions State Machine. | ||
examples: ["arn:aws:states:us-east-1:123456789012:stateMachine:myStateMachine:1"] | ||
- id: registry.aws.bedrock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @open-telemetry/semconv-genai-approvers and specifically @xrmx who's adding bedrock instrumentation in python
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pinging me, I think I won't need any of the proposed attributes in my work atm (and so have not much clue about them) but I'll keep an eye on this PR.
@@ -481,3 +488,96 @@ groups: | |||
and [upload-part-copy](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part-copy.html) operations. | |||
The `part_number` attribute corresponds to the `--part-number` parameter of the | |||
[upload-part operation within the S3 API](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part.html). | |||
- id: registry.aws.sqs | |||
type: attribute_group | |||
display_name: Amazon SQS Attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @open-telemetry/semconv-messaging-approvers
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
This PR was marked stale due to lack of activity. It will be closed in 7 days. |
brief: > | ||
The unique identifier of the AWS Bedrock Guardrail. A [guardrail](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) helps safeguard and prevent unwanted behavior from model responses or user messages. | ||
examples: ["sgi5gkybzqak"] | ||
- id: gen_ai.aws.bedrock.knowledge_base.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're slowly updating this repo to consistently namespace provider-specific attributes following #1708.
The suggestion here is to remove gen_ai
prefix from gen_ai.aws.*
attributes - this would keep it consistent with the upcoming formal guidance and also with existing aws
attributes that don't have any prefix.
@@ -218,6 +218,11 @@ groups: | |||
Client-side operations are actions taken on the user's end or within the client application. | |||
Datastore: A tool used by the agent to access and query structured or unstructured external data for retrieval-augmented tasks or knowledge updates. | |||
examples: ['function', 'extension', 'datastore'] | |||
- id: gen_ai.data_source.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documenting quick research on the terminology used by others:
- Google calls them data stores
- OpenAI uses vector stores,
- Azure OpenAI uses data sources
- Llamaindex calls them data sources
So gen_ai.data_source.id
sounds good to me.
- id: gen_ai.data_source.id | ||
stability: development | ||
type: string | ||
brief: The data source identifier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we explain it a bit more? E.g. something along these lines:
brief: The data source identifier. | |
brief: The data source identifier. | |
note: Data sources are used by AI agents and RAG applications to store grounding data. A data source may be an external database, object store, document collection, website, or any other storage system used by the GenAI agent or application. The `gen_ai.data_source.id` SHOULD match the identifier used by the GenAI system rather than a name specific to the external storage, such as a database or object store. Semantic conventions referencing `gen_ai.data_source.id` MAY also leverage additional attributes, such as `db.*`, to further identify and describe the data source. |
brief: > | ||
The unique identifier of the AWS Bedrock Guardrail. A [guardrail](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) helps safeguard and prevent unwanted behavior from model responses or user messages. | ||
examples: ["sgi5gkybzqak"] | ||
- id: gen_ai.aws.bedrock.knowledge_base.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a difference between knowledge_base and data_source? do we need both ids?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From AWS doc "A data source contains files or content with information that can be retrieved when your knowledge base is queried.", so two different ids sounds legit to track the source of the data when you have more than one data source in your knowledge base (up to 5 from a 2024 post).
requirement_level: recommended | ||
|
||
- id: span.aws.sns.client | ||
extends: span.aws.client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the result looks right in the markdown - the span does not follow messaging semantic conventions. Take a look at other messaging spans - https://github.com/open-telemetry/semantic-conventions/tree/main/docs/messaging they inherit from one of the messaging groups and include typical messaging attributes. Here's e.g. kafka yaml
|
||
- id: span.aws.stepfunctions.client | ||
extends: span.aws.client | ||
type: span |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need those defined as spans? I don't think any of them is referenced in markdown and given "Attributes that exist for Step Functions request types.
explanation, it sounds like this group does not describe any span.
If you want to document them as spans, could you please document them in markdown including span name format and link from aws-sdk page?
requirement_level: required | ||
- ref: gen_ai.aws.bedrock.knowledge_base.id | ||
requirement_level: recommended | ||
- ref: gen_ai.data_source.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be referenced in span.gen_ai.client.create_agent
since it's applicable to other agents, not just bedrock
@@ -52,6 +52,7 @@ and the [cloud resource conventions][cloud]. The following AWS Lambda-specific a | |||
| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | | |||
|---|---|---|---|---|---| | |||
| [`aws.lambda.invoked_arn`](/docs/attributes-registry/aws.md) | string | The full invoked ARN as provided on the `Context` passed to the function (`Lambda-Runtime-Invoked-Function-Arn` header on the `/runtime/invocation/next` applicable). [1] | `arn:aws:lambda:us-east-1:123456:function:myfunction:myalias` | `Recommended` |  | | |||
| [`aws.lambda.resource_mapping.id`](/docs/attributes-registry/aws.md) | string | The UUID of the [AWS Lambda EvenSource Mapping](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-lambda-eventsourcemapping.html). An event source is mapped to a lambda function. It's contents are read by Lambda and used to trigger a function. | `587ad24b-03b9-4413-8202-bbd56b36e5b7` | `Recommended` |  | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @open-telemetry/lambda-extension-approvers could you please take a look?
linkTitle: AWS Bedrock | ||
---> | ||
|
||
# Semantic conventions for AWS Bedrock operations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a link to this (and other new AWS docs) from https://github.com/open-telemetry/semantic-conventions/blob/78c42c576a25743902192466cf7ff81889bf3630/docs/cloud-providers/aws-sdk.md#aws-service-specific-attributes
@@ -297,3 +297,57 @@ groups: | |||
requirement_level: recommended | |||
- ref: aws.s3.part_number | |||
requirement_level: recommended | |||
|
|||
- id: span.aws.sqs.client | |||
extends: span.aws.client |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JFYI: there is an active issue against this span definition - #1545, from semconv perspective we'd like to fix it at some point which may cause breaking changes if you use rpc.*
attributes in AWS SDK instrumentations.
would appreciate your input on this.
Changes
Added new AWS Attributes for various services including SQS, SNS, Step Functions, Bedrock as well as a new attribute for Lambda.
Note: if the PR is touching an area that is not listed in the existing areas, or the area does not have sufficient domain experts coverage, the PR might be tagged as experts needed and move slowly until experts are identified.
Merge requirement checklist
[chore]