Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added new AWS Attributes for various services #1794

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

AsakerMohd
Copy link

@AsakerMohd AsakerMohd commented Jan 23, 2025

Changes

Added new AWS Attributes for various services including SQS, SNS, Step Functions, Bedrock as well as a new attribute for Lambda.

Note: if the PR is touching an area that is not listed in the existing areas, or the area does not have sufficient domain experts coverage, the PR might be tagged as experts needed and move slowly until experts are identified.

Merge requirement checklist

@AsakerMohd AsakerMohd requested review from a team as code owners January 23, 2025 06:26
@github-actions github-actions bot added the enhancement New feature or request label Jan 23, 2025
Copy link
Contributor

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please review existing semantic conventions:

  • there are several existing attributes that should be used instead of AAWS-specific ones
  • some attributes should be defined as generic ones

We avoid defining attributes alone without also defining spans, metrics, events, or resources that reference them, so please add corresponding signal definitions

brief: >
The unique identifier of the AWS Bedrock Guardrail. A [guardrail](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) helps safeguard and prevent unwanted behavior from model responses or user messages.
examples: ["sgi5gkybzqak"]
- id: aws.bedrock.knowledge_base.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd need a generic attribute for this one, but it's not defined yet. I'd be fine with adding one, but it's likely to be deprecated in the future.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the generic one is likely to be deprecated in the future, I'm in favor of keeping these as AWS specific ones. Does that work?

The name of the AWS Kinesis [stream](https://docs.aws.amazon.com/streams/latest/dev/introduction.html) the request refers to. Corresponds to the `--stream-name` parameter of the
Kinesis [describe-stream](https://docs.aws.amazon.com/cli/latest/reference/kinesis/describe-stream.html) operation.
examples: ["some-stream-name"]
- id: registry.aws.stepfunctions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @open-telemetry/lambda-extension-approvers

brief: >
This document defines attributes for AWS SQS.
attributes:
- id: aws.sqs.queue.url
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we record server.address (host), server.port (port) and messaging.destination.name (queue name) in generic messaging semantic conventions, would url be still useful in this case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can probably still use messaging.destination.name in this case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking a look at our use case, we can end up having both the queue URL and the queue name. Using the same attribute for both will not work in this case :(

@@ -481,3 +488,96 @@ groups:
and [upload-part-copy](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part-copy.html) operations.
The `part_number` attribute corresponds to the `--part-number` parameter of the
[upload-part operation within the S3 API](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part.html).
- id: registry.aws.sqs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we need any aws.sqs attributes on top of generic messaging, please add a span definition for it and reference all applicable attributes, see kafka yaml and md conventions as an example.

Copy link
Author

@AsakerMohd AsakerMohd Feb 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add those as well.

brief: >
The ARN of the AWS Step Functions State Machine.
examples: ["arn:aws:states:us-east-1:123456789012:stateMachine:myStateMachine:1"]
- id: registry.aws.bedrock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @open-telemetry/semconv-genai-approvers and specifically @xrmx who's adding bedrock instrumentation in python

https://github.com/search?q=repo%3Aopen-telemetry%2Fopentelemetry-python-contrib+bedrock&type=pullrequests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pinging me, I think I won't need any of the proposed attributes in my work atm (and so have not much clue about them) but I'll keep an eye on this PR.

@@ -481,3 +488,96 @@ groups:
and [upload-part-copy](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part-copy.html) operations.
The `part_number` attribute corresponds to the `--part-number` parameter of the
[upload-part operation within the S3 API](https://docs.aws.amazon.com/cli/latest/reference/s3api/upload-part.html).
- id: registry.aws.sqs
type: attribute_group
display_name: Amazon SQS Attributes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @open-telemetry/semconv-messaging-approvers

Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added Stale and removed Stale labels Feb 13, 2025
@AsakerMohd AsakerMohd requested review from a team as code owners February 14, 2025 08:31
Copy link

github-actions bot commented Mar 2, 2025

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Mar 2, 2025
brief: >
The unique identifier of the AWS Bedrock Guardrail. A [guardrail](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) helps safeguard and prevent unwanted behavior from model responses or user messages.
examples: ["sgi5gkybzqak"]
- id: gen_ai.aws.bedrock.knowledge_base.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're slowly updating this repo to consistently namespace provider-specific attributes following #1708.

The suggestion here is to remove gen_ai prefix from gen_ai.aws.* attributes - this would keep it consistent with the upcoming formal guidance and also with existing aws attributes that don't have any prefix.

@@ -218,6 +218,11 @@ groups:
Client-side operations are actions taken on the user's end or within the client application.
Datastore: A tool used by the agent to access and query structured or unstructured external data for retrieval-augmented tasks or knowledge updates.
examples: ['function', 'extension', 'datastore']
- id: gen_ai.data_source.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documenting quick research on the terminology used by others:

So gen_ai.data_source.id sounds good to me.

- id: gen_ai.data_source.id
stability: development
type: string
brief: The data source identifier.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we explain it a bit more? E.g. something along these lines:

Suggested change
brief: The data source identifier.
brief: The data source identifier.
note: Data sources are used by AI agents and RAG applications to store grounding data. A data source may be an external database, object store, document collection, website, or any other storage system used by the GenAI agent or application. The `gen_ai.data_source.id` SHOULD match the identifier used by the GenAI system rather than a name specific to the external storage, such as a database or object store. Semantic conventions referencing `gen_ai.data_source.id` MAY also leverage additional attributes, such as `db.*`, to further identify and describe the data source.

brief: >
The unique identifier of the AWS Bedrock Guardrail. A [guardrail](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) helps safeguard and prevent unwanted behavior from model responses or user messages.
examples: ["sgi5gkybzqak"]
- id: gen_ai.aws.bedrock.knowledge_base.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a difference between knowledge_base and data_source? do we need both ids?

Copy link
Contributor

@xrmx xrmx Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From AWS doc "A data source contains files or content with information that can be retrieved when your knowledge base is queried.", so two different ids sounds legit to track the source of the data when you have more than one data source in your knowledge base (up to 5 from a 2024 post).

requirement_level: recommended

- id: span.aws.sns.client
extends: span.aws.client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the result looks right in the markdown - the span does not follow messaging semantic conventions. Take a look at other messaging spans - https://github.com/open-telemetry/semantic-conventions/tree/main/docs/messaging they inherit from one of the messaging groups and include typical messaging attributes. Here's e.g. kafka yaml


- id: span.aws.stepfunctions.client
extends: span.aws.client
type: span
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need those defined as spans? I don't think any of them is referenced in markdown and given "Attributes that exist for Step Functions request types. explanation, it sounds like this group does not describe any span.

If you want to document them as spans, could you please document them in markdown including span name format and link from aws-sdk page?

requirement_level: required
- ref: gen_ai.aws.bedrock.knowledge_base.id
requirement_level: recommended
- ref: gen_ai.data_source.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be referenced in span.gen_ai.client.create_agent since it's applicable to other agents, not just bedrock

@@ -52,6 +52,7 @@ and the [cloud resource conventions][cloud]. The following AWS Lambda-specific a
| Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability |
|---|---|---|---|---|---|
| [`aws.lambda.invoked_arn`](/docs/attributes-registry/aws.md) | string | The full invoked ARN as provided on the `Context` passed to the function (`Lambda-Runtime-Invoked-Function-Arn` header on the `/runtime/invocation/next` applicable). [1] | `arn:aws:lambda:us-east-1:123456:function:myfunction:myalias` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) |
| [`aws.lambda.resource_mapping.id`](/docs/attributes-registry/aws.md) | string | The UUID of the [AWS Lambda EvenSource Mapping](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-lambda-eventsourcemapping.html). An event source is mapped to a lambda function. It's contents are read by Lambda and used to trigger a function. | `587ad24b-03b9-4413-8202-bbd56b36e5b7` | `Recommended` | ![Development](https://img.shields.io/badge/-development-blue) |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @open-telemetry/lambda-extension-approvers could you please take a look?

linkTitle: AWS Bedrock
--->

# Semantic conventions for AWS Bedrock operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -297,3 +297,57 @@ groups:
requirement_level: recommended
- ref: aws.s3.part_number
requirement_level: recommended

- id: span.aws.sqs.client
extends: span.aws.client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JFYI: there is an active issue against this span definition - #1545, from semconv perspective we'd like to fix it at some point which may cause breaking changes if you use rpc.* attributes in AWS SDK instrumentations.

would appreciate your input on this.

@github-actions github-actions bot removed the Stale label Mar 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

4 participants