We've noticed recently in a high-load project I'm involved in that not all DynamoDB spans are reaching our APM. I have traced this issue down to the Activity field in the AWSTracingPipelineHandler.
When the same AmazonDynamoDBClient is used concurrently to perform multiple async requests to DynamoDB, this field is overwritten by the most recent request. The Activities for the earlier requests become orphaned: they are never stopped and therefore never reach the collector. We confirmed that the same thing happens with concurrent requests to AmazonSQSClient, and I'm fairly sure it affects any other client with a RuntimePipeline. Note that AWS clients are designed to handle concurrent requests, and they work as expected without the OpenTelemetry instrumentation.
The fix is to make the Activity field an AsyncLocal. I tested this by building our project against the latest main of this repository instead of the NuGet package and applying the fix locally. I can confirm that it solves the problem of disappearing spans.
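For illustration only, and not the patch referred to in this issue: a minimal C# sketch of the failure mode and of the AsyncLocal approach. SharedFieldHandler, AsyncLocalHandler and Demo are hypothetical stand-ins, not the real AWSTracingPipelineHandler, and the sketch assumes .NET 7 or later for Activity.IsStopped.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical handler that keeps the in-flight Activity in a plain instance field.
public sealed class SharedFieldHandler
{
    private Activity? _activity; // one slot shared by every concurrent request

    public async Task InvokeAsync(string name, List<Activity> started)
    {
        _activity = new Activity(name).Start(); // overwrites any earlier request's Activity
        started.Add(_activity);
        await Task.Delay(20);                   // simulated network call
        _activity?.Stop();                      // stops whichever Activity was stored last
    }
}

// Hypothetical handler that keeps the Activity in an AsyncLocal instead.
public sealed class AsyncLocalHandler
{
    private readonly AsyncLocal<Activity?> _activity = new();

    public async Task InvokeAsync(string name, List<Activity> started)
    {
        _activity.Value = new Activity(name).Start(); // visible only to this async flow
        started.Add(_activity.Value);
        await Task.Delay(20);
        _activity.Value?.Stop();                      // stops this request's own Activity
    }
}

public static class Demo
{
    // Starts three overlapping requests on one handler and counts the stopped Activities.
    public static async Task<int> CountStoppedAsync(Func<string, List<Activity>, Task> invoke)
    {
        var started = new List<Activity>();
        var tasks = new List<Task>();
        for (var i = 0; i < 3; i++)
        {
            tasks.Add(invoke($"request-{i}", started));
        }
        await Task.WhenAll(tasks);
        return started.Count(a => a.IsStopped);
    }

    public static async Task Main()
    {
        Console.WriteLine(await CountStoppedAsync(new SharedFieldHandler().InvokeAsync));
        Console.WriteLine(await CountStoppedAsync(new AsyncLocalHandler().InvokeAsync));
    }
}
```

In this sketch the shared-field handler should stop only the last of the three Activities, because every continuation reads the same overwritten field, while the AsyncLocal variant should stop all three.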
Below is a patch with my fix. I can create a pull request if you prefer it that way.
Hey @snake-scaly, this has been fixed in the latest version of OpenTelemetry.Instrumentation.AWS. We are currently working on bringing those changes to our package.