Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop converting nanosecond metrics #1322

Merged
merged 2 commits into from
Sep 3, 2024
Merged

Conversation

dricross
Copy link
Contributor

@dricross dricross commented Aug 29, 2024

Description of the issue

Currently, nanosecond metrics are converted to microseconds before being published to CloudWatch because CloudWatch does not support nanosecond unit. The closest unit is microseconds.

Unfortunately, this implicit conversion can be confusing for metrics with ns in the name like kafka.producer.io-wait-time-ns-avg because the value/unit in CloudWatch will actually be Microseconds which is inconsistent with the metric name. This also results in a loss of precision.

Description of changes

This change removes the conversion of nanosecond metrics. Instead, they are published unmodified with a unit of "None".

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Kicked off integration tests for that branch. Checked the metric_value_benchmark integ test outputs, search for kafka.producer.io-wait-time-ns-avg, find log messages like:

null_resource.integration_test_run (remote-exec): 2024/08/29 17:05:18 Metric query input dimensions
null_resource.integration_test_run (remote-exec): 2024/08/29 17:05:18 	Dimensions:
null_resource.integration_test_run (remote-exec): 2024/08/29 17:05:18 		Dim(name="InstanceId", val="i-078550b055ab2cccb")
null_resource.integration_test_run (remote-exec): 2024/08/29 17:05:18 Metric data input: namespace MetricValueBenchmarkJMXTest, name kafka.producer.io-wait-time-ns-avg, stat Average, period 10
null_resource.integration_test_run (remote-exec): 2024/08/29 17:05:18 Metric values are : [643476.2933982722 1.3033429029366784e+06 1.5008771342337618e+07]
null_resource.integration_test_run (remote-exec): 2024/08/29 17:05:18 metric values are [643476.2933982722 1.3033429029366784e+06 1.5008771342337618e+07]

Plot the kafka.producer.io-wait-time-ns-avg metric with that instance ID in CloudWatch. Do that for a few OS types for both my branch and main branch to compare. The updated metrics are about 1000x larger, which is expected. Also noted that the updated metrics do not have a unit associated.

Requirements

Before commit the code, please do the following steps.

  1. Run make fmt and make fmt-sh
  2. Run make lint

@dricross dricross requested a review from a team as a code owner August 29, 2024 18:48
@dricross dricross requested review from jefchien, lisguo and mitali-salvi and removed request for lisguo August 29, 2024 18:48
@dricross dricross merged commit c79d9bc into main Sep 3, 2024
238 of 342 checks passed
@dricross dricross deleted the dricross/remove-ns-conversion branch September 3, 2024 12:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants