From f0869ea007e31dfdd41d4d20c3e9b5475918ac54 Mon Sep 17 00:00:00 2001 From: Mateusz Rzeszutek Date: Mon, 17 May 2021 19:11:48 +0200 Subject: [PATCH] Update documentation according to GDI spec (#226) --- MIGRATING.md | 110 ++++++++++++++++++++++------------------ README.md | 26 ++++++---- docs/advanced-config.md | 66 ++++++++++++++++-------- docs/troubleshooting.md | 44 ++++++++++++++-- 4 files changed, 165 insertions(+), 81 deletions(-) diff --git a/MIGRATING.md b/MIGRATING.md index f5eaa4337..33d413a0b 100644 --- a/MIGRATING.md +++ b/MIGRATING.md @@ -29,36 +29,51 @@ Follow these steps to migrate from the SignalFx Java Agent to the Splunk distribution of Splunk Distribution of OpenTelemetry Java Instrumentation: 1. Download the [latest release](https://github.com/signalfx/splunk-otel-java/releases/latest/download/splunk-otel-javaagent-all.jar) - of the Splunk Distribution of OpenTelemetry Java Instrumentation. For example use: + of the Splunk Distribution of OpenTelemetry Java Instrumentation. For example use: ```bash - $ # download the newest version of the agent - $ curl -vsSL -o splunk-otel-javaagent-all.jar 'https://github.com/signalfx/splunk-otel-java/releases/latest/download/splunk-otel-javaagent-all.jar' + curl -vsSL -o splunk-otel-javaagent-all.jar 'https://github.com/signalfx/splunk-otel-java/releases/latest/download/splunk-otel-javaagent-all.jar' ``` 2. Set the service name. This is how you can identify the service in Splunk APM. An example how to set it using an environment variable: ```bash - $ EXPORT OTEL_RESOURCE_ATTRIBUTES=service.name=my-java-app + export OTEL_RESOURCE_ATTRIBUTES=service.name=my-java-app ``` or a system property: ``` -Dotel.resource.attributes=service.name=my-java-app ``` -3. Specify the endpoint of the SignalFx Smart Agent or OpenTelemetry Collector - you're exporting traces to. You can set the endpoint with a system property - or environment variable. - - An example how to set it using an environment variable: +3. 
Specify the endpoint of the OpenTelemetry Collector or SignalFx Smart Agent you're exporting traces to. + Depending on which one you use, you might have to switch the trace exporter. The Splunk Distribution of + OpenTelemetry Java Instrumentation uses the OTLP traces exporter as the default - which is only supported by the + OpenTelemetry Collector. If you wish to use the SignalFx Smart Agent, you have to switch to the Jaeger exporter. + + You can configure the trace exporter with a system property or environment variable. + + An example of setting the OTLP endpoint using an environment variable: + ```bash + export OTEL_EXPORTER_OTLP_ENDPOINT="http://yourEndpoint:4317" + ``` + or a system property: + ``` + -Dotel.exporter.otlp.endpoint=http://yourEndpoint:4317 ``` - $ EXPORT OTEL_EXPORTER_JAEGER_ENDPOINT="http://yourEndpoint:9080/v1/trace" + The default value is `http://localhost:4317`. If you're exporting traces to an OpenTelemetry Collector + deployed on localhost, you don't have to modify this configuration setting. + + To use the Jaeger exporter you must additionally set the `otel.traces.exporter` configuration option. + + An example of how to do that using an environment variable: + ```bash + export OTEL_TRACES_EXPORTER="jaeger-thrift-splunk" + export OTEL_EXPORTER_JAEGER_ENDPOINT="http://yourEndpoint:9080/v1/trace" ``` or a system property: ``` - -Dotel.exporter.jaeger.endpoint=http://yourEndpoint:9080/v1/trace + -Dotel.traces.exporter=jaeger-thrift-splunk -Dotel.exporter.jaeger.endpoint=http://yourEndpoint:9080/v1/trace ``` - The default value is `http://localhost:9080/v1/trace`. If you're exporting - traces to a local Smart Agent, you don't have to modify this configuration - setting. + The default value for the Jaeger endpoint is `http://localhost:9080/v1/trace`. If you're exporting traces to a local + Smart Agent, you don't have to modify this configuration setting. 4.
In your application startup script, replace `-javaagent:./signalfx-tracing.jar` with `-javaagent:/path/to/splunk-otel-javaagent-all.jar`. 5. If you manually instrumented any code with an OpenTracing tracer, expose @@ -78,52 +93,51 @@ OpenTelemetry Java Instrumentation. ### Configuration setting changes -These SignalFx Java Agent system properties correspond to the following -OpenTelemetry system properties (NOTE: some properites are exporter-specific, the default is `jaeger`): +These SignalFx Java Agent system properties correspond to the following OpenTelemetry system properties: -| SignalFx system property | OpenTelemetry system property | -| ------------------------ | ----------------------------- | -| `signalfx.service.name` | `otel.resource.attributes=service.name=` | -| `signalfx.env` | `otel.resource.attributes=deployment.environment=` | -| `signalfx.endpoint.url` | `otel.exporter.jaeger.endpoint` | -| `signalfx.tracing.enabled` | `otel.javaagent.enabled` | -| `signalfx.integration..enabled=false` | `otel.instrumentation..enabled=false` | -| `signalfx.span.tags` | `otel.resource.attributes=` | -| `signalfx.trace.annotated.method.blacklist` | `otel.trace.annotated.methods.exclude` | -| `signalfx.trace.methods` | `otel.trace.methods` | -| `signalfx.server.timing.context` | `splunk.context.server-timing.enabled` | +| SignalFx system property | OpenTelemetry system property | +| ------------------------------------------- | ----------------------------- | +| `signalfx.service.name` | `otel.resource.attributes=service.name=` +| `signalfx.env` | `otel.resource.attributes=deployment.environment=` +| `signalfx.endpoint.url` | `otel.exporter.otlp.endpoint` or `otel.exporter.jaeger.endpoint`, depending on which trace exporter you're using (OTLP is the default one) +| `signalfx.tracing.enabled` | `otel.javaagent.enabled` +| `signalfx.integration..enabled=false` | `otel.instrumentation..enabled=false` +| `signalfx.span.tags` | `otel.resource.attributes=` +| 
`signalfx.trace.annotated.method.blacklist` | `otel.trace.annotated.methods.exclude` +| `signalfx.trace.methods` | `otel.trace.methods` +| `signalfx.server.timing.context` | `splunk.trace-response-header.enabled` -Note: when setting both `service name` and `environment` appropriate `otel.resource.attributes` property setting will -look like this: `otel.resource.attributes=service.name=myService,deployment.environment=myEnvironment` +Note: when setting both `service name` and `environment` appropriate `otel.resource.attributes` property setting will +look like this: `otel.resource.attributes=service.name=myService,deployment.environment=myEnvironment` Additional info about disabling a particular instrumentation can be found in the [OpenTelemetry Java Instrumentation docs](https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/suppressing-instrumentation.md). These SignalFx Java Agent environment variables correspond to the following OpenTelemetry environment variables: -| SignalFx environment variable | OpenTelemetry environment variable | -| ----------------------------- | ---------------------------------- | -| `SIGNALFX_SERVICE_NAME` | `OTEL_RESOURCE_ATTRIBUTES=service.name=` | -| `SIGNALFX_ENV` | `OTEL_RESOURCE_ATTRIBUTES=deployment.environment=` | -| `SIGNALFX_ENDPOINT_URL` |`OTEL_EXPORTER_JAEGER_ENDPOINT` | -| `SIGNALFX_TRACING_ENABLED` | `OTEL_JAVAAGENT_ENABLED` | -| `SIGNALFX_INTEGRATION__ENABLED=false` | `OTEL_INSTRUMENTATION__ENABLED=false` | -| `SIGNALFX_SPAN_TAGS` | `OTEL_RESOURCE_ATTRIBUTES` | -| `SIGNALFX_TRACE_ANNOTATED_METHOD_BLACKLIST` | `OTEL_TRACE_ANNOTATED_METHODS_EXCLUDE` | -| `SIGNALFX_TRACE_METHODS` | `OTEL_TRACE_METHODS` | -| `SIGNALFX_SERVER_TIMING_CONTEXT` | `SPLUNK_CONTEXT_SERVER_TIMING_ENABLED` | +| SignalFx environment variable | OpenTelemetry environment variable | +| ------------------------------------------- | ---------------------------------- | +| `SIGNALFX_SERVICE_NAME` | 
`OTEL_RESOURCE_ATTRIBUTES=service.name=` +| `SIGNALFX_ENV` | `OTEL_RESOURCE_ATTRIBUTES=deployment.environment=` +| `SIGNALFX_ENDPOINT_URL` | `OTEL_EXPORTER_OTLP_ENDPOINT` or `OTEL_EXPORTER_JAEGER_ENDPOINT`, depending on which trace exporter you're using (OTLP is the default one) +| `SIGNALFX_TRACING_ENABLED` | `OTEL_JAVAAGENT_ENABLED` +| `SIGNALFX_INTEGRATION__ENABLED=false` | `OTEL_INSTRUMENTATION__ENABLED=false` +| `SIGNALFX_SPAN_TAGS` | `OTEL_RESOURCE_ATTRIBUTES` +| `SIGNALFX_TRACE_ANNOTATED_METHOD_BLACKLIST` | `OTEL_TRACE_ANNOTATED_METHODS_EXCLUDE` +| `SIGNALFX_TRACE_METHODS` | `OTEL_TRACE_METHODS` +| `SIGNALFX_SERVER_TIMING_CONTEXT` | `SPLUNK_TRACE_RESPONSE_HEADER_ENABLED` These SignalFx Java Agent system properties and environment variables don't have corresponding configuration options with the Spunk Distribution for OpenTelemetry Java Instrumentation: -| System property | Environment variable | -| --------------- | -------------------- | -| `signalfx.agent.host` | `SIGNALFX_AGENT_HOST` | -| `signalfx.db.statement.max.length` | `SIGNALFX_DB_STATEMENT_MAX_LENGTH` | -| `signalfx.recorded.value.max.length` | `SIGNALFX_RECORDED_VALUE_MAX_LENGTH` | -| `signalfx.max.spans.per.trace` | `SIGNALFX_MAX_SPANS_PER_TRACE` | -| `signalfx.max.continuation.depth` | `SIGNALFX_MAX_CONTINUATION_DEPTH` | +| System property | Environment variable | +| ------------------------------------ | -------------------- | +| `signalfx.agent.host` | `SIGNALFX_AGENT_HOST` +| `signalfx.db.statement.max.length` | `SIGNALFX_DB_STATEMENT_MAX_LENGTH` +| `signalfx.recorded.value.max.length` | `SIGNALFX_RECORDED_VALUE_MAX_LENGTH` +| `signalfx.max.spans.per.trace` | `SIGNALFX_MAX_SPANS_PER_TRACE` +| `signalfx.max.continuation.depth` | `SIGNALFX_MAX_CONTINUATION_DEPTH` ### Log injection changes @@ -152,7 +166,7 @@ Distribution of OpenTelemetry Java Instrumentation, see The `@Trace` annotation that the SignalFx Java Agent uses is compatible with the Splunk Distribution of OpenTelemetry Java 
Instrumentation. If you're using the `@Trace` annotation for custom instrumentation, you don't have to make any -changes to maintain existing functionality. +changes to maintain existing functionality. If you want to configure new custom instrumentation and don't want to use the OpenTelemetry `getTracer` and API directly, use the OpenTelemetry `@WithSpan` diff --git a/README.md b/README.md index c838790e8..02b56774f 100644 --- a/README.md +++ b/README.md @@ -51,13 +51,13 @@ see [Migrate from the SignalFx Java Agent](MIGRATING.md). This distribution comes with the following defaults: -- [B3 context propagation](https://github.com/openzipkin/b3-propagation). -- [Jaeger-Thrift exporter](https://www.jaegertracing.io) - configured to send spans to a locally running [SignalFx Smart - Agent](https://docs.signalfx.com/en/latest/apm/apm-getting-started/apm-smart-agent.html) - (`http://localhost:9080/v1/trace`). -- Unlimited default limits for [configuration options](docs/advanced-config.md#trace-configuration) to - support full-fidelity traces. +- [W3C `tracecontext` context propagation](https://www.w3.org/TR/trace-context/). +- [OTLP traces exporter](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/README.md) + configured to send spans to a locally + running [Splunk OpenTelemetry Connector](https://github.com/signalfx/splunk-otel-collector) + (`http://localhost:4317`). +- Unlimited default limits for [configuration options](docs/advanced-config.md#trace-configuration) to support + full-fidelity traces. > :construction: This project is currently in **BETA**. It is **officially supported** by Splunk. However, breaking changes **MAY** be introduced. @@ -106,14 +106,22 @@ attribute as shown in the [example above](#getting-started). A couple other configuration options that may need to be changed or set are: -- Endpoint if not sending to a locally running Smart Agent with default - configuration. 
See the [Jaeger exporter](docs/advanced-config.md#jaeger-exporter) +- Endpoint if not sending to a locally running Splunk OpenTelemetry Connector with default + configuration. See the [exporters](docs/advanced-config.md#trace-exporters) configuration documentation for more information. - Environment resource attribute `deployment.environment` to specify what environment the span originated from. For example: ``` -Dotel.resource.attributes=service.name=my-java-app,deployment.environment=production ``` +- Service version resource attribute `service.version` to specify the version of your instrumented application. For + example: + ``` + -Dotel.resource.attributes=service.name=my-java-app,service.version=1.2.3 + ``` + +The `deployment.environment` and `service.version` resource attributes are not strictly required by the agent, but we +recommend setting them if they are available. The `otel.resource.attributes` syntax is described in detail in the [trace configuration](docs/advanced-config.md#trace-configuration) section. diff --git a/docs/advanced-config.md b/docs/advanced-config.md index 6ce1d1646..2568e3c6c 100644 --- a/docs/advanced-config.md +++ b/docs/advanced-config.md @@ -11,13 +11,13 @@ Below you will find all the configuration options supported by this distribution ## Splunk distribution configuration -| System property | Environment variable | Default value | Purpose | -| -------------------------------------- | -------------------------------------- | ------------------------------------ | ------- | -| `splunk.trace-response-header.enabled` | `SPLUNK_TRACE_RESPONSE_HEADER_ENABLED` | `true` | Enables adding server trace information to HTTP response headers. See [this document](server-trace-info.md) for more information. -| `splunk.access.token` | `SPLUNK_ACCESS_TOKEN` | unset | (Optional) Auth token allowing exporters to communicate directly with the Splunk cloud, passed as `X-SF-TOKEN` header. 
Currently only the [Jaeger span exporter](#jaeger-exporter) and [SignalFx metrics exporter](metrics.md) support this property. -| `splunk.metrics.enabled` | `SPLUNK_METRICS_ENABLED` | `true` | Enables exporting metrics. See [this document](metrics.md) for more information. -| `splunk.metrics.endpoint` | `SPLUNK_METRICS_ENDPOINT` | `http://localhost:9080/v2/datapoint` | The SignalFx metrics endpoint to connect to. -| `splunk.metrics.export.interval` | `SPLUNK_METRICS_EXPORT_INTERVAL` | `10000` | The interval between pushing metrics, in milliseconds. +| System property | Environment variable | Default value | Purpose | +| -------------------------------------- | -------------------------------------- | ----------------------- | ------- | +| `splunk.trace-response-header.enabled` | `SPLUNK_TRACE_RESPONSE_HEADER_ENABLED` | `true` | Enables adding server trace information to HTTP response headers. See [this document](server-trace-info.md) for more information. +| `splunk.access.token` | `SPLUNK_ACCESS_TOKEN` | unset | (Optional) Auth token allowing exporters to communicate directly with the Splunk cloud, passed as the `X-SF-TOKEN` header. Currently both the [Jaeger and OTLP trace exporters](#trace-exporters) and the [SignalFx metrics exporter](metrics.md) support this property. +| `splunk.metrics.enabled` | `SPLUNK_METRICS_ENABLED` | `true` | Enables exporting metrics. See [this document](metrics.md) for more information. +| `splunk.metrics.endpoint` | `SPLUNK_METRICS_ENDPOINT` | `http://localhost:9943` | The SignalFx metrics endpoint to connect to. +| `splunk.metrics.export.interval` | `SPLUNK_METRICS_EXPORT_INTERVAL` | `30000` | The interval between pushing metrics, in milliseconds. The SignalFx exporter can be configured to export metrics directly to Splunk ingest.
To achieve that, you need to set the `splunk.access.token` configuration property @@ -28,22 +28,56 @@ export SPLUNK_ACCESS_TOKEN=my_splunk_token export SPLUNK_METRICS_ENDPOINT=https://ingest.us0.signalfx.com ``` -## Jaeger exporter +## Trace exporters | System property | Environment variable | Default value | Description | | ------------------------------- | --------------------------------- | -------------------------------- | ----------- | -| `otel.traces.exporter` | `OTEL_TRACES_EXPORTER` | `jaeger-thrift-splunk` | Select the span exporter to use. +| `otel.traces.exporter` | `OTEL_TRACES_EXPORTER` | `otlp` | Select the traces exporter to use. We recommend using either the OTLP exporter (`otlp`) or the Jaeger exporter (`jaeger-thrift-splunk`). +| `otel.exporter.otlp.endpoint` | `OTEL_EXPORTER_OTLP_ENDPOINT` | `http://localhost:4317` | The OTLP endpoint to connect to. | `otel.exporter.jaeger.endpoint` | `OTEL_EXPORTER_JAEGER_ENDPOINT` | `http://localhost:9080/v1/trace` | The Jaeger endpoint to connect to. -The Jaeger exporter can be configured to export traces directly to Splunk ingest. -To achieve that, you need to set the `splunk.access.token` configuration property -and set the `otel.exporter.jaeger.endpoint` to Splunk ingest URL. For example: +The Splunk Distribution of OpenTelemetry Java Instrumentation uses the OTLP traces exporter as the default setting. +Please note that the OTLP format is not supported by the (now +deprecated) [SignalFx Smart Agent](https://github.com/signalfx/signalfx-agent). If you wish to use the Jaeger exporter +instead, you can set it by using the `otel.traces.exporter` configuration option. For example: + +```bash +export OTEL_TRACES_EXPORTER=jaeger-thrift-splunk +``` + + +Both OTLP and Jaeger exporters can be configured to export traces directly to Splunk ingest. 
To achieve that, you need +to set the `splunk.access.token` configuration property and set the `otel.exporter.otlp.endpoint` +(or `otel.exporter.jaeger.endpoint`) to the Splunk ingest URL. + +OTLP example: ```bash export SPLUNK_ACCESS_TOKEN=my_splunk_token +export OTEL_EXPORTER_OTLP_ENDPOINT=https://ingest.us0.signalfx.com/v2/trace +``` + +Jaeger example: + +```bash +export SPLUNK_ACCESS_TOKEN=my_splunk_token +export OTEL_TRACES_EXPORTER=jaeger-thrift-splunk export OTEL_EXPORTER_JAEGER_ENDPOINT=https://ingest.us0.signalfx.com/v2/trace ``` +## Trace propagation configuration + +| System property | Environment variable | Default value | Description | +| ------------------ | -------------------- | -------------------------------- | ----------- | +| `otel.propagators` | `OTEL_PROPAGATORS` | `tracecontext,baggage` | A comma-separated list of propagators that will be used. You can find the list of supported propagators [here](https://github.com/open-telemetry/opentelemetry-java/tree/main/sdk-extensions/autoconfigure#propagator). + +If you wish to be compatible with older versions of the Splunk Distribution of OpenTelemetry Java Instrumentation +(or the SignalFx Tracing Java Agent) you can set the trace propagator to B3: + +```bash +export OTEL_PROPAGATORS=b3multi +``` + ## Trace configuration | System property | Environment variable | Default value | Purpose | @@ -62,14 +96,6 @@ export OTEL_EXPORTER_JAEGER_ENDPOINT=https://ingest.us0.signalfx.com/v2/trace | ------------------------ | ------------------------ | -------------- | -------------------------------------------------| | `otel.javaagent.enabled` | `OTEL_JAVAAGENT_ENABLED` | `true` | Globally enables javaagent auto-instrumentation. | -## Deprecated configuration - -These configuration options will be removed in the future; if you're still using one of them please migrate!
- -| Deprecated configuration option | Replacement | Migration instructions | -| -------------------------------------- | -------------------------------------- | ---------------------- | -| `splunk.context.server-timing.enabled` | `splunk.trace-response-header.enabled` | The old property was renamed, the value and the way it works is exactly the same as it had been before. - ## Other OpenTelemetry Java agent configuration You can find all other Java agent configuration options diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md index 074b71ed5..4539e091e 100644 --- a/docs/troubleshooting.md +++ b/docs/troubleshooting.md @@ -33,6 +33,30 @@ If you find that any instrumentation is broken please do not hesitate to [file a ## Trace exporter issues +If you're unsure which trace exporter you are using, most likely it's the OTLP exporter - it's the default trace +exporter in the Splunk Distribution of OpenTelemetry Java Instrumentation. + +### OTLP exporter + +If you're seeing the following error in your logs: + +``` +[BatchSpanProcessor_WorkerThread-1] ERROR io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter - Failed to export spans. Server is UNAVAILABLE. Make sure your collector is running and reachable from this network. Full error message:UNAVAILABLE: io exception +``` + +then it means that the javaagent cannot send trace data to the OpenTelemetry Collector or Splunk cloud. + +1. Please make sure that `otel.exporter.otlp.endpoint` points to the correct host: an OpenTelemetry Collector instance + or the Splunk ingest URL. +2. If you're using the OpenTelemetry Collector, verify that the instance is up. +3. Please make sure that your OpenTelemetry Collector instance is properly configured and that the OTLP gRPC receiver is + enabled and plugged into the traces pipeline. +4. The OpenTelemetry Collector listens on the following address: `http://:4317`. Verify that your URL is correct. +5. 
If you're sending traces directly to the Splunk cloud, please verify that the `SPLUNK_ACCESS_TOKEN` is configured and + contains a valid access token. + +### Jaeger exporter + If you're seeing the following warnings in your logs: ``` @@ -47,9 +71,7 @@ Caused by: java.net.ConnectException: Connection refused (Connection refused) ... ``` -then it means that the javaagent cannot send trace data to the Smart Agent/Collector/Splunk backend. - -Assuming you're using the default Jaeger Thrift exporter: +then it means that the javaagent cannot send trace data to the Smart Agent/OpenTelemetry Collector/Splunk cloud. 1. Please make sure that `otel.exporter.jaeger.endpoint` points to the correct host: a Smart Agent or OpenTelemetry Collector instance, or the Splunk ingest URL. @@ -60,6 +82,18 @@ Assuming you're using the default Jaeger Thrift exporter: for the Jaeger receiver: the Agent uses `http://:9080/v1/trace` and the Collector uses `http://:14268/api/traces`. Verify that your URL is correct. +If you're sending spans directly to the Splunk cloud and getting the following errors: + +``` +[BatchSpanProcessor_WorkerThread-1] WARN io.opentelemetry.exporter.jaeger.thrift.JaegerThriftSpanExporter - Failed to export spans +io.jaegertracing.internal.exceptions.SenderException: Could not send 40 spans, response 401: Unauthorized + at io.jaegertracing.thrift.internal.senders.HttpSender.send(HttpSender.java:86) + ... +``` + +then it means that your `SPLUNK_ACCESS_TOKEN` setting is either missing or invalid. Please make sure that you use a +valid Splunk access token when sending telemetry directly to the Splunk cloud. + ## Metrics exporter issues If you see the following warning: @@ -79,6 +113,8 @@ or the Splunk backend. 4. Smart Agent and OpenTelemetry Collector by default use different ports for the SignalFx receiver: the Agent uses `http://:9080/v2/datapoint` and the Collector uses `http://:9943`. Verify that your URL is correct. -5. 
Metrics feature is still experimental - if you can't make it work or encounter +5. If you're sending metrics directly to the Splunk cloud, please verify that the `SPLUNK_ACCESS_TOKEN` is configured + and contains a valid access token. +6. The metrics feature is still experimental - if you can't make it work or encounter any unexpected issues you can [turn it off](advanced-config.md#splunk-distribution-configuration) and [file a bug](https://github.com/signalfx/splunk-otel-java/issues/new).
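
The access token check in the troubleshooting steps above (verify that `SPLUNK_ACCESS_TOKEN` is configured before sending telemetry directly to the Splunk cloud) could be sketched as a small pre-flight helper in a startup script. This is only an illustration: the function name `check_splunk_access_token` is hypothetical and not part of the agent, and the check only confirms that the variable is non-empty, not that the token is valid.

```bash
#!/usr/bin/env bash

# Hypothetical pre-flight helper (an assumption, not shipped with the agent).
# Returns 0 when SPLUNK_ACCESS_TOKEN is set to a non-empty value, 1 otherwise.
# It cannot tell whether the token is actually accepted by Splunk ingest.
check_splunk_access_token() {
  if [ -z "${SPLUNK_ACCESS_TOKEN:-}" ]; then
    echo "SPLUNK_ACCESS_TOKEN is not set or is empty" >&2
    return 1
  fi
  return 0
}
```

A startup script might call `check_splunk_access_token || exit 1` before launching the JVM with `-javaagent:/path/to/splunk-otel-javaagent-all.jar`, so that a missing token is reported up front rather than surfacing later as a `401: Unauthorized` export failure.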