Ability to specify the purpose/category/function parts of a service #1841

JoshuaWGreen · 2025-01-28T15:33:43Z

Area(s)

area:service

What's missing?

Background
Hosts are often dedicated to a particular purpose. For example, you may have some hosts that provide the database layer, others that provide web services, others that provide distributed compute, others that serve content (CDNs), etc.

Scenario: Slicing metrics
Given some trace-derived metrics, slice the metrics by the part of the stack that generated them. One could separate metrics related to the application layer from those of the database layer or services.

Scenario: Reducing Cardinality
With applications running in the cloud and scaling elastically, host.name becomes a highly variable property. While a static host running 50 services may emit metrics with a cardinality of service.name * host.name = 50, an elastic cluster that is frequently regenerated may produce a service.name * host.name cardinality in the thousands over the period of a week (ignoring all the other dimensions like span.name and different types of metrics). This is even more acute for compute clusters with many thousands of elastic nodes. Time series databases holding historic data may struggle with the high cardinality this produces. A work-around is to store metrics with a lower cardinality (but still meaningful) host identifier.
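
To make the cardinality argument concrete, here is a minimal sketch in plain Python (no OpenTelemetry SDK involved). The host counts, the service and node names, and the `host.purpose` values are assumptions chosen for illustration; `host.purpose` is the attribute proposed in this issue, not an existing semantic convention.

```python
from itertools import product


def series_cardinality(points, keys):
    """Count the distinct combinations of the given attribute keys."""
    return len({tuple(point[key] for key in keys) for point in points})


# Hypothetical fleet: 50 services, names invented for this example.
services = [f"svc-{i}" for i in range(50)]

# Static case: one long-lived host runs all 50 services.
static_points = [
    {"service.name": s, "host.name": "host-0", "host.purpose": "application"}
    for s in services
]

# Elastic case: assume 40 short-lived nodes appeared over a week and each
# ran every service at some point.
elastic_points = [
    {"service.name": s, "host.name": f"node-{n}", "host.purpose": "compute"}
    for s, n in product(services, range(40))
]

by_host_name = ("service.name", "host.name")
by_host_purpose = ("service.name", "host.purpose")

static_by_name = series_cardinality(static_points, by_host_name)
elastic_by_name = series_cardinality(elastic_points, by_host_name)
elastic_by_purpose = series_cardinality(elastic_points, by_host_purpose)
```

Under these assumptions, keying on host.name gives 50 series for the static host and 2000 for the elastic cluster, while keying the same elastic data on the proposed host.purpose brings it back to 50.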

Existing options
The existing semconv defines host.type. However, for cloud this is supposed to be the machine type. While machine types are often related to the purpose of a host, they are much more fine-grained. There could be a lot of types that could be used for (say) compute, the names vary per provider, and the set could expand over time. In addition, a machine type can be used for multiple purposes.

Describe the solution you'd like

Define a host.purpose attribute. This could have values like database, application, service, compute etc.

There are several words that could be used. Category feels too vague. Function is an overloaded word in software engineering so maybe better to avoid it in this context. It may be that placing this under host is not appropriate (after all, you could perform all parts of the stack on a single host) but I haven't found a better namespace so far.

While examples should be given, I don't think we should be too prescriptive about the granularity of the purpose. For example, some may be happy with compute hosts, while others may need to be more granular (e.g. weather.compute, ai.compute, physics.compute for hosts optimized for different types of compute).

trask · 2025-01-28T23:08:31Z

cc @open-telemetry/semconv-system-approvers

braydonk · 2025-01-29T14:36:57Z

My immediate thoughts are that the use cases are valid and it would be good to come up with a way to address them, but at the same time I'm not sure how to fit this into host.

All the attributes in host right now are objective; they are reflections of objective information about the host with clear instrumentation instructions. Something like host.purpose is subjective; it's the user's opinion for what the actual values should be, as it completely depends on the architecture of their own system. From the System Semconv group perspective, I'm not sure how to handle this kind of attribute.

I'll lean on @open-telemetry/specs-semconv-maintainers for this one, maybe there's prior art on this sort of thing we can refer to.

trask · 2025-01-30T03:06:48Z

it sounds maybe similar to #575?

JoshuaWGreen · 2025-01-31T10:13:37Z

@braydonk yes, host may not be the best domain for this idea.

@trask I think #575 discusses a new attribute at the service level. That proposed property would be constant for a given instance of a service. #575 is trying to classify the service itself (i.e. one deployment of a service with service.name=abc has service.(type|role)=agent, but another deployment of a service with service.name=abc has service.(type|role)=collector).

This issue covers the different parts/components of a service. So one instance of the service will have exactly one service.name but different parts of the trace emitted for a service would emit different values for purpose.

Reading #575 makes me think this idea fits more in the service domain than the host domain. Maybe service.component or service.part is more appropriate. I'll change the title to remove references to host.

So a service with service.name=weather.prediction may emit spans with service.component=server for the endpoint that receives client requests and orchestrates a response; service.component=database for spans emitted from the ORM (or database if the trace is propagated to the database which itself emits spans); service.component=compute when performing large compute operations. You'd see spans with all of these different service.component values in a single trace and it would allow you to isolate different parts of the trace (or metrics generated from traces) into the components that did different bits of work.
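
As a rough illustration of that slicing, the sketch below models spans as plain dictionaries and totals their durations per component. The span names and durations are invented for the example, and `service.component` is only the attribute name floated in this comment, not an existing semantic convention.

```python
from collections import defaultdict

# Hypothetical spans from a single trace of the weather.prediction service.
spans = [
    {"name": "GET /forecast", "service.name": "weather.prediction",
     "service.component": "server", "duration_ms": 12.0},
    {"name": "SELECT observations", "service.name": "weather.prediction",
     "service.component": "database", "duration_ms": 30.0},
    {"name": "run_model", "service.name": "weather.prediction",
     "service.component": "compute", "duration_ms": 400.0},
    {"name": "INSERT forecast", "service.name": "weather.prediction",
     "service.component": "database", "duration_ms": 8.0},
]


def duration_by_component(spans):
    """Sum span durations per (service.name, service.component) pair."""
    totals = defaultdict(float)
    for span in spans:
        key = (span["service.name"], span["service.component"])
        totals[key] += span["duration_ms"]
    return dict(totals)


totals = duration_by_component(spans)
```

All four spans share one service.name, yet the resulting metric has three series (server, database, compute), which is the per-component view described above.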

joaopgrassi · 2025-02-05T15:23:04Z

This seems closely related to entities. CC @open-telemetry/entities-maintainers

github-actions bot added triage:needs-triage area:host labels Jan 28, 2025

github-project-automation bot added this to DRAFT - SemConv Issue Triage Jan 28, 2025

github-project-automation bot moved this to Need triage in DRAFT - SemConv Issue Triage Jan 28, 2025

JoshuaWGreen changed the title ~~Ability to specify the purpose/category/function of a host~~ Ability to specify the purpose/category/function parts of a service Jan 31, 2025
