Improve use of tracing spans in query path #25911

hiltontj · 2025-01-24T19:17:24Z

The purpose of this issue is to get a better understanding of how influxdb3 uses trace spans in the query path.

To better help diagnose and troubleshoot latency and performance issues in the query path, we may need to better leverage tracing spans in the query path, as what we have seen in results from the perf team, the information we are generating is sparse.

Objectives of this issue

analyze generated trace spans during query execution to see what information is being generated
determine which is being generated from iox_query and which is generated from this codebase
determine if there are any obvious gaps in that information
propose follow-up work

The text was updated successfully, but these errors were encountered:

MaduMitha-Ravi · 2025-01-24T19:29:11Z

@hiltontj Here are the trace spans that are collected for IOx/InfluxDB 3 projects. These trace spans are customizable and they can be added/removed/updated/renamed per need. The IOx team at times for their works/projects create new or sub spans that can be analyzed for performance.

Note: Below is the exhaustive list, and most of these traces are not captured.

IOx querier (ms)
execute_stream_partitioned (ms)
querier_table (ms)
sql_to_logical_plan (ms)
IOx ingester (ms)
table_snapshot (ms)
partition_snapshot (ms)
initial_request (ms)
stream_data_from_ingester (ms)
querier_flight_request_to_one_ingester (ms)
querier_table_chunks (ms)
querier_ingesters (ms)
prune_partitions (ms)
get_namespace (ms)
flight_planner (ms)
planner (ms)
get_metadata (ms)
loader (ms)
load_page_index (ms)
load_metadata (ms)
cache_metadata (ms)
prune_files (ms)
ParquetExec (ms)
sql_to_physical_plan (ms)
query_planning (ms)
acquire (ms)
permit (ms)
create_physical_plan (ms)

hiltontj · 2025-01-27T16:13:48Z

Note: Below is the exhaustive list, and most of these traces are not captured.

Thanks @MaduMitha-Ravi - just to be clear, by "most of these traces are not captured", do you mean that monolith is not capturing them, while IOx is capturing them?

...or do you mean that they are not captured in general by both monolith and IOx?

MaduMitha-Ravi · 2025-01-27T17:31:49Z

@hiltontj Yes, most of those trace spans are captured in IOx but not with Monolith. I will try to get more details today.

hiltontj · 2025-01-27T17:44:21Z

Got it - that is very helpful. I can look into how IOx uses those spans and see how we can replicate. I think that would be a good starting point to help surface useful information in the tracing for your team.

Beyond that, we can start adding monolith-specific traces.

MaduMitha-Ravi · 2025-01-28T01:51:22Z

Here is the sample from IOx from a latest run that returns data,

Measurement	Query Name	IOx querier (ms)	execute_stream_partitioned (ms)	querier_table (ms)	sql_to_logical_plan (ms)	IOx ingester (ms)	table_snapshot (ms)	partition_snapshot (ms)	initial_request (ms)	stream_data_from_ingester (ms)	querier_flight_request_to_one_ingester (ms)	querier_table_chunks (ms)	querier_ingesters (ms)	prune_partitions (ms)	get_namespace (ms)	flight_planner (ms)	ParquetExec (ms)	sql_to_physical_plan (ms)	query_planning (ms)	acquire (ms)	permit (ms)	create_physical_plan (ms)
AVG	s-q1	85.183	60.245	1.058	1.414	7.559	1.011	0.790	5.252	11.278	16.584	18.905	16.640	0.101	1.407	21.808	58.870	21.953	82.721	0.005	60.197	20.454

MaduMitha-Ravi · 2025-02-10T17:21:15Z

@hiltontj I do understand you have so many things that needs your focus, but wondering if there is an update on the Traces.

waynr · 2025-02-14T01:34:26Z

@pauldix I've got to the point where I roughly understand how tracing spans work in IOx & monolith and I've got a setup where I can send traces to jaeger in OSS (see #26006) but with respect to your comment in Slack:

Mostly, we need to know from the traces when queries are pulling Parquet files from cache or from object store. Then there's probably a bunch of other stuff, but I think that's the most important.

I think this might be tough to do if you want the cache/object store spans to inherit the parent span of the calling context since all object store access and caching is mediated through the ObjectStore trait, for example:

https://github.com/influxdata/influxdb_pro/blob/1efb2c8c6d96d37202a7c8c8f7134fd705cb249d/influxdb3_cache/src/parquet_cache/mod.rs#L700-L712

The reason this makes things difficult is that we need to be able to pass either a Span (https://github.com/influxdata/influxdb_iox/blob/7419ce7dc877d9073344e6c77221780142753322/trace/src/span.rs#L54) or a SpanContext (https://github.com/influxdata/influxdb_iox/blob/7419ce7dc877d9073344e6c77221780142753322/trace/src/ctx.rs#L44) from the caller but the ObjectStore trait methods are more or less fixed as they are (ie they don't take a Span or SpanContext).

I looked through the memory cache implementation in IOx and saw a similar lack of trace span reporting. Instead they seem to rely simply on reporting cache hit/miss/error metrics:

https://github.com/influxdata/influxdb_iox/blob/7419ce7dc877d9073344e6c77221780142753322/object_store_mem_cache/src/store.rs#L157-L161

There are two approaches i can think of to work around this:

Separate Span Hierarchy for cached ObjectStore implementation

With this approach we would create a separate span hierarchy just for object store / cache accesses when initializing the memory cache object store. The main problem with this is that when we are troubleshooting a problematic query span there will be nothing (as far as i can tell) that would tie the objectstore/cache access to the problematic query.

Use `object_store::Attribute` passed to `ObjectStore.*_opt` methods

With this approach, we would initialize the MemCachedObjectStore with a telemetry collector, then use that and a parent span ID passed in to the ObjectStore.*_opt methods (eg get_opt) using a Attribute::Metadata. This would require the following work:

Replace all relevant ObjectStore method usage through in the query call stack with the *_opt variants
- I'm guessing this would mostly be get_opt, get_range, and get_ranges methods
Update those methods on MemCachedObjectStore to actually try hitting the cache and to also construct a SpanContext using the MemCachedObjectStore's trace collector and a parent span ID obtained from Attribute::Metadata and build a span reporter from that.

I'll tentatively start with the second approach to see how feasible it is but keep an eye on this issue for feedback.

pauldix · 2025-02-14T14:32:48Z

Any chance we can update the object store trait to take an optional span? Seems like that's what should be happening. Tracking metrics seems less than ideal if we can actually get tracing on it.

waynr · 2025-02-14T17:45:32Z

Any chance we can update the object store trait to take an optional span?

Maybe if we made the changes purely additive (ie new methods on the trait with default no-op/error implementations to avoid forcing other implementers of the trait to make supporting changes), but even if we could get buy-in from maintainers (many of whom work at InfluxData as well, so that could happen) there is the problem that the tracing spans we are talking about are very bespoke to InfluxData projects: https://github.com/influxdata/influxdb3_core/tree/main/trace

I think this is a problem because the object_store crate where the trait is defined is relatively independent from any other InfluxData crates. We might have to do something with the trace, trace_http, and trace_exporters crates like relocate them to be maintained alongside the object store repo: https://github.com/apache/arrow-rs

I think the simplest approach is to use the *_opt methods like I described in my previous comment -- I'll try to get a PR up demonstrating that soon.

waynr · 2025-02-14T21:05:10Z

So I just realized it's likely that we're passing these Arc<dyn ObjectStore> instances around to other libraries that we don't directly control in this repo -- I see that happening when initializing the parquet_file::ParquetStorage that's passed to the iox_query::exec::Executor constructor. Because of this, I don't think it will be as simple as replacing calls to ObjectStore.get with ObjectStore.get_opts like I described.

Metrics might be the best thing we can get without a major refactor spanning several repos.

waynr · 2025-02-14T22:20:38Z

Okay I spent some time looking around IOx code to try to understand how spans like prune_partitions are recorded under the hood of a trait object -- in that case it's an impl of TableProvider. I was looking at this to try to understand whether that particular span has any analogs here in monolith and realized it is using a type system trick to store arbitrary data (like Spans) from the calling context.

Specifically, the more interesting TableProvider methods (like scan) take a Session trait object which provides access to a SessionConfig instance that have support for custom configuration. This custom configuration allows the calling context of TableProvider trait objects to pass in arbitrary 'static lifetime values that can be retrieve internally by the implementation.

A similar approach could work for the object_store::ObjectStore trait. We would have to do roughly the following:

introduce a new type, object_store::ObjectStoreSession that has a dynamically-typed map of type IDs to Arc<dyn Any + Send + Sync + 'static> that we would downcast to a concrete type during retrieval
introduce a new set of optional methods like get_with_session to the ObjectStore trait
- these methods would be optional by virtue of default implementations that ignore the session and delegate to the corresponding required methods
refactor code in https://github.com/influxdata/influxdb3_core to prefer the get_with_session wherever the calling context has a span that can be passed in
implement get_with_session in MemCachedObjectStore to record different spans for each branch of the code (ie retrieving from memory cache vs object store)

waynr · 2025-02-14T23:06:49Z

I think the simplest approach is to use the *_opt methods like I described in my previous comment -- I'll try to get a PR up demonstrating that soon.

By the way, I realized the *_opt approach wouldn't work in this case because https://docs.rs/object_store/latest/object_store/struct.GetOptions.html doesn't actually accept the same kind of arbitrary user metadata as https://docs.rs/object_store/latest/object_store/struct.PutOptions.html

MaduMitha-Ravi · 2025-03-03T21:24:21Z

@waynr Do we have an update on the Trace spans work?

waynr · 2025-03-03T23:03:07Z

@MaduMitha-Ravi I just merged this PR which adds tracing spans to the compactor: https://github.com/influxdata/influxdb_pro/pull/538

The object_store crate also has a new extensions feature that will let us add properly-parented spans to queries that access the object store -- this will let us discern whether a query involves object store accesses over the network or the in-memory parquet file cache is used instead.

waynr · 2025-03-03T23:03:53Z

To be clear with respect to the object_store work, I still need to update a datafusion PR to pass on extensions from the session config to the object store API calls that it makes (see apache/datafusion#14804)

hiltontj added the v3 label Jan 24, 2025

waynr self-assigned this Feb 12, 2025

waynr mentioned this issue Feb 13, 2025

feat: add tracing to query path #26006

Merged

waynr mentioned this issue Feb 14, 2025

Introduce ObjectStore methods that take Session data apache/arrow-rs#7135

Closed

Xuanwo mentioned this issue Feb 24, 2025

discuss: Introduce datafusion-storage as datafusion's own storage interface apache/datafusion#14854

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve use of tracing spans in query path #25911

Improve use of tracing spans in query path #25911

hiltontj commented Jan 24, 2025

MaduMitha-Ravi commented Jan 24, 2025

Uh oh!

hiltontj commented Jan 27, 2025

Uh oh!

MaduMitha-Ravi commented Jan 27, 2025 •

edited

Loading

Uh oh!

hiltontj commented Jan 27, 2025

Uh oh!

MaduMitha-Ravi commented Jan 28, 2025

Uh oh!

MaduMitha-Ravi commented Feb 10, 2025

Uh oh!

waynr commented Feb 14, 2025

Uh oh!

pauldix commented Feb 14, 2025

Uh oh!

waynr commented Feb 14, 2025

Uh oh!

waynr commented Feb 14, 2025

Uh oh!

waynr commented Feb 14, 2025 •

edited

Loading

Uh oh!

waynr commented Feb 14, 2025

Uh oh!

MaduMitha-Ravi commented Mar 3, 2025

Uh oh!

waynr commented Mar 3, 2025

Uh oh!

waynr commented Mar 3, 2025

Uh oh!

Improve use of tracing spans in query path #25911

Improve use of tracing spans in query path #25911

Comments

hiltontj commented Jan 24, 2025

Objectives of this issue

MaduMitha-Ravi commented Jan 24, 2025

Uh oh!

hiltontj commented Jan 27, 2025

Uh oh!

MaduMitha-Ravi commented Jan 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hiltontj commented Jan 27, 2025

Uh oh!

MaduMitha-Ravi commented Jan 28, 2025

Uh oh!

MaduMitha-Ravi commented Feb 10, 2025

Uh oh!

waynr commented Feb 14, 2025

Separate Span Hierarchy for cached ObjectStore implementation

Use object_store::Attribute passed to ObjectStore.*_opt methods

Uh oh!

pauldix commented Feb 14, 2025

Uh oh!

waynr commented Feb 14, 2025

Uh oh!

waynr commented Feb 14, 2025

Uh oh!

waynr commented Feb 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

waynr commented Feb 14, 2025

Uh oh!

MaduMitha-Ravi commented Mar 3, 2025

Uh oh!

waynr commented Mar 3, 2025

Uh oh!

waynr commented Mar 3, 2025

Uh oh!

MaduMitha-Ravi commented Jan 27, 2025 •

edited

Loading

Use `object_store::Attribute` passed to `ObjectStore.*_opt` methods

waynr commented Feb 14, 2025 •

edited

Loading