WC-2707 Only submit non-empty traces to trace workers #2946
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Currently, any invocation for a stage generates a trace, even for PipelineLogLevel::NONE. The only difference is that NONE does not attach logs, exceptions, or eventInfo to the trace.
This leads us to see traces with a null event for stages you opted to not get traces from, which we often see when there's a FPW in a pipeline that we did not intend to see.
There are quite a few places throughout this and edgeworker that note that we do this to "keep the request tree contiguous", but from my investigation and chats with other people, these traces are not used for anything other than delivery to trace/tail handlers, i.e. this isn't used to report metrics or billing. So there's no reason to filter these out before delivering traces.
This is of course not as efficient as it could be, as we are allocating for empty traces and then discarding them before submitting. A more comprehensive approach would be to not generate traces at all when the stage / subpipeline's log level is NONE, but that's probably beyond me currently (although I would welcome the help).
I do have a tests that are tweaked to reflect these changes in edgeworker, but there seem to be some quirks / nuances I don't understand there (for example, the tests pass normally but do not pass for process sandboxing) so I was recommended to get a PR up to workerd to begin the discussion here.