[breaking] Reduce allocations in `LineToEvents` by returning a single event instead of an slice #577

pedro-stanaka · 2024-09-12T20:49:47Z

Description

The current implementation of the LineToEvents method in pkg/line/line.go returns multiple events, which results in a lot of allocations in the statsd-exporter. We propose changing this method to return a single event instead.
On the same refactoring step, we should introduce a "exploder" or "sampling" layer (or helper method on the event) which will keep the same behavior as we have now.
This is a breaking change on purpose which will force client projects using this repository as a library, to adapt to the new method signature.

Here is an example profile with samples using sampling rate, where you can see it takes more than 21% of total memory used:

Goals

Reduce Allocations: By returning a single event, we aim to reduce the number of allocations and improve performance.
Enforce Build Breaks: Changing the method signature will force dependent projects to update their code, ensuring compatibility with the new implementation.
Introduce New Layer: Push the decision about what to do with Count or SamplingRate to another layer of the exporter.

Proposed Solution

Change Method Signature: Modify the LineToEvents method to return a single event.
Update Event Structure: Include Count or SamplingRate in the event structure.
Introduce New Layer: Implement a new layer in the exporter to handle the decision-making process for Count or SamplingRate.

Repro script

	dogClient, err := statsd.New(
		opts.statsdServer,
		statsd.WithoutTelemetry(),
		statsd.WithNamespace("flood_statsd"),
		statsd.WithTags([]string{"pod_name:" + os.Getenv("POD_NAME")}),
		statsd.WithoutClientSideAggregation(),
	)
	if err != nil {
		level.Error(logger).Log("msg", "Error creating dogClient client", "err", err)
		os.Exit(1)
	}

	samplesSent := 0

	start := time.Now()
	for {
		randomInt := rand.Int63n(opts.cardinalityLimit)
		err := dogClient.Count("sample_counter", 1, []string{"mybad_label:co_" + strconv.FormatInt(randomInt, 10)}, 0.01)
		if err != nil {
			level.Error(logger).Log("msg", "Error sending metric", "err", err)
		}
		_ = dogClient.Distribution("synthetic", float64(randomInt), []string{"mybad_label:co" + strconv.FormatInt(randomInt, 10)}, 0.001)
		_ = dogClient.Timing(
			"some_timing",
			time.Duration(rand.Intn(3000))*time.Millisecond,
			[]string{"cardinality:" + strconv.FormatInt(randomInt, 10)},
			0.01,
		)
		_ = dogClient.Gauge("some_gauge", float64(randomInt), []string{"mybad_label:co" + strconv.FormatInt(randomInt, 10)}, 0.01)

		_ = dogClient.Distribution("heavily_sampled_distribution", float64(randomInt), []string{"mybad_label:co" + strconv.FormatInt(randomInt, 10)}, 0.001)

		// control the rate of the synthetic metrics
		samplesSent += 4
		if samplesSent >= int(opts.samplesPerSec) {
			elapsed := time.Since(start)
			if elapsed < time.Second {
				level.Info(logger).Log("msg", "Sleeping for", "duration", time.Second-elapsed)
				time.Sleep(time.Second - elapsed)
			}
			level.Info(logger).Log("msg", "Sending", "samples", samplesSent, "elapsed", elapsed)
			samplesSent = 0
			start = time.Now()
		}
	}

The text was updated successfully, but these errors were encountered:

matthiasr · 2024-09-15T09:12:05Z

Thank you! This topic also intersects with #558, where we unpack dogstatsd extended aggregations into individual events. Instead, it might be better to keep the values as a list, without duplicating the event metadata around them.

I want this to be an explicitly breaking change, because otherwise adding a sampling rate field would be a silently breaking change, where downstream processing of the events doesn't obviously break, but produces incorrect results by not taking the sampling rate into account

…lementation This PR introduces the first step in refactoring the event handling system to better support multiple values in a single event, which will help reduce allocations when processing events. This is part of a larger effort to improve performance and reduce memory allocations in the statsd exporter. Changes: - Add new `MultiValueEvent` interface that supports multiple values per event - Add `MultiObserverEvent` implementation for handling multiple observations - Add `ExplodableEvent` interface for backward compatibility - Add `Values()` method to existing event types - Add comprehensive tests for new interfaces and implementations This change is the foundation for future improvements that will: 1. Move explosion logic to a dedicated package 2. Update the line parser to use multi-value events 3. Modify the exporter to handle multi-value events directly 4. Eventually remove the need for event explosion The changes in this PR are backward compatible and don't affect existing functionality. Relates to #577 Signed-off-by: Pedro Tanaka <[email protected]>

matthiasr added enhancement library Issues pertaining to the use of the packages as a library more than the exporter itself labels Sep 15, 2024

pedro-stanaka changed the title ~~Reduce allocations in LineToEvents by returning a single event instead of an slice~~ [breaking] Reduce allocations in LineToEvents by returning a single event instead of an slice Nov 11, 2024

pedro-stanaka mentioned this issue Dec 17, 2024

feat(event): Add MultiValueEvent interface and MultiObserverEvent implementation #602

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[breaking] Reduce allocations in `LineToEvents` by returning a single event instead of an slice #577

[breaking] Reduce allocations in `LineToEvents` by returning a single event instead of an slice #577

pedro-stanaka commented Sep 12, 2024 •

edited

Loading

matthiasr commented Sep 15, 2024

[breaking] Reduce allocations in LineToEvents by returning a single event instead of an slice #577

[breaking] Reduce allocations in LineToEvents by returning a single event instead of an slice #577

Comments

pedro-stanaka commented Sep 12, 2024 • edited Loading

Description

Goals

Proposed Solution

Repro script

matthiasr commented Sep 15, 2024

[breaking] Reduce allocations in `LineToEvents` by returning a single event instead of an slice #577

[breaking] Reduce allocations in `LineToEvents` by returning a single event instead of an slice #577

pedro-stanaka commented Sep 12, 2024 •

edited

Loading