Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce allocations in LineToEvents by returning a single event instead of an slice #577

Open
pedro-stanaka opened this issue Sep 12, 2024 · 1 comment
Labels
enhancement library Issues pertaining to the use of the packages as a library more than the exporter itself

Comments

@pedro-stanaka
Copy link
Contributor

pedro-stanaka commented Sep 12, 2024

Description

The current implementation of the LineToEvents method in pkg/line/line.go returns multiple events, which results in a lot of allocations in the statsd-exporter. We propose changing this method to return a single event instead.
On the same refactoring step, we should introduce a "exploder" or "sampling" layer (or helper method on the event) which will keep the same behavior as we have now.
This is a breaking change on purpose which will force client projects using this repository as a library, to adapt to the new method signature.

Here is an example profile with samples using sampling rate, where you can see it takes more than 21% of total memory used:
image

Goals

  1. Reduce Allocations: By returning a single event, we aim to reduce the number of allocations and improve performance.
  2. Enforce Build Breaks: Changing the method signature will force dependent projects to update their code, ensuring compatibility with the new implementation.
  3. Introduce New Layer: Push the decision about what to do with Count or SamplingRate to another layer of the exporter.

Proposed Solution

  1. Change Method Signature: Modify the LineToEvents method to return a single event.
  2. Update Event Structure: Include Count or SamplingRate in the event structure.
  3. Introduce New Layer: Implement a new layer in the exporter to handle the decision-making process for Count or SamplingRate.

Repro script

	dogClient, err := statsd.New(
		opts.statsdServer,
		statsd.WithoutTelemetry(),
		statsd.WithNamespace("flood_statsd"),
		statsd.WithTags([]string{"pod_name:" + os.Getenv("POD_NAME")}),
		statsd.WithoutClientSideAggregation(),
	)
	if err != nil {
		level.Error(logger).Log("msg", "Error creating dogClient client", "err", err)
		os.Exit(1)
	}

	samplesSent := 0

	start := time.Now()
	for {
		randomInt := rand.Int63n(opts.cardinalityLimit)
		err := dogClient.Count("sample_counter", 1, []string{"mybad_label:co_" + strconv.FormatInt(randomInt, 10)}, 0.01)
		if err != nil {
			level.Error(logger).Log("msg", "Error sending metric", "err", err)
		}
		_ = dogClient.Distribution("synthetic", float64(randomInt), []string{"mybad_label:co" + strconv.FormatInt(randomInt, 10)}, 0.001)
		_ = dogClient.Timing(
			"some_timing",
			time.Duration(rand.Intn(3000))*time.Millisecond,
			[]string{"cardinality:" + strconv.FormatInt(randomInt, 10)},
			0.01,
		)
		_ = dogClient.Gauge("some_gauge", float64(randomInt), []string{"mybad_label:co" + strconv.FormatInt(randomInt, 10)}, 0.01)

		_ = dogClient.Distribution("heavily_sampled_distribution", float64(randomInt), []string{"mybad_label:co" + strconv.FormatInt(randomInt, 10)}, 0.001)

		// control the rate of the synthetic metrics
		samplesSent += 4
		if samplesSent >= int(opts.samplesPerSec) {
			elapsed := time.Since(start)
			if elapsed < time.Second {
				level.Info(logger).Log("msg", "Sleeping for", "duration", time.Second-elapsed)
				time.Sleep(time.Second - elapsed)
			}
			level.Info(logger).Log("msg", "Sending", "samples", samplesSent, "elapsed", elapsed)
			samplesSent = 0
			start = time.Now()
		}
	}
@matthiasr matthiasr added enhancement library Issues pertaining to the use of the packages as a library more than the exporter itself labels Sep 15, 2024
@matthiasr
Copy link
Contributor

Thank you! This topic also intersects with #558, where we unpack dogstatsd extended aggregations into individual events. Instead, it might be better to keep the values as a list, without duplicating the event metadata around them.

I want this to be an explicitly breaking change, because otherwise adding a sampling rate field would be a silently breaking change, where downstream processing of the events doesn't obviously break, but produces incorrect results by not taking the sampling rate into account

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement library Issues pertaining to the use of the packages as a library more than the exporter itself
Projects
None yet
Development

No branches or pull requests

2 participants