Replace container-stats with golang client #2188

erthalion · 2025-06-20T14:30:48Z

Description

Currently we use a separate container to capture performance stats of the
Collector container. This container mounts the docker/podman socket and uses
the corresponding CLI to get the data. In some cases (e.g. SUSE) we can't mount
the socket, so this approach doesn't work.

Use CRI/Docker API clients to get container stats information instead. This is
a more direct approach, which allows us to trim the number of containers and
the overall test churn. Stats capturing is implemented as a cancelable
background goroutine, which is started and stopped together with the Collector
container.

One side effect of this approach is different reporting units. Previously we
reported data in human-readable form (megabytes or CPU utilization
percentage); now it is going to be more Prometheus-style: bytes and millicores
used. Since the idea is to compare stats across different commits, I don't
think it's worth the effort to convert the numbers to keep the original format.

Note that a bump of the docker Go package is needed to use some of this functionality.

On top of that, fix a small anomaly: the ProcfsScraper test was using the
executor object directly instead of going through a getter.

Checklist

Investigated and inspected CI test results
~~- [ ] Updated documentation accordingly~~

Automated testing
~~- [ ] Added unit tests~~
~~- [ ] Added integration tests~~
~~- [ ] Added regression tests~~

CI is sufficient.

Testing Performed

Running TestCollectorStartup and verifying stats.

codecov-commenter · 2025-06-20T14:39:49Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 28.83%. Comparing base (614e6c5) to head (6d7e570).
Report is 3 commits behind head on master.

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #2188   +/-   ##
=======================================
  Coverage   28.83%   28.83%           
=======================================
  Files          96       96           
  Lines        5799     5799           
  Branches     2551     2551           
=======================================
  Hits         1672     1672           
  Misses       3408     3408           
  Partials      719      719

Flag	Coverage Δ
collector-unit-tests	`28.83% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.


Use CRI/Docker API clients to get container stats information, instead of spinning up a new container (which can fail to mount the docker socket and thus provide no information).

erthalion · 2025-06-24T07:26:10Z

/test

sourcery-ai

Hey @erthalion - I've reviewed your changes and they look great!


Molter73

Left a few comments. I'm a bit worried about the synchronization aspect of the array of stats, though I might be worrying too much.

Molter73 · 2025-06-24T13:20:27Z

integration-tests/pkg/executor/executor_cri.go

+	var stats *pb.ContainerStats
+	ctx := context.Background()
+
+	stats, err := c.runtimeService.ContainerStats(ctx, containerID)


[nit] This can be a little less verbose

Suggested change

-	var stats *pb.ContainerStats
-	ctx := context.Background()
-
-	stats, err := c.runtimeService.ContainerStats(ctx, containerID)
+	stats, err := c.runtimeService.ContainerStats(context.Background(), containerID)

Molter73 · 2025-06-24T13:26:14Z

integration-tests/pkg/executor/executor_docker_api.go

+	var stats container.StatsResponse
+	ctx := context.Background()
+
+	statsResp, err := d.client.ContainerStatsOneShot(ctx, containerID)


[nit] Similar to the last comment

Suggested change

-	var stats container.StatsResponse
-	ctx := context.Background()
-
-	statsResp, err := d.client.ContainerStatsOneShot(ctx, containerID)
+	var stats container.StatsResponse
+
+	statsResp, err := d.client.ContainerStatsOneShot(context.Background(), containerID)

Molter73 · 2025-06-24T13:26:55Z

integration-tests/pkg/executor/executor_docker_api.go

+func (d *dockerAPIExecutor) GetContainerStats(containerID string) (
+	*ContainerStat, error) {
+
+	var stats container.StatsResponse


[nit] We can probably move this line closer to the first use of stats on line 107.

Molter73 · 2025-06-24T13:31:45Z

integration-tests/pkg/executor/executor_docker_api.go

+	var stats container.StatsResponse
+	ctx := context.Background()
+
+	statsResp, err := d.client.ContainerStatsOneShot(ctx, containerID)


Does ContainerStatsOneShot return a snapshot of the usage at the time of the call, or does it do some sort of aggregation over time? Is the behavior consistent with the CRI counterpart?

I tried to figure it out from reading the docs but couldn't find an answer. I can dig a little more later.

The documentation is sparse, but I guess it's the same one-shot as in the API, which just eliminates waiting and returns empty per_cpu:
https://docs.docker.com/reference/api/engine/version/v1.45/#tag/Container/operation/ContainerStats

No aggregation is happening on the client either:
https://github.com/moby/moby/blob/master/client/container_stats.go

Molter73 · 2025-06-24T13:32:45Z

integration-tests/pkg/executor/executor_docker_api.go

+	if err := decoder.Decode(&stats); err == io.EOF {
+		return nil, nil


Is this the case where the response was empty when we queried for the stats? Do we know what could cause this? Should we return some sort of error?

It will bubble up as a "no stats" warning later. One possible cause is an empty response body, e.g. if the stats API was called before the Collector container was started.

Molter73 · 2025-06-24T13:35:50Z

integration-tests/suites/base.go

-	Mem       string
-	Cpu       float64
+	stats       []executor.ContainerStat
+	statsCtx    context.Context


Should we use this same context for the queries to the CRI and docker APIs? That way we can ensure everything is cancelled together.

It would require us to pass it down to the executor for that to work though.

Molter73 · 2025-06-24T13:39:13Z

integration-tests/suites/base.go

 }

 // StopCollector will tear down the collector container and stop
 // the MockSensor if it was started.
 func (s *IntegrationTestSuiteBase) StopCollector() {
+	// Stop stats collection first
+	s.statsCancel()


Calling the cancellation function does not mean the goroutine has successfully stopped. We might need an additional synchronization mechanism if we want to ensure the stats goroutine stops before Collector does, to prevent gathering bogus data on the last measurement.

The other thing that worries me is the concurrency between new stat entries being appended to the array and the array being read when the tests are done.

> Calling the cancellation function does not mean the goroutine has successfully stopped

Why not? The goroutine checks the context's Done channel:

case <-s.statsCtx.Done(): return

If the goroutine for whatever reason still continues to run, it will get an empty response from the API and will not add anything to the stats array (at least that's what happened when I caused this by mistake during development).

Molter73 · 2025-06-24T13:43:20Z

integration-tests/suites/base.go

@@ -187,33 +187,23 @@ func (s *IntegrationTestSuiteBase) RegisterCleanup(containers ...string) {
 	})
 }

-func (s *IntegrationTestSuiteBase) GetContainerStats() []ContainerStat {
+func (s *IntegrationTestSuiteBase) GetContainerStats() {


We might want to rename this method, since its behavior seems to have changed significantly. For one, it's called Get... but it doesn't return anything, and it now seems to capture stats in an array rather than doing a single read like it used to. Maybe something like SnapshotContainerStats would be more adequate?

Replace container-stats with golang client

0518786

Use CRI/Docker API clients to get container stats information, instead of spinning up a new container (which can fail to mount the docker socket and thus provide no information).

erthalion force-pushed the feature/container-stats-via-api branch from 4dfd831 to 0518786 Compare June 23, 2025 12:23

No need to access executor directly

6d7e570

erthalion marked this pull request as ready for review June 24, 2025 08:28

erthalion requested a review from a team as a code owner June 24, 2025 08:28

sourcery-ai bot reviewed Jun 24, 2025


Molter73 reviewed Jun 24, 2025

