Add support for podman metrics in docker module (elastic#41889)
* Add support for podman metrics
MichaelKatsoulis authored and michalpristas committed Dec 13, 2024
1 parent 69546ec commit ac2c179
Showing 14 changed files with 76 additions and 16 deletions.
9 changes: 9 additions & 0 deletions metricbeat/docs/modules/docker.asciidoc
@@ -22,6 +22,9 @@ The Docker module is currently tested on Linux and Mac with the community
 edition engine, versions 1.11 and 17.09.0-ce. It is not tested on Windows,
 but it should also work there.

+The Docker module supports collection of metrics from Podman's Docker-compatible API.
+It has been tested on Linux and Mac with Podman Rest API v2.0.0 and above.
+
 [float]
 === Module-specific configuration notes

@@ -30,6 +33,9 @@ It is strongly recommended that you run Docker metricsets with a
 Docker API already takes up to 2 seconds. Specifying less than 3 seconds will
 result in requests that timeout, and no data will be reported for those
 requests.
+In the case of Podman, the configuration parameter `podman` should be set to `true`.
+This enables streaming of container stats output, which allows for more accurate
+CPU percentage calculations when using Podman.


 :edit_url:
@@ -62,6 +68,9 @@ metricbeat.modules:
 # If set to true, replace dots in labels with `_`.
 #labels.dedot: false

+# Docker module supports metrics collection from podman's docker compatible API. In case of podman set to true.
+# podman: false
+
 # Skip metrics for certain device major numbers in docker/diskio.
 # Necessary on systems with software RAID, device mappers,
 # or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions metricbeat/metricbeat.reference.yml
@@ -268,6 +268,9 @@ metricbeat.modules:
 # If set to true, replace dots in labels with `_`.
 #labels.dedot: false

+# Docker module supports metrics collection from podman's docker compatible API. In case of podman set to true.
+# podman: false
+
 # Skip metrics for certain device major numbers in docker/diskio.
 # Necessary on systems with software RAID, device mappers,
 # or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions metricbeat/module/docker/_meta/config.reference.yml
@@ -17,6 +17,9 @@
 # If set to true, replace dots in labels with `_`.
 #labels.dedot: false

+# Docker module supports metrics collection from podman's docker compatible API. In case of podman set to true.
+# podman: false
+
 # Skip metrics for certain device major numbers in docker/diskio.
 # Necessary on systems with software RAID, device mappers,
 # or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions metricbeat/module/docker/_meta/config.yml
@@ -15,6 +15,9 @@
 # If set to true, replace dots in labels with `_`.
 #labels.dedot: false

+# Docker module supports metrics collection from podman's Docker-compatible API. In case of podman set to true.
+# podman: false
+
 # Skip metrics for certain device major numbers in docker/diskio.
 # Necessary on systems with software RAID, device mappers,
 # or other configurations where virtual disks will sum metrics from other disks.
6 changes: 6 additions & 0 deletions metricbeat/module/docker/_meta/docs.asciidoc
@@ -11,6 +11,9 @@ The Docker module is currently tested on Linux and Mac with the community
 edition engine, versions 1.11 and 17.09.0-ce. It is not tested on Windows,
 but it should also work there.

+The Docker module supports collection of metrics from Podman's Docker-compatible API.
+It has been tested on Linux and Mac with Podman Rest API v2.0.0 and above.
+
 [float]
 === Module-specific configuration notes

@@ -19,3 +22,6 @@ It is strongly recommended that you run Docker metricsets with a
 Docker API already takes up to 2 seconds. Specifying less than 3 seconds will
 result in requests that timeout, and no data will be reported for those
 requests.
+In the case of Podman, the configuration parameter `podman` should be set to `true`.
+This enables streaming of container stats output, which allows for more accurate
+CPU percentage calculations when using Podman.
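
Reviewer note: the documentation above ties the `podman` option to CPU percentage accuracy. The sketch below illustrates why a previous sample matters, using a delta-based formula similar to the one the `docker stats` CLI applies. It is an assumption-laden illustration, not the module's own calculation, which this diff does not show. The names `cpuSample` and `cpuPercent` and the numbers are hypothetical.

```go
package main

import "fmt"

// cpuSample is a hypothetical, cut-down view of one stats reading.
type cpuSample struct {
	ContainerUsage uint64 // cumulative CPU time consumed by the container
	SystemUsage    uint64 // cumulative CPU time consumed by the whole host
}

// cpuPercent compares two readings and returns the container's share of
// host CPU time over that interval, scaled by the number of online CPUs.
func cpuPercent(prev, cur cpuSample, onlineCPUs int) float64 {
	if cur.ContainerUsage < prev.ContainerUsage || cur.SystemUsage <= prev.SystemUsage {
		return 0 // counters went backwards or no time elapsed
	}
	cpuDelta := float64(cur.ContainerUsage - prev.ContainerUsage)
	sysDelta := float64(cur.SystemUsage - prev.SystemUsage)
	return cpuDelta / sysDelta * float64(onlineCPUs) * 100
}

func main() {
	prev := cpuSample{ContainerUsage: 100, SystemUsage: 1000}
	cur := cpuSample{ContainerUsage: 150, SystemUsage: 1200}

	// With a real previous sample the result reflects the last interval.
	fmt.Printf("with previous sample: %.1f%%\n", cpuPercent(prev, cur, 2))
	// With an empty previous sample the result is an average over the
	// counters' whole lifetime, which can differ a lot from current load.
	fmt.Printf("without previous sample: %.1f%%\n", cpuPercent(cpuSample{}, cur, 2))
}
```

In this made-up case the two calls report 50% and 25% for the same current reading, so a missing previous sample changes the result considerably.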
8 changes: 5 additions & 3 deletions metricbeat/module/docker/config.go
@@ -19,14 +19,16 @@ package docker

 // Config contains the config needed for the docker
 type Config struct {
-	TLS   *TLSConfig `config:"ssl"`
-	DeDot bool       `config:"labels.dedot"`
+	TLS    *TLSConfig `config:"ssl"`
+	DeDot  bool       `config:"labels.dedot"`
+	Podman bool       `config:"podman"`
 }

 // DefaultConfig returns default module config
 func DefaultConfig() Config {
 	return Config{
-		DeDot: true,
+		DeDot:  true,
+		Podman: false,
 	}
 }

4 changes: 3 additions & 1 deletion metricbeat/module/docker/cpu/cpu.go
@@ -40,6 +40,7 @@ type MetricSet struct {
 	cpuService   *CPUService
 	dockerClient *client.Client
 	dedot        bool
+	podman       bool
 }

 // New creates a new instance of the docker cpu MetricSet.
@@ -68,12 +69,13 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
 		dockerClient: client,
 		cpuService:   &CPUService{Cores: cpuConfig.Cores},
 		dedot:        config.DeDot,
+		podman:       config.Podman,
 	}, nil
 }

 // Fetch returns a list of docker CPU stats.
 func (m *MetricSet) Fetch(r mb.ReporterV2) error {
-	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
+	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, m.podman, m.Logger())
 	if err != nil {
 		return fmt.Errorf("failed to get docker stats: %w", err)
 	}
2 changes: 1 addition & 1 deletion metricbeat/module/docker/diskio/diskio.go
@@ -89,7 +89,7 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {

 // Fetch creates list of events with diskio stats for all containers.
 func (m *MetricSet) Fetch(r mb.ReporterV2) error {
-	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
+	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, false, m.Logger())
 	if err != nil {
 		return fmt.Errorf("failed to get docker stats: %w", err)
 	}
40 changes: 32 additions & 8 deletions metricbeat/module/docker/docker.go
@@ -34,6 +34,7 @@ import (
 	"github.com/elastic/beats/v7/metricbeat/mb"
 	"github.com/elastic/beats/v7/metricbeat/mb/parse"
 	"github.com/elastic/elastic-agent-autodiscover/docker"
+	"github.com/elastic/elastic-agent-libs/logp"
 )

 // HostParser is a TCP host parser function for docker tcp host addresses
@@ -91,7 +92,7 @@ func NewDockerClient(endpoint string, config Config) (*client.Client, error) {
 }

 // FetchStats returns a list of running containers with all related stats inside
-func FetchStats(client *client.Client, timeout time.Duration) ([]Stat, error) {
+func FetchStats(client *client.Client, timeout time.Duration, stream bool, logger *logp.Logger) ([]Stat, error) {
 	ctx, cancel := context.WithTimeout(context.Background(), timeout)
 	defer cancel()
 	containers, err := client.ContainerList(ctx, container.ListOptions{})
@@ -108,7 +109,7 @@ func FetchStats(client *client.Client, timeout time.Duration) ([]Stat, error) {
 	for _, container := range containers {
 		go func(container types.Container) {
 			defer wg.Done()
-			statsQueue <- exportContainerStats(ctx, client, &container)
+			statsQueue <- exportContainerStats(ctx, client, &container, stream, logger)
 		}(container)
 	}

@@ -133,18 +134,41 @@ func FetchStats(client *client.Client, timeout time.Duration) ([]Stat, error) {
 // This is currently very inefficient as docker calculates the average for each request,
 // means each request will take at least 2s: https://github.com/docker/docker/blob/master/cli/command/container/stats_helpers.go#L148
 // Getting all stats at once is implemented here: https://github.com/docker/docker/pull/25361
-func exportContainerStats(ctx context.Context, client *client.Client, container *types.Container) Stat {
+// In case stream is true, we use get a stream of results for container stats. From the stream we keep the second result.
+// This is needed for podman use case where in case stream is false, no precpu stats are returned. The precpu stats
+// are required for the cpu percentage calculation. We keep the second result as in the first result, the stats are not correct.
+func exportContainerStats(ctx context.Context, client *client.Client, container *types.Container, stream bool, logger *logp.Logger) Stat {
 	var event Stat
 	event.Container = container

-	containerStats, err := client.ContainerStats(ctx, container.ID, false)
+	containerStats, err := client.ContainerStats(ctx, container.ID, stream)
 	if err != nil {
+		logger.Debugf("Failed fetching container stats: %v", err)
 		return event
 	}
-
 	defer containerStats.Body.Close()
-	decoder := json.NewDecoder(containerStats.Body)
-	decoder.Decode(&event.Stats)
-
+	// JSON decoder
+	decoder := json.NewDecoder(containerStats.Body)
+	if !stream {
+		if err := decoder.Decode(&event.Stats); err != nil {
+			logger.Debugf("Failed decoding event: %v", err)
+			return event
+		}
+	} else {
+		// handle stream. Take the second result.
+		count := 0
+		for decoder.More() {
+			if err := decoder.Decode(&event.Stats); err != nil {
+				logger.Debugf("Failed decoding event: %v", err)
+				return event
+			}
+
+			count++
+			// Exit after the second result
+			if count == 2 {
+				break
+			}
+		}
+	}
 	return event
 }
4 changes: 3 additions & 1 deletion metricbeat/module/docker/memory/memory.go
@@ -43,6 +43,7 @@ type MetricSet struct {
 	memoryService *MemoryService
 	dockerClient  *client.Client
 	dedot         bool
+	podman        bool
 	logger        *logp.Logger
 }

@@ -64,13 +65,14 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
 		memoryService: &MemoryService{},
 		dockerClient:  dockerClient,
 		dedot:         config.DeDot,
+		podman:        config.Podman,
 		logger:        logger,
 	}, nil
 }

 // Fetch creates a list of memory events for each container.
 func (m *MetricSet) Fetch(r mb.ReporterV2) error {
-	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
+	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, m.podman, m.Logger())
 	if err != nil {
 		return fmt.Errorf("failed to get docker stats: %w", err)
 	}
2 changes: 1 addition & 1 deletion metricbeat/module/docker/network/network.go
@@ -66,7 +66,7 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {

 // Fetch methods creates a list of network events for each container.
 func (m *MetricSet) Fetch(r mb.ReporterV2) error {
-	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
+	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, false, m.Logger())
 	if err != nil {
 		return fmt.Errorf("failed to get docker stats: %w", err)
 	}
@@ -84,7 +84,7 @@ func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
 // of an error set the Error field of mb.Event or simply call report.Error().
 func (m *MetricSet) Fetch(ctx context.Context, report mb.ReporterV2) error {

-	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout)
+	stats, err := docker.FetchStats(m.dockerClient, m.Module().Config().Timeout, false, m.Logger())
 	if err != nil {
 		return fmt.Errorf("failed to get docker stats: %w", err)
 	}
3 changes: 3 additions & 0 deletions metricbeat/modules.d/docker.yml.disabled
@@ -18,6 +18,9 @@
 # If set to true, replace dots in labels with `_`.
 #labels.dedot: false

+# Docker module supports metrics collection from podman's Docker-compatible API. In case of podman set to true.
+# podman: false
+
 # Skip metrics for certain device major numbers in docker/diskio.
 # Necessary on systems with software RAID, device mappers,
 # or other configurations where virtual disks will sum metrics from other disks.
3 changes: 3 additions & 0 deletions x-pack/metricbeat/metricbeat.reference.yml
@@ -521,6 +521,9 @@ metricbeat.modules:
 # If set to true, replace dots in labels with `_`.
 #labels.dedot: false

+# Docker module supports metrics collection from podman's docker compatible API. In case of podman set to true.
+# podman: false
+
 # Skip metrics for certain device major numbers in docker/diskio.
 # Necessary on systems with software RAID, device mappers,
 # or other configurations where virtual disks will sum metrics from other disks.
