otel: investigate if we can decode assets only once per beat #42302

Open
mauri870 opened this issue Jan 13, 2025 · 1 comment
Labels
Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

@mauri870
Member

While looking into #42180 and #42207, my understanding is that asset decoding happens once per beat during initialization. That is fine when each beat runs as its own binary. With the beats receivers, however, an OTel collector configuration can define two or more receivers, which the collector starts in the same process via NewFactory(), so the same assets may be decoded once per receiver.

Looking at the asset decoding, I believe we can keep a map[beatName][]byte of decoded assets and return the same byte slice on subsequent calls for the same beat. This would improve startup times when there are multiple beats receivers in the same collector instance.

For example, changing the existing BenchmarkFactory from filebeatreceiver to create 3 receivers per iteration, and caching the decoded assets in DecodeData (keyed here by the encoded asset string), results in the following improvement:

diff --git a/libbeat/asset/registry.go b/libbeat/asset/registry.go
index fe34971c99..f40a4e919f 100644
--- a/libbeat/asset/registry.go
+++ b/libbeat/asset/registry.go
@@ -22,10 +22,14 @@ import (
        "compress/zlib"
        "encoding/base64"
        "sort"
+       "sync"

        "github.com/elastic/elastic-agent-libs/iobuf"
 )

+var assetCacheMu sync.Mutex
+var assetCache = map[string][]byte{}
+
 // FieldsRegistry contains a list of fields.yml files
 // As each entry is an array of bytes multiple fields.yml can be added under one path.
 // This can become useful as we don't have to generate anymore the fields.yml but can
@@ -107,6 +111,13 @@ func EncodeData(data string) (string, error) {

 // DecodeData base64 decodes the data and uncompresses it
 func DecodeData(data string) ([]byte, error) {
+       assetCacheMu.Lock()
+       defer assetCacheMu.Unlock()
+
+       if cached, ok := assetCache[data]; ok {
+               return cached, nil
+       }
+
        decoded, err := base64.StdEncoding.DecodeString(data)
        if err != nil {
                return nil, err
@@ -119,5 +130,11 @@ func DecodeData(data string) ([]byte, error) {
        }
        defer r.Close()

-       return iobuf.ReadAll(r)
+       out, err := iobuf.ReadAll(r)
+       if err != nil {
+               return nil, err
+       }
+
+       assetCache[data] = out
+       return out, nil
 }
diff --git a/x-pack/filebeat/fbreceiver/receiver_test.go b/x-pack/filebeat/fbreceiver/receiver_test.go
index 7da5c24f0a..235656bf5d 100644
--- a/x-pack/filebeat/fbreceiver/receiver_test.go
+++ b/x-pack/filebeat/fbreceiver/receiver_test.go
@@ -130,7 +130,9 @@ func BenchmarkFactory(b *testing.B) {

        b.ResetTimer()
        for i := 0; i < b.N; i++ {
-               _, err := NewFactory().CreateLogs(context.Background(), receiverSettings, cfg, nil)
-               require.NoError(b, err)
+               for range 3 {
+                       _, err := NewFactory().CreateLogs(context.Background(), receiverSettings, cfg, nil)
+                       require.NoError(b, err)
+               }
        }
 }
goos: linux
goarch: amd64
pkg: github.com/elastic/beats/v7/x-pack/filebeat/fbreceiver
cpu: Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz
           │   old.txt    │               new.txt               │
           │    sec/op    │   sec/op     vs base                │
Factory-32   28.867m ± 2%   8.742m ± 4%  -69.72% (p=0.000 n=10)

           │   old.txt    │               new.txt                │
           │     B/op     │     B/op      vs base                │
Factory-32   32.17Mi ± 0%   14.76Mi ± 0%  -54.13% (p=0.000 n=10)

           │   old.txt   │               new.txt               │
           │  allocs/op  │  allocs/op   vs base                │
Factory-32   15.29k ± 0%   11.61k ± 0%  -24.09% (p=0.000 n=10)
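
The caching idea from the diff above can be tried in isolation. What follows is a minimal sketch, not the libbeat implementation: `encodeData`, `decodeDataCached` and `decodeCalls` are hypothetical names, and `io.ReadAll` stands in for `iobuf.ReadAll` so the example only needs the standard library. It also returns a copy of the cached bytes, on the assumption that some caller might mutate the returned slice; if no caller does, returning the cached slice directly (as the diff does) avoids the extra allocation.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"encoding/base64"
	"fmt"
	"io"
	"sync"
)

// Hypothetical package-level cache, keyed by the encoded asset string.
var (
	assetCacheMu sync.Mutex
	assetCache   = map[string][]byte{}
	decodeCalls  int // counts real decodes; for illustration only
)

// encodeData zlib-compresses s and base64-encodes the result.
// It is a stand-in for the real encoder, used to produce test input.
func encodeData(s string) (string, error) {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	if _, err := w.Write([]byte(s)); err != nil {
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

// decodeDataCached base64-decodes and decompresses data once, then serves
// later calls for the same input from the cache. It returns a copy so that
// callers cannot modify the cached bytes.
func decodeDataCached(data string) ([]byte, error) {
	assetCacheMu.Lock()
	defer assetCacheMu.Unlock()

	if cached, ok := assetCache[data]; ok {
		return bytes.Clone(cached), nil
	}

	raw, err := base64.StdEncoding.DecodeString(data)
	if err != nil {
		return nil, err
	}
	r, err := zlib.NewReader(bytes.NewReader(raw))
	if err != nil {
		return nil, err
	}
	defer r.Close()

	out, err := io.ReadAll(r)
	if err != nil {
		return nil, err
	}

	decodeCalls++
	assetCache[data] = out
	return bytes.Clone(out), nil
}

func main() {
	enc, err := encodeData("- key: example\n")
	if err != nil {
		panic(err)
	}
	// Simulate three receivers started in the same process.
	for i := 0; i < 3; i++ {
		if _, err := decodeDataCached(enc); err != nil {
			panic(err)
		}
	}
	fmt.Println("real decodes:", decodeCalls) // prints "real decodes: 1"
}
```

Holding the mutex across the decode keeps the sketch simple and guarantees a single decode per asset, at the cost of serializing concurrent first-time decodes of different assets. Keying by beat name, as suggested above, would only change the map key.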
@mauri870 mauri870 added the Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team label Jan 13, 2025
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
