[exporter/elasticsearch] Limit bulk request size to avoid 413 Entity Too Large #36396
Also benchmarked `flush::bytes` at ~5MB, ~1MB, and 95MB against a 64GB Elasticsearch deployment on Elastic Cloud, with num_workers=1.
[Results for the 5MB, 1MB, and 95MB runs are missing from this copy.]
Do we really need to change the default max size items? Other changes LGTM!
LGTM!
LGTM, thanks!
lgtm
		},
	},
	Flush: FlushSettings{
		Bytes: 5e+6,
nit: maybe consider moving these defaults to some constant vars at some point. (not for this PR)
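As a rough illustration of the nit above, the default could be pulled into a named constant along these lines. This is only a sketch: `defaultFlushBytes`, `defaultFlushSettings`, and the one-field `FlushSettings` struct are hypothetical stand-ins written for this example, not the exporter's actual definitions.

```go
package main

import "fmt"

// defaultFlushBytes is a hypothetical constant name. The value mirrors the
// `Bytes: 5e+6` default shown in the diff snippet.
const defaultFlushBytes = 5_000_000

// FlushSettings is a simplified stand-in for the exporter's config struct,
// reduced to the single field discussed in this thread.
type FlushSettings struct {
	Bytes int
}

// defaultFlushSettings returns the defaults built from named constants
// instead of inline literals.
func defaultFlushSettings() FlushSettings {
	return FlushSettings{Bytes: defaultFlushBytes}
}

func main() {
	fmt.Println(defaultFlushSettings().Bytes)
}
```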
Co-authored-by: Christos Markou <[email protected]>
This code was initially added in elastic#41523 because of a limitation from the elasticsearch exporter. The elasticsearch exporter has been updated to enforce flush::max_bytes for the batcher extension and will automatically split the batch if it exceeds the limit. This error is now fixed in the collector v0.115.0. See open-telemetry/opentelemetry-collector-contrib#36396.
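The batch splitting mentioned in this commit message can be pictured with a small sketch. It assumes a simple greedy strategy and uses a hypothetical helper, `splitByBytes`, written for illustration; it is not the exporter's actual code.

```go
package main

import "fmt"

// splitByBytes groups documents into batches whose total size stays at or
// below maxBytes. A single document larger than maxBytes is placed in a batch
// of its own, because it cannot be split further.
func splitByBytes(docs [][]byte, maxBytes int) [][][]byte {
	var batches [][][]byte
	var current [][]byte
	size := 0
	for _, doc := range docs {
		// Close the current batch if adding this document would exceed the limit.
		if len(current) > 0 && size+len(doc) > maxBytes {
			batches = append(batches, current)
			current = nil
			size = 0
		}
		current = append(current, doc)
		size += len(doc)
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	docs := [][]byte{
		[]byte("aaaa"),
		[]byte("bbbb"),
		[]byte("cccc"),
	}
	for i, batch := range splitByBytes(docs, 8) {
		fmt.Printf("batch %d: %d docs\n", i, len(batch))
	}
}
```

With an 8-byte limit and three 4-byte documents, the first two documents share a batch and the third starts a new one.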
This code was initially added in #41523 because of a limitation from the elasticsearch exporter. The elasticsearch exporter has been updated to enforce flush::max_bytes for the batcher extension and will automatically split the batch if it exceeds the limit. This error is now fixed in the collector v0.115.0. See open-telemetry/opentelemetry-collector-contrib#36396. (cherry picked from commit dbeb9cd)
…Too Large (open-telemetry#36396)

#### Description

Limit the bulk request size to roughly `flush::bytes` for sync bulk indexer.

Sync bulk indexer is used when `batcher::enabled` is either true or false. In other words, sync bulk indexer is not used when batcher config is undefined.

Change `flush::bytes` to always measure in uncompressed bytes.

Change default `batcher::max_size_items` to `0` as bulk request size limit is now more effectively enforced by `flush::bytes`.

#### Link to tracking issue

Fixes open-telemetry#36163

#### Testing

Modified BenchmarkExporter to run with `{name: "xxlarge_batch", batchSize: 1000000},`, removed `batcher::max_size_items`, and added a log line for compressed and uncompressed buffer size to reproduce the error.

```
logger.go:146: 2024-11-19T17:16:40.060Z ERROR Flush {"s.bi.Len": 10382932, "s.bi.UncompressedLen": 532777786}
logger.go:146: 2024-11-19T17:16:40.312Z ERROR bulk indexer flush error {"error": "flush failed (413): [413 Request Entity Too Large] "}
```

With this PR, every flush logs and there is no error.

```
logger.go:146: 2024-11-19T17:23:52.574Z ERROR Flush {"s.bi.Len": 99148, "s.bi.UncompressedLen": 5000007}
```

#### Documentation

---------

Co-authored-by: Christos Markou <[email protected]>