Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiline: introduce static cri parser backend, providing 9mb/s performance gain over current regex cri parser #9418

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ryanohnemus
Copy link
Contributor

As part of #9399, I have been looking into performance improvements in fluent-bit. Specifically for apps in k8s that are creating many logs within a short amount of time (ie logging >15mb/s).

Test setup:
Hardware: GCP n2-standard-8 w/ ssd bootdisk
With config: (see full config in example config section)

  inputs:
    - name: tail
      path: /app/cache/containers/*.log
      tag: kube.*
      skip_long_lines: on
      refresh_interval: 1
      buffer_chunk_size: 250K
      buffer_max_size: 250K
      threaded: on
      rotate_wait: 300
      storage.type: filesystem
  outputs:
    - name: "null"
      match: "*"
      workers: 3
  1. with no multiline.parser: cri (or any parser present) - i am able to achieve - 49.224Mb/s throughput
  2. adding a standard cri parser - regex (current) version - fluent-bit 3.1.8
  • multiline.parser: cri added to tail input above - throughput is 18.571Mb/s
  1. using the cri parser from this PR -
  • multiline.parser: criadded to tail input above - throughput is 27.572Mb/s ~9mb/s gain by not using a regex parser

Improving the multiline parsing speed should also considerably cut down on issues from log tail rotations (log rotations missed when fluentbit is backed up), since parsing is much more performant.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

service:
  flush: 0.5
  log_level: info
  daemon: off
  parsers_file: parsers.conf
  http_server: on
  http_listen: 0.0.0.0
  http_port: 2030
  storage.metrics: on
  storage.path: /app/cache/flb_storage
  storage.backlog.mem_limit: 512M
  storage.delete_irrecoverable_chunks: on
  storage.total_limit_size: 25G
pipeline:
  inputs:
    - name: tail
      path: /app/cache/containers/*.log
      tag: kube.*
      skip_long_lines: on
      refresh_interval: 1
      buffer_chunk_size: 250K
      buffer_max_size: 250K
      threaded: on
      multiline.parser: cri
      rotate_wait: 300
      storage.type: filesystem
  outputs:
    - name: "null"
      match: "*"
      workers: 3
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found
valgrind --leak-check=full ./bin/flb-it-multiline
...

SUCCESS: All unit tests have passed.
==45752== 
==45752== HEAP SUMMARY:
==45752==     in use at exit: 0 bytes in 0 blocks
==45752==   total heap usage: 22,203 allocs, 22,203 frees, 7,605,275 bytes allocated
==45752== 
==45752== All heap blocks were freed -- no leaks are possible
==45752== 
==45752== Use --track-origins=yes to see where uninitialised values come from
==45752== For lists of detected and suppressed errors, rerun with: -s
==45752== ERROR SUMMARY: 32 errors from 2 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature
  • no new documentation, parser is meant to be backward compatible

@ryanohnemus
Copy link
Contributor Author

@patrick-stephens @edsiper - i fixed the fuzzer issues and it should be ready for review but now all of the unit tests are failing on a custom_calyptia.sh which seems new and unrelated to this change.

@patrick-stephens
Copy link
Contributor

@patrick-stephens @edsiper - i fixed the fuzzer issues and it should be ready for review but now all of the unit tests are failing on a custom_calyptia.sh which seems new and unrelated to this change.

Yup I'm looking at that, the test is triggering a spurious failure in the framework on macOS so I'll resolve but it's unrelated to your change so can be ignored

@patrick-stephens
Copy link
Contributor

@patrick-stephens @edsiper - i fixed the fuzzer issues and it should be ready for review but now all of the unit tests are failing on a custom_calyptia.sh which seems new and unrelated to this change.

Yup I'm looking at that, the test is triggering a spurious failure in the framework on macOS so I'll resolve but it's unrelated to your change so can be ignored

@ryanohnemus this should be resolved I think now on latest master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants