[FLINK-26425][yarn] Support rolling log aggregation #26508

Merged
merged 1 commit into from
Apr 28, 2025

ferenc-csaky
Contributor

What is the purpose of the change

A long-running YARN application will eventually fill the local disk with logs unless a rolling Log4j strategy is applied to limit the log files to X archives of Y size each. But if there is a policy to retain logs for months or years, throwing them away too early is also problematic. YARN can aggregate specific files of running applications, so the rolled-over logs can be aggregated to external storage and kept there until the end of time, if that's the requirement. :)

Brief change log

  • Added new optional config options to define include and exclude regex patterns.
  • Wired these values into YARN's LogAggregationContext during cluster deployment.
  • Updated docs.
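
To illustrate how an include/exclude pattern pair narrows down which rolled-over files get picked up, here is a minimal, self-contained sketch. It is not the code from this PR and does not call YARN: the class `RolledLogFilter`, its `select` method, and the sample file names are hypothetical. It assumes patterns are matched against the file name with "find" semantics and that a file matching both patterns is excluded, which is how the Hadoop documentation describes the rolled-log patterns of LogAggregationContext.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical helper that mimics include/exclude selection of rolled log files. */
public final class RolledLogFilter {

    private RolledLogFilter() {}

    /**
     * Returns the file names that match includeRegex and do not match excludeRegex.
     * A null or empty excludeRegex means "exclude nothing" (an assumption of this sketch).
     */
    static List<String> select(List<String> fileNames, String includeRegex, String excludeRegex) {
        Pattern include = Pattern.compile(includeRegex);
        Pattern exclude =
                (excludeRegex == null || excludeRegex.isEmpty())
                        ? null
                        : Pattern.compile(excludeRegex);

        return fileNames.stream()
                // Keep only files whose name contains a match for the include pattern.
                .filter(name -> include.matcher(name).find())
                // Exclusion wins when a file matches both patterns.
                .filter(name -> exclude == null || !exclude.matcher(name).find())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files =
                List.of(
                        "jobmanager.log",      // active log file, not rolled yet
                        "jobmanager.log.1",    // rolled archive
                        "jobmanager.log.2",    // rolled archive
                        "jobmanager.out",
                        "taskmanager.log.1");  // rolled archive, excluded below

        // Include rolled archives only; exclude everything starting with "taskmanager".
        System.out.println(select(files, "\\.log\\.\\d+$", "^taskmanager"));
        // prints [jobmanager.log.1, jobmanager.log.2]
    }
}
```

In an actual deployment the two regexes would come from the new optional config options and be handed to YARN, which performs the matching itself; the sketch only helps to sanity-check a pattern pair against expected file names before deploying.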

Verifying this change

  • Added new unit test for the added logic.
  • Existing unit tests guarantee that by default the deployment will work exactly as before.
  • Also E2E tested on a YARN cluster, and log aggregation is triggered for the given files as expected.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: yes
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs

@ferenc-csaky ferenc-csaky changed the title [FLINK-26425][yarn] Support YARN rolling log aggregation [FLINK-26425][yarn] Support rolling log aggregation Apr 25, 2025
@flinkbot
Collaborator

flinkbot commented Apr 25, 2025

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@ferenc-csaky
Contributor Author

Thanks for the quick review @gaborgsomogyi! If CI is green and no objections until then, will merge this by Monday EOD (CEST).

@ferenc-csaky ferenc-csaky merged commit b523264 into apache:master Apr 28, 2025
4 participants