
s3 output plugin should support append functionality #9720

Open
adityasoni99 opened this issue Dec 12, 2024 · 0 comments

adityasoni99 commented Dec 12, 2024

Is your feature request related to a problem? Please describe.
S3 general purpose buckets do not support appending to existing objects, so Fluent Bit's S3 output plugin appends a UUID to each object name to avoid overwriting earlier uploads. This in turn creates many objects every hour for the same source file.
In short, with Fluent Bit's S3 plugin and S3 general purpose buckets, there is currently no way to write each source file to S3 as-is, under its original file name.
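
For context, this is roughly what the relevant output configuration looks like today (the bucket, region, and key layout below are placeholders). The `$UUID` in `s3_key_format`, which the plugin also falls back to by default, is what produces the object proliferation described above:

```
[OUTPUT]
    name          s3
    match         *
    bucket        my-log-bucket
    region        us-east-1
    # $UUID makes every upload a distinct object instead of one file
    s3_key_format /$TAG/%Y/%m/%d/%H/$UUID.log
```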

Describe the solution you'd like
We'd like the files at the destination to remain exactly as they are at the source, rather than being broken up into chunks with a UUID suffix or similar. S3 directory buckets recently gained an append capability with the S3 Express One Zone storage class, via the WriteOffsetBytes parameter of the PutObject API.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-buckets-objects-append.html
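
To illustrate the call the plugin would need to make, here is a minimal sketch in Python/boto3 (Fluent Bit itself is written in C, so this only shows the shape of the request). The bucket and key names are hypothetical; WriteOffsetBytes requires a recent SDK release and only works against directory buckets:

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
BUCKET = "app-logs--use1-az4--x-s3"   # hypothetical directory bucket name
KEY = "pods/pod-1234/app.log"         # hypothetical object key

def append_chunk(chunk: bytes) -> None:
    """Append chunk to KEY, creating the object on the first write."""
    try:
        # The current object size is the offset at which the append must start.
        offset = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]
    except ClientError as err:
        if err.response["Error"]["Code"] not in ("404", "NoSuchKey"):
            raise
        offset = 0  # object does not exist yet; first write starts at 0
    # PutObject with WriteOffsetBytes appends instead of overwriting.
    s3.put_object(Bucket=BUCKET, Key=KEY, Body=chunk, WriteOffsetBytes=offset)
```

If two writers race, the PutObject call made with a stale offset fails, so in practice the plugin would need a retry that re-reads the object size.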

Describe alternatives you've considered

  1. We wrote a Lambda function that detects each new object written to the S3 bucket and strips the UUID from its name, so that the resulting object in the bucket carries the original file name. If an object with the original file name already exists (because an earlier object was already processed by the Lambda), the function appends the contents of the new object to the existing one. The plan was to have Fluent Bit ship logs with UUIDs to an S3 general purpose bucket, since that is what the S3 output plugin currently supports, and have the Lambda write/append the objects into a destination directory bucket using the PutObject API (a sketch of this handler appears after this list). This approach is not viable because invoking Lambda functions for a huge number of files processed simultaneously is expensive.

  2. Another alternative would be to mount the S3 bucket into the EKS pods/DaemonSets using its CSI driver and run a CronJob that does the same thing as the Lambda function above. In my opinion, though, the better solution is to add the append capability to Fluent Bit's current S3 output plugin.
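
For reference, a rough sketch of the Lambda handler from alternative 1, again in Python/boto3. It assumes the UUID appears as a hyphen-delimited suffix on the key (the actual layout depends on s3_key_format) and that the destination is a directory bucket that accepts WriteOffsetBytes; the bucket name and key pattern are placeholders:

```python
import re
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
DEST_BUCKET = "app-logs--use1-az4--x-s3"  # hypothetical directory bucket

# Assumes keys like "pods/pod-1234/app.log-3aF9xQ2b": a UUID suffix after the name.
UUID_SUFFIX = re.compile(r"-[A-Za-z0-9]{8,}$")

def handler(event, context):
    # Triggered by S3 "ObjectCreated" notifications on the general purpose bucket.
    for record in event["Records"]:
        src_bucket = record["s3"]["bucket"]["name"]
        src_key = record["s3"]["object"]["key"]
        dest_key = UUID_SUFFIX.sub("", src_key)  # recover the original file name
        body = s3.get_object(Bucket=src_bucket, Key=src_key)["Body"].read()
        try:
            offset = s3.head_object(Bucket=DEST_BUCKET, Key=dest_key)["ContentLength"]
        except ClientError as err:
            if err.response["Error"]["Code"] not in ("404", "NoSuchKey"):
                raise
            offset = 0  # first chunk for this file
        s3.put_object(Bucket=DEST_BUCKET, Key=dest_key, Body=body,
                      WriteOffsetBytes=offset)
```

Even so, one invocation per object is what makes this cost-prohibitive at our volume.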

Additional context
We want to send the log files of all application pods running in EKS to a centralized location, in this case an S3 bucket, with the same file structure as the source. That would let us run the same extraction utilities on these log files by mounting the S3 bucket.
