
s3 output plugin should support append functionality #9720

Open
adityasoni99 opened this issue Dec 12, 2024 · 0 comments

adityasoni99 commented Dec 12, 2024

Is your feature request related to a problem? Please describe.
S3 general purpose buckets do not support appending to existing objects, so Fluent Bit's S3 output plugin appends a UUID to each object name to avoid overwriting earlier uploads. This in turn creates many objects every hour for the same source file.
In short, with Fluent Bit's S3 plugin and S3 general purpose buckets, there is currently no way to write each source file to S3 as-is, under its original file name.
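
For context, this is roughly what the relevant output configuration looks like today (the bucket, region, and key layout below are placeholders). The `$UUID` in `s3_key_format`, which the plugin also falls back to by default, is what produces the object proliferation described above:

```
[OUTPUT]
    name          s3
    match         *
    bucket        my-log-bucket
    region        us-east-1
    # $UUID makes every upload a distinct object instead of one file
    s3_key_format /$TAG/%Y/%m/%d/%H/$UUID.log
```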

Describe the solution you'd like
We'd like the files at the destination to remain exactly as they are at the source, rather than being broken up into chunks with a UUID suffix or similar. S3 directory buckets recently gained an append capability with the S3 Express One Zone storage class, via the WriteOffsetBytes parameter of the PutObject API.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-buckets-objects-append.html
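
To illustrate the call the plugin would need to make, here is a minimal sketch in Python/boto3 (Fluent Bit itself is written in C, so this only shows the shape of the request). The bucket and key names are hypothetical; WriteOffsetBytes requires a recent SDK release and only works against directory buckets:

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
BUCKET = "app-logs--use1-az4--x-s3"   # hypothetical directory bucket name
KEY = "pods/pod-1234/app.log"         # hypothetical object key

def append_chunk(chunk: bytes) -> None:
    """Append chunk to KEY, creating the object on the first write."""
    try:
        # The current object size is the offset at which the append must start.
        offset = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]
    except ClientError as err:
        if err.response["Error"]["Code"] not in ("404", "NoSuchKey"):
            raise
        offset = 0  # object does not exist yet; first write starts at 0
    # PutObject with WriteOffsetBytes appends instead of overwriting.
    s3.put_object(Bucket=BUCKET, Key=KEY, Body=chunk, WriteOffsetBytes=offset)
```

If two writers race, the PutObject call made with a stale offset fails, so in practice the plugin would need a retry that re-reads the object size.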

Describe alternatives you've considered

  1. We wrote a Lambda function that detects each new object written to the S3 bucket and strips the UUID from its name, so that the resulting object in the bucket carries the original file name. If an object with the original file name already exists (because an earlier object was already processed by the Lambda), the function appends the contents of the new object to the existing one. The plan was to have Fluent Bit ship logs with UUIDs to an S3 general purpose bucket, since that is what the S3 output plugin currently supports, and have the Lambda write/append the objects into a destination directory bucket using the PutObject API (a sketch of this handler appears after this list). This approach is not viable because invoking Lambda functions for a huge number of files processed simultaneously is expensive.

  2. Another alternative would be to mount the S3 bucket into the EKS pods/DaemonSets using its CSI driver and run a CronJob that does the same thing as the Lambda function above. In my opinion, though, the better solution is to add the append capability to Fluent Bit's current S3 output plugin.
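
For reference, a rough sketch of the Lambda handler from alternative 1, again in Python/boto3. It assumes the UUID appears as a hyphen-delimited suffix on the key (the actual layout depends on s3_key_format) and that the destination is a directory bucket that accepts WriteOffsetBytes; the bucket name and key pattern are placeholders:

```python
import re
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
DEST_BUCKET = "app-logs--use1-az4--x-s3"  # hypothetical directory bucket

# Assumes keys like "pods/pod-1234/app.log-3aF9xQ2b": a UUID suffix after the name.
UUID_SUFFIX = re.compile(r"-[A-Za-z0-9]{8,}$")

def handler(event, context):
    # Triggered by S3 "ObjectCreated" notifications on the general purpose bucket.
    for record in event["Records"]:
        src_bucket = record["s3"]["bucket"]["name"]
        src_key = record["s3"]["object"]["key"]
        dest_key = UUID_SUFFIX.sub("", src_key)  # recover the original file name
        body = s3.get_object(Bucket=src_bucket, Key=src_key)["Body"].read()
        try:
            offset = s3.head_object(Bucket=DEST_BUCKET, Key=dest_key)["ContentLength"]
        except ClientError as err:
            if err.response["Error"]["Code"] not in ("404", "NoSuchKey"):
                raise
            offset = 0  # first chunk for this file
        s3.put_object(Bucket=DEST_BUCKET, Key=dest_key, Body=body,
                      WriteOffsetBytes=offset)
```

Even so, one invocation per object is what makes this cost-prohibitive at our volume.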

Additional context
We want to send the log files of all application pods running in EKS to a centralized location, in this case an S3 bucket, with the same file structure as the source. That would let us run the same extraction utilities on these log files by mounting the S3 bucket.
