Flink Dynamic Sink #11536

pvary · 2024-11-13T11:47:54Z

Proposed Change

The Flink Iceberg connector sink writes data from a continuous Flink stream to an Iceberg table. The current sink implementations emphasize throughput over flexibility. The main limiting factor is that the Iceberg Flink Sink requires a static table structure: the table, the schema, and the partitioning specification must remain constant. If any of these changes, the Flink job needs to be restarted. This constraint allows optimal record serialization and good performance, but real-life use cases have to work around it whenever the underlying table changes. We need to provide a tool that accommodates these changes.

The following typical use cases are considered during this design:

The schema of the incoming Avro records changes (new columns are added, or other backward-compatible changes happen). The Flink job is expected to update the table schema dynamically and continue to ingest data with both the new and the old schema, without a job restart.
Incoming records define the target Iceberg table dynamically. The Flink job is expected to create the new table(s) and continue writing to them without a job restart.
The partitioning scheme of the table changes. The Flink job is expected to update the partitioning specification and continue writing to the target table without a job restart.
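
To make the three use cases concrete, here is a minimal sketch in plain Python. It is not the Flink or Iceberg API and does not reflect the design in the proposal document. It only models the expected behavior under one assumption: every incoming record carries the table name, schema, and partition columns it should be written with. The names `RoutedRecord`, `TableState`, and `DynamicSinkModel` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RoutedRecord:
    """Hypothetical envelope: the row plus the metadata it should be written with."""
    table: str            # target table name
    schema: tuple         # column names known to the producer
    partition_by: tuple   # partition column names requested by the producer
    row: dict             # the actual data


@dataclass
class TableState:
    """Toy in-memory stand-in for a table's metadata and data."""
    schema: list
    partition_by: tuple
    rows: list = field(default_factory=list)


class DynamicSinkModel:
    """Toy model of a sink that adapts per record instead of requiring a restart."""

    def __init__(self):
        self.tables = {}

    def write(self, record):
        state = self.tables.get(record.table)
        if state is None:
            # Use case 2: the record names a table we have not seen, so create it.
            state = TableState(list(record.schema), record.partition_by)
            self.tables[record.table] = state

        # Use case 1: add columns that are new to the table. Existing columns
        # are never dropped, so records with the old schema still fit.
        for column in record.schema:
            if column not in state.schema:
                state.schema.append(column)

        # Use case 3: the record asks for a different partitioning, so update it.
        if record.partition_by != state.partition_by:
            state.partition_by = record.partition_by

        # Columns the record does not provide are stored as None.
        state.rows.append({c: record.row.get(c) for c in state.schema})
```

In this model, a record with an extra column widens the table schema, a later record with the old schema is still accepted (its missing column is stored as `None`), and a record for an unknown table creates that table. A real implementation would additionally have to deal with checkpointing, commit coordination, and metadata caching, none of which this sketch covers.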

Proposal document

https://docs.google.com/document/d/1R3NZmi65S4lwnmNjH4gLCuXZbgvZV5GNrQKJ5NYdO9s

Specifications

github-actions · 2025-05-13T00:17:23Z

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

FranMorilloAWS · 2025-05-13T09:33:47Z

Any updates on this, @pvary?

pvary · 2025-05-14T14:45:06Z

@mxm is working on implementing this feature.

pvary · 2025-05-14T14:46:16Z

See #12424 for the whole PR. It has been broken down into smaller PRs for easier review: #12996, #13032.

pvary added the proposal label (Iceberg Improvement Proposal: spec/major changes/etc) Nov 13, 2024

github-actions bot added the stale label May 13, 2025

github-actions bot removed the stale label May 14, 2025
