
flink:FlinkSink support dynamically changed schema #4190


Open

naisongwen opened this issue Feb 22, 2022 · 32 comments

Comments

@naisongwen

Currently, FlinkSink requires developers to pass a schema parameter when building the DataStream, which means that once the schema is given, the TableSchema is fixed and can never be changed. In practical scenarios, however, some data formats (for example, JSON) produce a schema that is not fixed: fields may be added, deleted, or renamed. We would therefore like to be able to change the mapped TableSchema while the DataStream is running.

@Shane-Yu

Shane-Yu commented Mar 2, 2022

I also ran into this problem in CDC scenarios.

@hililiwei
Contributor

To summarize, your expectation is that the Iceberg table schema should be updated dynamically based on the schema of the data stream whenever the two do not match?

@stevenzwu
Contributor

stevenzwu commented Mar 25, 2022

I am not sure this would be the universally desired behavior. If the data stream contains an incompatible schema change (like removing a required field), it will break downstream consumers.

There is value in automatically syncing the input data schema to the Iceberg table schema (for compatible schema evolution). Personally, I would keep this in the control plane, which is more natural if there is a schema registry tracking input data schema changes. The control plane can then update the Iceberg table schema and restart the Flink job so the write path picks up the new schema.

It is tricky to support automatic schema sync in the data plane. There can be many parallel Iceberg writers (hundreds) for a single sink table, and coordinating a metadata change like a schema update across them is very tricky.

@naisongwen
Author

right

@lintingbin
Contributor

We also hit this problem in our scenarios: we modify the schema of the sink Iceberg table frequently. Right now I try to get the table schema through the Iceberg table refresh API and dynamically construct a DataStream that includes the new columns. Adding the new columns to the stream works, but writing them through the Iceberg sink does not take effect, because the sink does not support dynamic schema updates. If the Iceberg Flink sink writer supported refreshing the table schema, it would be much more convenient to use; as it is, restarting Flink takes a minute or two. @stevenzwu
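For reference, a minimal sketch of the refresh-API part of this approach, assuming the table is already loaded (e.g. via an Iceberg catalog or TableLoader); the class and method names here are illustrative, not the commenter's code:

```java
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.table.types.logical.RowType;
import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;
import org.apache.iceberg.flink.FlinkSchemaUtil;

public class SchemaRefreshExample {
  /** Re-reads the table metadata and converts the latest Iceberg schema into a Flink TableSchema. */
  public static TableSchema currentFlinkSchema(Table table) {
    table.refresh();                          // pick up schema changes committed since the table was loaded
    Schema latest = table.schema();           // latest Iceberg schema after the refresh
    RowType rowType = FlinkSchemaUtil.convert(latest);
    return FlinkSchemaUtil.toSchema(rowType); // usable when (re)building the sink
  }
}
```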

@hililiwei
Contributor

hililiwei commented Jun 21, 2022

I think we should take this feature seriously. In Flink CDC, Hudi already supports dynamic table schema changes without restarting tasks: it captures schema changes to the table and applies them in real time.

When synchronizing data with Flink CDC, it is unacceptable to restart a task whenever the table schema changes.

@lintingbin
Contributor

We have internally implemented modifying columns, adding columns after the last column, and deleting the last column without restarting the Flink program. Our processing logic is: DataStream<Map<String, String>> -> map -> DataStream<RowData> -> FlinkSink. In the map implementation, we refresh the table schema after each checkpoint completes and generate RowData using the latest schema. We have also modified FlinkSink so that every time newAppender is called, it refreshes the table schema and uses the latest schema to write the data file. Is anyone interested in this feature? I can contribute our modifications to FlinkSink if needed.
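A minimal sketch of the "refresh after checkpoint" map step described above. It only illustrates the idea (it is not the actual patch): it assumes every column is a string, the table is reachable through an Iceberg TableLoader, and SchemaAwareMapFunction is a hypothetical name.

```java
import java.util.List;
import java.util.Map;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.CheckpointListener;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
import org.apache.iceberg.Table;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.types.Types;

public class SchemaAwareMapFunction extends RichMapFunction<Map<String, String>, RowData>
    implements CheckpointListener {

  private final TableLoader tableLoader;
  private transient Table table;

  public SchemaAwareMapFunction(TableLoader tableLoader) {
    this.tableLoader = tableLoader;
  }

  @Override
  public void open(Configuration parameters) {
    tableLoader.open();
    table = tableLoader.loadTable();
  }

  @Override
  public void notifyCheckpointComplete(long checkpointId) {
    table.refresh(); // pick up columns added or removed since the last checkpoint
  }

  @Override
  public RowData map(Map<String, String> record) {
    List<Types.NestedField> columns = table.schema().columns();
    GenericRowData row = new GenericRowData(columns.size());
    for (int i = 0; i < columns.size(); i++) {
      String value = record.get(columns.get(i).name());
      row.setField(i, value == null ? null : StringData.fromString(value));
    }
    return row;
  }
}
```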

@hililiwei
Contributor

I personally look forward to seeing your PR.

@stevenzwu
Contributor

stevenzwu commented Aug 3, 2022

@lintingbin2009 It might be helpful to describe the solution as a high-level design in this issue or in a doc.

Every time newAppender is called, we will refresh the table schema and use the latest table schema to write the data file

This sounds expensive and may not work well at scale. If every writer task needs to poll the table for every file, it can create a lot of load on the Iceberg metadata system. Ideally, the table schema polling and change should be done by the operator coordinator.

@lintingbin
Contributor

@hililiwei @stevenzwu #5425 is my PR; I hope to get some suggestions. We are now testing in an environment with a parallelism of about 40, and the checkpoint time is the same as it was before the dynamic schema refresh was added.

@leichangqing

(Quoting @lintingbin's comment above.)

I'm interested in this. How can I get in touch with you about the dynamic schema changes?

@lintingbin
Contributor

@leichangqing You can refer to the last two commits of my branch https://github.com/lintingbin2009/iceberg/tree/flink-sink-dynamically-change. We have run this code in our own production environment for half a year, and there has been no problem so far.

[architecture diagram]

Corresponding to 1, 2, 3 in the picture, you need to prepare the following (see the sketch below):

  1. A broadcast node that can subscribe to your schema changes.
  2. A data processing node that generates RowData according to the latest schema.
  3. The Iceberg writer node, modified as in the code above so that it can receive the latest schema and apply it.
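A minimal, self-contained sketch of steps 1 and 2 (the broadcast node and the data processing node). It assumes schema changes arrive as a comma-separated column list and records arrive as comma-separated values; all class and variable names here are illustrative, not from the linked branch.

```java
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
import org.apache.flink.util.Collector;

public class SchemaBroadcastExample {

  private static final MapStateDescriptor<String, String> SCHEMA_STATE =
      new MapStateDescriptor<>(
          "latest-schema", BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO);

  // Step 1: broadcast node; every downstream subtask sees every schema-change event.
  public static DataStream<RowData> wire(
      DataStream<String> records, DataStream<String> schemaChanges) {
    BroadcastStream<String> broadcastSchema = schemaChanges.broadcast(SCHEMA_STATE);
    // Step 2: data processing node; builds RowData according to the latest broadcast schema.
    return records.connect(broadcastSchema).process(new Converter());
  }

  private static class Converter extends BroadcastProcessFunction<String, String, RowData> {

    @Override
    public void processBroadcastElement(String schema, Context ctx, Collector<RowData> out)
        throws Exception {
      ctx.getBroadcastState(SCHEMA_STATE).put("columns", schema); // remember the newest schema
    }

    @Override
    public void processElement(String record, ReadOnlyContext ctx, Collector<RowData> out)
        throws Exception {
      String columns = ctx.getBroadcastState(SCHEMA_STATE).get("columns");
      int width = columns == null ? 1 : columns.split(",").length;
      String[] values = record.split(",", -1);
      GenericRowData row = new GenericRowData(width); // missing trailing columns stay null
      for (int i = 0; i < width && i < values.length; i++) {
        row.setField(i, StringData.fromString(values[i]));
      }
      out.collect(row);
    }
  }
}
```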

@stevenzwu
Contributor

It would be a good starting point if someone would like to create a design doc on how to solve this problem in a general and scalable way.

@lintingbin
Contributor

@stevenzwu This is a doc I wrote, you can give your opinion, and I will modify it.

@FranMorilloAWS

Is there any news on this?

@Ruees

Ruees commented Apr 16, 2024

(Quoting @lintingbin's comment above about the broadcast node, data processing node, and writer node.)

Could you share some sample code demonstrating how to use it?

@Ruees

Ruees commented Apr 16, 2024

(Quoting @lintingbin's comment above about the broadcast node, data processing node, and writer node.)

I pulled the FlinkSink-related changes from the first commit and added a column at the end using the Java API in the map operator, but it did not work: even after the data was inserted successfully, the column at the end was still empty.

@lkokhreidze
Contributor

Just commenting for visibility: this feature would be extremely useful for our use case too. It is similar to the CDC use case, but driven instead by the services emitting the events.
I'd also be happy to lend a hand, but at the moment it's not clear what the state is. Is the proposed design agreed upon, or does it need another iteration?

@pvary
Contributor

pvary commented Apr 17, 2024

I think it is not trivial to implement this feature, as the schema of the RowData objects that are the input of the sink is finalized when the job graph is created. To change the schema, one needs to regenerate the job graph, which essentially means restarting the job (calling the main method).
There might be ways to work around this, either by changing the input to records where the schema is embedded in each record (a performance loss), or by getting the schema from an outside source (an additional external dependency), but either would need some deeper changes in the sink.
Care should also be taken with how the table schema refresh is synchronized across the tasks when changes are detected...

As a workaround, we run our own schema check before converting the input to RowData, and throw a SuppressRestartsException when changes are detected.
We use the Flink Kubernetes Operator to restart the job from the failed state via kubernetes.operator.job.restart.failed. The main method refreshes the table, and the new job instance starts with the new schema.
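A rough sketch of this kind of guard, under the assumption that the input arrives as Map<String, String> keyed by field name; SchemaGuardFunction is a hypothetical name, not the actual implementation.

```java
import java.util.Map;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.execution.SuppressRestartsException;
import org.apache.iceberg.Schema;
import org.apache.iceberg.flink.TableLoader;

public class SchemaGuardFunction
    extends RichMapFunction<Map<String, String>, Map<String, String>> {

  private final TableLoader tableLoader;
  private transient Schema expectedSchema;

  public SchemaGuardFunction(TableLoader tableLoader) {
    this.tableLoader = tableLoader;
  }

  @Override
  public void open(Configuration parameters) {
    tableLoader.open();
    // Schema the running job graph was built against; only refreshed by restarting the job.
    expectedSchema = tableLoader.loadTable().schema();
  }

  @Override
  public Map<String, String> map(Map<String, String> record) {
    for (String field : record.keySet()) {
      if (expectedSchema.findField(field) == null) {
        // Fail terminally (no Flink-internal restart). An external controller such as the
        // Flink Kubernetes Operator (kubernetes.operator.job.restart.failed) resubmits the
        // job, and main() then picks up the refreshed table schema.
        throw new SuppressRestartsException(
            new IllegalStateException("Unknown field '" + field + "': table schema has changed"));
      }
    }
    return record; // unchanged; conversion to RowData happens downstream
  }
}
```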

@ottomata

How does flink-cdc do it?

@pvary
Contributor

pvary commented Oct 17, 2024

It sends the schema along with every record. I'm playing around with a somewhat similar but more performant solution, where we send only the schema id instead of the full schema. It is still in its infancy at the moment, but... 😄

@ottomata

Ah, thanks!

FWIW, I think schema evolution support is worth the tradeoff of extra bytes per record :)

@pvary
Contributor

pvary commented Oct 18, 2024

The current tradeoff is more like doubled CPU time (we need caching plus an extra serialization/deserialization step on an already well-optimized hot path). We are still looking for ways to optimize this.

@FranMorilloAWS

How is Paimon doing it? The same way as Flink CDC? How about integrating with a schema registry and using the schema version id?

@pvary
Contributor

pvary commented Oct 21, 2024

The Iceberg table itself could be used as the schema registry. I would be reluctant to add any new requirements if possible.

@FranMorilloAWS

How would that look? Normally we consume from Kafka or Kinesis and use the Glue schema registry or the Confluent schema registry. As of now the sink has the option of using GenericRecord. Is there any roadmap for specific records?

@ottomata

If schema registry support is considered, please make it pluggable! The Wikimedia Foundation uses JSONSchema and $schema URIs, and we can do it with Flink.

@pvary
Contributor

pvary commented Oct 22, 2024

If the schema changes, the target Iceberg table needs to be updated to the new schema anyway, so we can send the Iceberg schemaId along with the records.
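A tiny sketch of what that could look like: each record carries only the integer schema id, and the writer side resolves the full schema from the table's schema map. The wrapper class here is purely illustrative, not an existing Iceberg API.

```java
import java.io.Serializable;
import org.apache.flink.table.data.RowData;
import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;

public class RowWithSchemaId implements Serializable {
  public final int schemaId; // id of the Iceberg schema the row was produced against
  public final RowData row;

  public RowWithSchemaId(int schemaId, RowData row) {
    this.schemaId = schemaId;
    this.row = row;
  }

  /** Resolve the full schema on the writer side without shipping it with every record. */
  public Schema resolve(Table table) {
    Schema schema = table.schemas().get(schemaId);
    if (schema == null) {
      table.refresh(); // the id may belong to a newer metadata version
      schema = table.schemas().get(schemaId);
    }
    return schema;
  }
}
```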

@ottomata

ottomata commented Apr 22, 2025

Still very interested in this! WMF is already using Iceberg, but Paimon can currently support this feature. We'd prefer to use Iceberg directly if we can!

@pvary
Contributor

pvary commented Apr 29, 2025

You might want to take a look at #12424
