
[Improvement] Issues when indexing large data tables. #2253

Closed
1 of 2 tasks
wg1026688210 opened this issue Nov 3, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@wg1026688210
Contributor

wg1026688210 commented Nov 3, 2023

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Now we have seen significant improvements in query performance through index creation. However, we have encountered several issues during the indexing process.

  1. Some tasks stall for long periods waiting for input data before sorting and writing to Paimon.
  2. Due to data skew, some indexing tasks take a long time to execute.
  3. The Flink batch job that builds the index triggers a global failover after a TaskManager OOM.
  4. FilesTable cannot query the partitions for which an index has been built.

Solution

No response

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@wg1026688210 added the enhancement label Nov 3, 2023
@wg1026688210
Contributor Author

Issues 1 and 2 have been fixed.
Related:
#3081
#2749

Issue 3 needs a remote shuffle service so that task-level failover works when using Flink.
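For context on why a remote shuffle service helps here: Flink's pluggable shuffle lets a job store intermediate results outside the TaskManager, so after a TaskManager OOM only the failed tasks need to be restarted instead of the whole batch job failing over globally. A minimal sketch of a `flink-conf.yaml` fragment follows; the factory class and the `remote-shuffle.*` key are assumptions based on the flink-remote-shuffle project, not anything specified in this issue:

```yaml
# flink-conf.yaml — sketch only, not from this issue.
# Swap the default Netty shuffle for a remote shuffle implementation so that
# intermediate data survives the loss of a TaskManager (class name assumed
# from the flink-remote-shuffle project):
shuffle-service-factory.class: com.alibaba.flink.shuffle.plugin.RemoteShuffleServiceFactory

# Address of the external shuffle manager (assumed key name):
remote-shuffle.manager.rpc-address: shuffle-manager-host

# Run the batch job with blocking exchanges so downstream stages read
# persisted shuffle data rather than recomputing upstream tasks:
execution.batch-shuffle-mode: ALL_EXCHANGES_BLOCKING
```

With intermediate data held by the shuffle service, a task killed by an OOM can be rescheduled on another TaskManager and resume from the persisted shuffle data, avoiding the global failover described in issue 3.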
