Decouple `show jobs` from `CreateMviewProgressTracker` #19189

kwannoel · 2024-10-30T02:48:56Z

We can decouple `rw_ddl_progress` from meta’s materialized view progress tracker and maintain it ad hoc:

1. Make internal backfill state tables visible. They contain the backfilled `row_count` and whether the backfill has finished.
   a. Query all the internal backfill state tables for an MV to fetch `row_count` and the finished status.
   b. Query the internal tables of the MV.
   c. Match their names against a regex to pick out the backfill tables.
2. Query the hummock version stats, so we can get the upstream row count.
3. Calculate the estimated progress.
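
A minimal sketch of the name filtering and the progress estimate, to make the steps above concrete. Everything here is an assumption for illustration: the table name pattern, the `BackfillState` type, and the rule of summing `row_count` over all state rows are hypothetical and not taken from the RisingWave codebase.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical naming pattern for backfill state tables; the real
# internal table naming scheme may differ.
BACKFILL_TABLE_RE = re.compile(r"^__internal_.+_streamscan_\d+$")


@dataclass
class BackfillState:
    """One row read from a backfill state table (assumed shape)."""
    row_count: int
    finished: bool


def is_backfill_table(name: str) -> bool:
    """Return True if the internal table name looks like a backfill state table."""
    return BACKFILL_TABLE_RE.match(name) is not None


def estimate_progress(states: List[BackfillState], upstream_row_count: int) -> float:
    """Estimate progress in [0.0, 1.0] from state rows and the upstream row count."""
    # If every state row reports finished, the job is done regardless of counts.
    if states and all(s.finished for s in states):
        return 1.0
    # Without a usable upstream count we cannot form a ratio.
    if upstream_row_count <= 0:
        return 0.0
    backfilled = sum(s.row_count for s in states)
    # The two inputs are not read atomically, so clamp the ratio to 1.0.
    return min(backfilled / upstream_row_count, 1.0)
```

For example, two unfinished state rows with 50 and 25 backfilled rows against an upstream count of 100 give an estimate of 0.75.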

This will:
- Simplify the logic of the materialized view progress tracker. We no longer need to maintain state for counts.
- Allow us to track the backfill progress of created sink jobs.

In general, I think the same pattern can be applied to other forms of backfilling, such as snapshot backfill and shared source backfill.

kwannoel · 2024-10-30T02:50:18Z

Implementation-wise, we need to consider querying permissions for the internal table.

kwannoel · 2024-11-27T01:20:59Z

This also has the added benefit that we can display finished jobs.

kwannoel · 2025-01-07T08:55:06Z

I've taken a look at implementing this. It can't really be implemented as a system catalog.

This is because the backfill internal state tables are dynamically queried. We can't statically construct a query to read from all backfilling tables. We can't query them via the meta node, nor can we query them via our batch interface (if it is constructed as a system catalog).

Instead, here's my idea:

We add a handler for `show jobs`, so it does not call `rw_ddl_progress` under the hood.
Inside this handler, there are two parts:
1. First part:
   a. Fetch all backfilling relation ids from meta.
   b. Construct a batch query for each of these ids (`select row_count, is_finished from <backfill_internal_state_table>`).
   c. Run the batch queries to get the progress stored in the state tables.
2. Second part:
   a. Just read the hummock metrics from the frontend for the tables being backfilled from.

From these two parts, we can construct the overall progress for all backfilling relations.
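
The two-part handler described above can be sketched roughly as follows. This is only an illustration under assumptions: `run_batch_query`, the two input mappings, and the function names are hypothetical stand-ins for the meta lookup, the batch interface, and the hummock metrics, none of which are modeled here.

```python
from typing import Callable, Dict, List, Tuple

# (row_count, is_finished) as selected from a backfill state table.
Row = Tuple[int, bool]


def build_state_query(state_table: str) -> str:
    """Build the per-table batch query, quoting the identifier."""
    quoted = '"' + state_table.replace('"', '""') + '"'
    return f"SELECT row_count, is_finished FROM {quoted}"


def show_jobs_progress(
    state_tables_by_job: Dict[int, List[str]],   # part 1 input, fetched from meta (assumed shape)
    upstream_rows_by_job: Dict[int, int],        # part 2 input, read from hummock metrics (assumed shape)
    run_batch_query: Callable[[str], List[Row]], # hypothetical batch query runner
) -> Dict[int, float]:
    """Combine state-table rows and upstream row counts into per-job progress."""
    progress: Dict[int, float] = {}
    for job_id, tables in state_tables_by_job.items():
        rows: List[Row] = []
        for table in tables:
            rows.extend(run_batch_query(build_state_query(table)))
        total = upstream_rows_by_job.get(job_id, 0)
        if rows and all(finished for _, finished in rows):
            progress[job_id] = 1.0
        elif total <= 0:
            progress[job_id] = 0.0
        else:
            # Clamp, since the two sources are not fetched atomically.
            progress[job_id] = min(sum(count for count, _ in rows) / total, 1.0)
    return progress
```

Passing the query runner in as a parameter keeps the sketch independent of how the frontend actually issues batch queries.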

We can also improve UX further by introducing new syntax, `SHOW JOBS WHERE relation_name = ..., db_name = ..., schema_name = ...,` since we don't support general purpose querying.

Shortcomings:

There can be some inconsistency between hummock metrics and backfill progress state, since they are not atomically fetched. I think it is acceptable since we are just providing an estimate.

BugenZhao · 2025-01-08T06:03:55Z

> This is because the backfill internal state tables are dynamically queried.

Theoretically we can add a table function like `table(id)` to parameterize the tables to select, but I suppose it can be hard under the current execution model. 🥵

> so it does not call rw_ddl_progress under the hood.

Then what will it look like? Still querying the meta service for basic information?

kwannoel · 2025-01-08T07:04:17Z

> > This is because the backfill internal state tables are dynamically queried.
>
> Theoretically we can add a table function like `table(id)` to parameterize the tables to select, but I suppose it can be hard under the current execution model. 🥵
>
> > so it does not call rw_ddl_progress under the hood.
>
> Then what will it look like? Still querying the meta service for basic information?

Oh I like the table function idea. I think it might be possible. Inside the binder we can query the catalog and resolve the table function to the actual backfill tables.

We can then add a separate system catalog which queries meta for basic info (specifically the hummock table stats).

Then we add another SQL view system catalog to query and join both of these.

In that way we don't need a special syntax.

kwannoel · 2025-01-10T02:55:04Z

I managed to make pretty good headway with this approach on the frontend side. I bind the table function and use an optimizer rule to rewrite it, at which point we can construct the actual plan for it.

However, it seems that the catalog of a table that is still being created, as returned from the meta service, is malformed. The reason is that `vnode_count` is not yet filled in at the time the meta node sends the catalog update back to the FE service.

We need to send the table catalog back to the FE only after the catalog has this `vnode_count` field populated.

We can also try mapping a `vnode_count` placeholder to 0.0 progress.
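
A minimal sketch of that fallback, assuming the placeholder is modeled as `None` and the actual progress lookup is passed in as a callable. Both are hypothetical choices for illustration and do not reflect how the catalog represents an unset `vnode_count`.

```python
from typing import Callable, Optional


def progress_or_zero(vnode_count: Optional[int], read_progress: Callable[[], float]) -> float:
    """Report 0.0 while vnode_count is still a placeholder (modeled here as None)."""
    if vnode_count is None:
        # The catalog is not fully populated yet, so skip reading the state table.
        return 0.0
    return read_progress()
```

The state table is only read once `vnode_count` is available, so an incomplete catalog never reaches the query path.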

kwannoel · 2025-01-10T03:44:03Z

Continue working on this after #17501

kwannoel added the type/feature label Oct 30, 2024

github-actions bot added this to the release-2.2 milestone Oct 30, 2024

kwannoel self-assigned this Oct 30, 2024

kwannoel modified the milestones: release-2.2, release-2.3 Dec 27, 2024
