Enable high frequency non-checkpoint barrier without barrier read #12393

hzxa21 · 2023-09-18T11:21:58Z

In #4290, we introduced two features: Barrier & Checkpoint decoupling and Read on non-checkpointed epoch. The main motivation back then was to improve freshness for batch queries without high-frequency checkpoints, which tied the two features to each other. However, this also introduced some complexities:

- Since a non-checkpoint epoch is considered readable, a memtable flush must be triggered on seeing a non-checkpoint barrier, which can generate many immutable memtables (IMMs). We later introduced "perf(storage): Merge multiple imms in the staging version to a large one" #7368 to merge IMMs, which made the code more complex.
- There are discrepancies in batch query scheduling between reading non-checkpoint and checkpoint epochs, since non-checkpointed state is only available on the writer CN. This also makes it hard to support non-checkpoint epoch reads once we have a dedicated serving cluster. In addition, the cluster cannot serve non-checkpoint epoch reads during recovery.
- There are discrepancies in metadata management. We need to explicitly maintain the max "committed" non-checkpoint epoch and have special logic for epoch pin/unpin.

For the above reasons, we recently turned off barrier & checkpoint decoupling and barrier read by default by setting checkpoint_frequency=1, barrier_interval=1s and visibility_mode=checkpoint.

However, in recent discussions, we realized that Barrier & Checkpoint decoupling is actually independent of Read on non-checkpointed epoch, and enabling the former without the latter can bring us the benefits of high-frequency barriers without the extra complexities.

A higher barrier frequency means more timely triggers for the following operations:
- barrier alignment -> better backpressure (consider an inner join on a fast and a slow stream)
- operator cache eviction -> less chance of OOM
- memtable spill -> less chance of OOM
- agg result emission -> less chance of OOM
No guarantees on barrier read means we can keep things simple:
- No need to worry about batch query scheduling.
- No need to force a memtable flush on a non-checkpoint barrier. We can try to flush only when needed.
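
To illustrate the "flush only when needed" idea, here is a minimal sketch of one possible policy, not the actual state table implementation. The `MemTable` class, its `spill_threshold` parameter and the size-based trigger are all hypothetical: a checkpoint barrier always flushes, while a non-checkpoint barrier flushes only if the buffered data has grown past the threshold.

```python
from dataclasses import dataclass, field


@dataclass
class MemTable:
    """Toy write buffer (hypothetical model, sizes in bytes)."""
    spill_threshold: int          # assumed knob: flush early once this much is buffered
    buffered: int = 0
    flushes: list = field(default_factory=list)

    def write(self, nbytes: int) -> None:
        self.buffered += nbytes

    def _flush(self, reason: str) -> None:
        # Record the flush and empty the buffer; skip empty flushes.
        if self.buffered > 0:
            self.flushes.append((reason, self.buffered))
            self.buffered = 0

    def on_barrier(self, is_checkpoint: bool) -> None:
        if is_checkpoint:
            # Checkpoint barrier: data must be persisted, so always flush.
            self._flush("checkpoint")
        elif self.buffered >= self.spill_threshold:
            # Non-checkpoint barrier: flush only when memory pressure calls for it.
            self._flush("try_flush")


# Ten barriers, every 10th one is a checkpoint, 30 bytes written per barrier.
table = MemTable(spill_threshold=100)
for i in range(1, 11):
    table.write(30)
    table.on_barrier(is_checkpoint=(i % 10 == 0))

print(table.flushes)
```

In this toy run the buffer is flushed three times across ten barriers, whereas forcing a flush on every barrier would produce ten small immutable memtables.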

Action items:

Perf test with barrier_interval=100ms, checkpoint_frequency=10
Implement state table try_flush for non-checkpoint barrier #12419
Avoid relying on barrier read in executor #13687
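
For reference, the relationship between the two settings discussed in this issue can be sketched as follows. This is a simplified model with a hypothetical helper `barrier_schedule`; it assumes that every `checkpoint_frequency`-th barrier is a checkpoint barrier and that the first barrier is emitted one interval after start.

```python
def barrier_schedule(barrier_interval_ms: int, checkpoint_frequency: int, duration_ms: int):
    """Return a list of (time_ms, is_checkpoint) for every barrier in the window.

    Assumption: every `checkpoint_frequency`-th barrier is a checkpoint.
    """
    schedule = []
    for n in range(1, duration_ms // barrier_interval_ms + 1):
        schedule.append((n * barrier_interval_ms, n % checkpoint_frequency == 0))
    return schedule


# Coupled setting: barrier_interval=1s, checkpoint_frequency=1.
coupled = barrier_schedule(1000, 1, 2000)
# Decoupled setting from the perf test: barrier_interval=100ms, checkpoint_frequency=10.
decoupled = barrier_schedule(100, 10, 2000)

coupled_checkpoints = [t for t, is_ckpt in coupled if is_ckpt]
decoupled_checkpoints = [t for t, is_ckpt in decoupled if is_ckpt]

print(len(coupled), len(decoupled))                  # number of barriers in 2 seconds
print(coupled_checkpoints == decoupled_checkpoints)  # same checkpoint cadence
```

Under these assumptions both settings checkpoint once per second, but the decoupled one gives operators ten times as many barrier-triggered opportunities (alignment, cache eviction, spill) in between.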


hzxa21 · 2023-11-08T09:17:14Z

We conducted several rounds of tests, hoping that OOM could be fixed by high-frequency barriers. However, OOM is still present. We have decided to proceed with implementing spilling within a single barrier, which is a better solution than this one.

hzxa21 added the type/feature label Sep 18, 2023

hzxa21 self-assigned this Sep 18, 2023

github-actions bot added this to the release-1.3 milestone Sep 18, 2023

wcy-fdu mentioned this issue Oct 10, 2023

temporal filter can emit a large number of messages to downstream in one barrier causing OOM #12715

Closed

hzxa21 modified the milestones: release-1.3, release-1.4 Oct 10, 2023

hzxa21 closed this as completed Nov 8, 2023

hzxa21 mentioned this issue Nov 28, 2023

Avoid relying on barrier read in executor #13687

Open
