feat(snapshot-backfill): control log store back pressure in backfill executor #18798

Merged
merged 31 commits into from
Oct 18, 2024

Merge branch 'main' into yiming/snapshot-backfill-executor-backpressure

72274be
Task list completed / task-list-completed, started 2024-10-18 07:55:15

1 / 8 tasks completed

7 tasks still to be completed

Details

Required Tasks

Task Status
Before the barrier that switches ConsumingLogStore to ConsumingUpstream arrives, backfill executors keep consuming the log store even after they have caught up with the upstream, so they read log store data unnecessarily while the upstream data is already available. Incomplete
Though we have some control mechanism on the meta node to ensure that the number of lagging barriers gradually decreases, there is no actual back pressure on the upstream from the streaming executor side. Incomplete
The control state on the meta node is too complicated and can be simplified. Incomplete
after this barrier, the creating job will join the upstream, and all later barriers are injected and collected together Incomplete
on receiving this barrier, the upstream mv executor will stop writing old values to the log store, which is the same as the previous implementation. Incomplete
snapshot backfill executor will not be aware of this barrier, because it no longer receives control information from meta node after finishing consuming snapshot. Incomplete
This barrier is injected and collected independently for the upstream and the creating job. However, the upstream and the creating job commit this epoch together, which means the upstream waits for the creating job to collect this barrier before committing the epoch. After this epoch is committed, the creating job is marked as created, and the snapshot backfill is finished. Incomplete
I have written necessary rustdoc comments Incomplete
I have added necessary unit tests and integration tests Incomplete
I have added test labels as necessary. See details. Incomplete
I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features #7934). Incomplete
My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future). Incomplete
All checks passed in ./risedev check (or alias, ./risedev c) Completed
My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details) Incomplete
My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users) Incomplete
During backfilling, it will inject one fake barrier Incomplete
On starting log store consumption, it will inject one or more real (but stale) barriers Incomplete
Collected fake barrier for creating jobs Incomplete
Collected normal barrier for creating jobs Incomplete
Collected normal barrier for created jobs Incomplete
Resets the completing barrier. Incomplete
Enforces that the completed barrier has the same epoch as the barrier control of the streaming job. Incomplete
If max_collected_epoch is larger than the backfill epoch, the backfill has already started consuming the log store and has at least finished reading the log store up to max_collected_epoch. Incomplete
If the backfill epoch is larger, the backfill has not started reading the log store. We need to pin the backfill epoch, as that is where the log store read will start from. Incomplete
What triggers the state transition. Example: backfill finish triggers ConsumingSnapshot -> ConsumingLogStore Incomplete
What actions are fired on each state transition. Example: inject all upstream pending barriers on ConsumingSnapshot -> ConsumingLogStore Incomplete
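
The state transitions named in the task list can be sketched as a small state machine. This is only an illustration, not the code in this PR: the state names and the "inject all upstream pending barriers" action come from the tasks above, while `Trigger`, `Action`, `Job`, `SwitchBarrier`, and `on_trigger` are hypothetical names, and representing pending barriers as a list of `u64` epochs is an assumption.

```rust
/// Backfill phases as named in the task list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BackfillState {
    ConsumingSnapshot,
    ConsumingLogStore,
    ConsumingUpstream,
}

/// Hypothetical events that trigger a transition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Trigger {
    /// Snapshot backfill has finished.
    BackfillFinished,
    /// The barrier that turns ConsumingLogStore into ConsumingUpstream.
    SwitchBarrier,
}

/// Hypothetical actions fired on a transition.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Inject all upstream barriers that accumulated during backfill.
    InjectPendingBarriers(Vec<u64>),
    /// The creating job joins the upstream; later barriers are shared.
    JoinUpstream,
    /// The trigger does not apply in the current state.
    Ignored,
}

/// Hypothetical per-job control state.
struct Job {
    state: BackfillState,
    /// Epochs of upstream barriers not yet injected (assumed representation).
    pending_epochs: Vec<u64>,
}

impl Job {
    fn on_trigger(&mut self, trigger: Trigger) -> Action {
        match (self.state, trigger) {
            (BackfillState::ConsumingSnapshot, Trigger::BackfillFinished) => {
                self.state = BackfillState::ConsumingLogStore;
                // Drain the pending list so each barrier is injected once.
                Action::InjectPendingBarriers(std::mem::take(&mut self.pending_epochs))
            }
            (BackfillState::ConsumingLogStore, Trigger::SwitchBarrier) => {
                self.state = BackfillState::ConsumingUpstream;
                Action::JoinUpstream
            }
            // Any other combination leaves the state unchanged.
            _ => Action::Ignored,
        }
    }
}
```

Keeping the trigger and the resulting action in one `match` makes it easy to document, for each transition, both what causes it and what it fires.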
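
The epoch-pinning rule in the two tasks about max_collected_epoch and the backfill epoch can be written as a small comparison. This is a minimal sketch under stated assumptions: `PinDecision`, `pin_decision`, and `pinned_epoch` are hypothetical names, epochs are modeled as plain `u64` values, pinning max_collected_epoch in the first case is inferred rather than stated in the task text, and treating equal epochs as "log store read not started" is also an assumption.

```rust
/// Hypothetical result of comparing the two epochs.
#[derive(Debug, PartialEq, Eq)]
enum PinDecision {
    /// Log store read has started and has finished reading up to this epoch.
    LogStoreReadUpTo(u64),
    /// Log store read has not started; it will start from this epoch.
    LogStoreStartsAt(u64),
}

/// Decide which epoch to pin (hypothetical helper, not the PR's API).
fn pin_decision(max_collected_epoch: u64, backfill_epoch: u64) -> PinDecision {
    if max_collected_epoch > backfill_epoch {
        PinDecision::LogStoreReadUpTo(max_collected_epoch)
    } else {
        // Assumption: equal epochs are treated as "not started yet".
        PinDecision::LogStoreStartsAt(backfill_epoch)
    }
}

/// The epoch that must stay pinned for the decision.
fn pinned_epoch(decision: &PinDecision) -> u64 {
    match decision {
        PinDecision::LogStoreReadUpTo(epoch) | PinDecision::LogStoreStartsAt(epoch) => *epoch,
    }
}
```

Under these assumptions the pinned epoch is simply the larger of the two values; the enum only records which case applied.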