feat: add support for dml_rate_limit #19679
Conversation
Can we add a test for DML rate limit? Something like:

- Set a DML rate limit on the table + parallelism=1.
- Trigger a DML statement in the background:
  system ok
  risedev psql -c 'insert into ... values ...' &
- Sleep 10s.
- Check that the number of rows in the table is ~10.

Run the test only in a non-madsim context.
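A rough sketch of what those steps could look like as an slt test file (the table schema, throttle syntax, and connection flags here are assumptions, not the actual test that landed):

```
# Sketch only: assumes a table `t`, the RisingWave sqllogictest
# `system ok` / `sleep` directives, and a per-second row limit.
statement ok
create table t (v int);

# Hypothetical throttle syntax; the real DDL may differ.
statement ok
alter table t set dml_rate_limit to 1;

# Fire a large insert in the background so it keeps running.
system ok
risedev psql -c 'insert into ... values ...' &

sleep 10s

# With a limit of ~1 row/s, roughly 10 rows should have landed;
# the real test would bound-check the count rather than match it exactly.
```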
Rest LGTM!
let reader = apply_dml_rate_limit(
    handle.stream_reader().into_stream(),
    self.rate_limiter.clone(),
)
.boxed()
.map_err(StreamExecutorError::from);
Just note that the current approach puts the rate limiter on the input of the DML executor instead of the output. There are some differences between the two placements: limiting the input means some transaction End or Rollback messages might be delayed, and the DML executor will buffer the active (uncommitted) transaction (unless the transaction is found to be too large to buffer). Anyway, it is acceptable to me.
} else {
    false
};
if chunk_size == limit + 1 && ends_with_update {
What if the limit is 1 and the chunk is [-U, +U]?
The chunk of size 2 can successfully obtain 1 permit from the rate limiter.
Actually this is the code snippet from #16407. It's been moved to utils.rs for reusability.
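To make the edge case above concrete, here is a minimal sketch (not the actual RisingWave code from #16407; names and types are illustrative) of splitting a stream of row ops into rate-limited chunks while keeping a -U/+U update pair in the same chunk, which is why a chunk of `limit + 1` rows is allowed when it ends with an update pair:

```rust
// Row operations, mirroring stream-chunk ops: -U must stay with its +U.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Insert,
    Delete,
    UpdateDelete, // -U
    UpdateInsert, // +U
}

/// Split `ops` into chunks of at most `limit` rows, except that a chunk
/// may carry one extra row when that avoids separating a -U/+U pair.
fn split_rate_limited(ops: &[Op], limit: usize) -> Vec<Vec<Op>> {
    let mut chunks = Vec::new();
    let mut current: Vec<Op> = Vec::new();
    for &op in ops {
        current.push(op);
        if current.len() >= limit {
            // A -U at the boundary waits for its matching +U, so the
            // emitted chunk may have `limit + 1` rows.
            if current.len() == limit && *current.last().unwrap() == Op::UpdateDelete {
                continue;
            }
            chunks.push(std::mem::take(&mut current));
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

fn main() {
    // The reviewer's case: limit = 1, chunk [-U, +U].
    // The pair is emitted as one chunk of size 2 on a single permit.
    let chunks = split_rate_limited(&[Op::UpdateDelete, Op::UpdateInsert], 1);
    assert_eq!(chunks.len(), 1);
    assert_eq!(chunks[0].len(), 2);
}
```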
LGTM for now. But IMO the DML rate limit should be able to break atomicity; in some cases, the user may insert a large number of records in a single insert statement. After all, persistence is only guaranteed by flush.
(cherry picked from commit f0a91df)
Agree.
The atomicity will be broken after processing MAX_CHUNK_FOR_ATOMICITY chunks. Since MAX_CHUNK_FOR_ATOMICITY is hard-coded to 32, I think it's good for now.
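The bounded-atomicity behavior discussed above can be sketched as follows (illustrative only; `TxnBuffer` and the chunk type are stand-ins, not RisingWave's actual implementation):

```rust
// Buffer an active transaction's chunks for atomicity, but once more
// than MAX_CHUNK_FOR_ATOMICITY chunks accumulate, flush early and give
// up atomicity for that transaction.
const MAX_CHUNK_FOR_ATOMICITY: usize = 32;

struct TxnBuffer {
    chunks: Vec<Vec<i64>>, // stand-in for buffered StreamChunks
}

impl TxnBuffer {
    fn new() -> Self {
        Self { chunks: Vec::new() }
    }

    /// Returns chunks to emit immediately: empty while buffering,
    /// non-empty once the buffer overflows and atomicity is broken.
    fn push(&mut self, chunk: Vec<i64>) -> Vec<Vec<i64>> {
        self.chunks.push(chunk);
        if self.chunks.len() > MAX_CHUNK_FOR_ATOMICITY {
            std::mem::take(&mut self.chunks) // flush early
        } else {
            Vec::new()
        }
    }

    /// On transaction End: emit everything still buffered.
    fn commit(&mut self) -> Vec<Vec<i64>> {
        std::mem::take(&mut self.chunks)
    }
}

fn main() {
    let mut buf = TxnBuffer::new();
    // The first 32 chunks stay buffered (atomicity preserved).
    for i in 0..32i64 {
        assert!(buf.push(vec![i]).is_empty());
    }
    // The 33rd chunk triggers an early flush of all 33 chunks.
    let flushed = buf.push(vec![32]);
    assert_eq!(flushed.len(), 33);
    assert!(buf.commit().is_empty());
}
```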
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
close #13801
The implementation is similar to SourceExecutor's rate limit. The key difference is that, when altering the DML rate limit (i.e. on Mutation::Throttle), this PR updates the data stream's rate limiter in place, rather than recreating it as done in SourceExecutor.

- Recreating works for SourceExecutor because it can still read those data from the persistent store after recreating.
- It does not work for DmlExecutor because those data chunks have already been removed from the DML data channel.

Checklist

- ./risedev check (or alias, ./risedev c)

Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.
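The in-place update described in the PR description can be sketched like this (names are illustrative assumptions, not RisingWave's actual types): the executor and the stream reader share one limiter handle, so a throttle mutation changes the limit without tearing down the stream or losing chunks already in the channel.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

/// Illustrative only: a rate limit whose value can be changed in place,
/// so holders of an Arc<SharedRateLimit> observe the new limit without
/// the data stream being recreated.
struct SharedRateLimit {
    rows_per_sec: AtomicU32, // u32::MAX = unthrottled
}

impl SharedRateLimit {
    fn new(limit: u32) -> Arc<Self> {
        Arc::new(SharedRateLimit {
            rows_per_sec: AtomicU32::new(limit),
        })
    }

    /// Called when a throttle mutation arrives: update in place.
    fn update(&self, limit: u32) {
        self.rows_per_sec.store(limit, Ordering::Relaxed);
    }

    fn current(&self) -> u32 {
        self.rows_per_sec.load(Ordering::Relaxed)
    }
}

fn main() {
    let limiter = SharedRateLimit::new(1000);
    // e.g. the clone held by the DML stream reader.
    let reader_side = Arc::clone(&limiter);
    // ALTER ... throttle: the reader sees the new limit immediately,
    // with no stream recreation and no dropped in-channel chunks.
    limiter.update(10);
    assert_eq!(reader_side.current(), 10);
}
```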