batch size
morningman committed Dec 5, 2024
1 parent d12d3f9 commit 11438bc
Showing 1 changed file with 8 additions and 4 deletions.
12 changes: 8 additions & 4 deletions be/src/vec/exec/scan/scanner_scheduler.cpp
@@ -268,8 +268,9 @@ void ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
}

size_t raw_bytes_threshold = config::doris_scanner_row_bytes;
- size_t raw_bytes_read = 0; bool first_read = true;
- bool has_limit = scanner->limit() > 0;
+ size_t raw_bytes_read = 0;
+ bool first_read = true;
+ int64_t limit = scanner->limit();
while (!eos && raw_bytes_read < raw_bytes_threshold) {
if (UNLIKELY(ctx->done())) {
eos = true;
@@ -317,12 +318,15 @@ void ScannerScheduler::_scanner_scan(std::shared_ptr<ScannerContext> ctx,
ctx->inc_block_usage(free_block->allocated_bytes());
scan_task->cached_blocks.emplace_back(std::move(free_block), free_block_bytes);
}
- if (has_limit) {
- // If this scanner has limit, return immediately and no need to wait raw_bytes_threshold.
+ if (limit < ctx->batch_size()) {
+ // If this scanner has limit, and less than batch size,
+ // return immediately and no need to wait raw_bytes_threshold.
// This can save time that each scanner may only return a small number of rows,
// but rows are enough from all scanners.
// If not break, the query like "select * from tbl where id=1 limit 10"
// may scan a lot data when the "id=1"'s filter ratio is high.
+ // If limit is larger than batch size, this rule is skipped,
+ // to avoid user specify a large limit and causing too much small blocks.
break;
}
} // end for while
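
A note on the new condition: the removed `has_limit` check treated only `limit > 0` as a real limit, which suggests that a non-positive value means "no limit". If that holds, `limit < ctx->batch_size()` is also true for unlimited scanners, so they would break out after the first block as well. The sketch below isolates the predicate to make the three cases visible. The helper names, the `int64_t` type used for the batch size, and the value 4096 are assumptions for illustration only; none of them come from the Doris code base.

```cpp
#include <cstdint>

// Hypothetical stand-alone helpers; these names do not exist in Doris.

// The condition as written in this commit.
constexpr bool breaks_early_as_committed(int64_t limit, int64_t batch_size) {
    return limit < batch_size;
}

// Variant that keeps the removed `limit > 0` guard.
// This is an assumption about the intent, not part of the commit.
constexpr bool breaks_early_guarded(int64_t limit, int64_t batch_size) {
    return limit > 0 && limit < batch_size;
}

// 4096 is an illustrative batch size, not a value taken from the source.
static_assert(breaks_early_as_committed(10, 4096),
              "small limit: return after the first block");
static_assert(!breaks_early_as_committed(100000, 4096),
              "large limit: keep reading up to the byte threshold");
static_assert(breaks_early_as_committed(-1, 4096),
              "non-positive limit: the committed check still breaks");
static_assert(!breaks_early_guarded(-1, 4096),
              "non-positive limit: the guarded check keeps reading");
static_assert(breaks_early_guarded(10, 4096),
              "small limit: the guarded check still returns early");
```

If scanners without a limit are meant to keep filling blocks until `raw_bytes_threshold` is reached, the guarded form is one possible way to express that.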