
[Enhancement] split chunk of HashTable #51175

Open

wants to merge 6 commits into main from murphy_opt_split_build_chunk

Conversation

@murphyatwork (Contributor) commented Sep 19, 2024

Why I'm doing:

@          0x2f9d4b5  malloc
@          0x8f9c745  operator new()
@          0x2ddb2ee  std::vector<>::_M_range_insert<>()
@          0x2dde914  starrocks::BinaryColumnBase<>::append()
@          0x365fa4e  starrocks::NullableColumn::append()
@          0x37458f9  starrocks::JoinHashTable::append_chunk()
@          0x3c86e80  starrocks::HashJoinBuilder::append_chunk()
@          0x3c8100c  starrocks::HashJoiner::append_chunk_to_ht()
@          0x3ab6649  starrocks::pipeline::HashJoinBuildOperator::push_chunk()
@          0x3a6769c  starrocks::pipeline::PipelineDriver::process()
@          0x3a58b9e  starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@          0x305ebac  starrocks::ThreadPool::dispatch_thread()
@          0x305882a  starrocks::Thread::supervise_thread()

JoinHashTable::build_chunk is a single Chunk that holds all of the build-side data, which means it can become very large in particular cases. As a result, it can easily run into memory allocation failures when jemalloc or the OS cannot provide a large enough contiguous block of memory, as the stack trace above shows.

Such cases typically involve:

  • a string column on the build side
  • an array column on the build side

What I'm doing:

Split that chunk into multiple smaller segments (usually 131072 rows each) to avoid this issue:

  • Introduce a SegmentedChunk and SegmentedColumn to replace the original Chunk and Column
  • They are not transparent replacements, but they implement most of the required interfaces, so minimal code changes are needed
  • To deal with addressing (mapping a global offset to a segment offset), we translate the index just-in-time, i.e. segment = offset / segment_size and offset_in_segment = offset % segment_size, rather than maintaining a separate index for it. This is efficient enough with a static segment_size.
  • We use a static segment_size rather than a dynamic one, which is both easier to implement and more efficient
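As a rough illustration of the just-in-time index translation described above, here is a standalone sketch. Note this is not the actual SegmentedChunk/SegmentedColumn code; `SegmentedVector` and `kSegmentSize` are made-up names for illustration only:

```cpp
#include <cstddef>
#include <vector>

constexpr size_t kSegmentSize = 131072; // static segment size

template <typename T>
class SegmentedVector {
public:
    void append(const T& value) {
        // Open a new segment when the current one is full (or none exists).
        if (_segments.empty() || _segments.back().size() == kSegmentSize) {
            _segments.emplace_back();
            _segments.back().reserve(kSegmentSize);
        }
        _segments.back().push_back(value);
        ++_size;
    }

    // Translate the global offset to (segment, offset-in-segment) on the
    // fly, instead of maintaining a lookup index. With a compile-time
    // power-of-two segment size, the division and modulo compile down to
    // a shift and a mask.
    const T& at(size_t offset) const {
        return _segments[offset / kSegmentSize][offset % kSegmentSize];
    }

    size_t size() const { return _size; }

private:
    std::vector<std::vector<T>> _segments;
    size_t _size = 0;
};
```

The key property is that each segment is a bounded contiguous allocation, so no single allocation ever needs to cover the whole build side.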

Potential downside and considerations of this approach:

  • When generating output for JoinHashMap, we need to randomly copy data from the build_chunk according to build_index. With SegmentedChunk, the memory is no longer contiguous, so we must look up the segment first and then the record within it. To reduce this overhead, we use SegmentedChunkVisitor wherever possible to eliminate virtual function calls
  • The key_column of JoinHashMap can no longer share the columns of build_chunk, since their memory layouts differ: key_column uses a contiguous column, while build_chunk is segmented. This introduces some extra memory usage and memory-copy overhead.
    • Why not make the key_column segmented as well? The overhead would be relatively larger for the probe procedure, and it would require changing a lot of code, which is beyond the scope of this PR. So we chose the easier path
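The virtual-call elimination can be sketched as follows. This is illustrative only; `Column`, `Int64Column`, and the two sum functions below are hypothetical stand-ins, not the real SegmentedChunkVisitor API:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Column {
    virtual ~Column() = default;
    virtual int64_t get_int(size_t row) const = 0;
};

struct Int64Column : Column {
    std::vector<int64_t> data;
    int64_t get_int(size_t row) const override { return data[row]; }
};

// Generic path: one indirect (virtual) call per gathered row.
int64_t sum_virtual(const Column& col, const std::vector<size_t>& build_index) {
    int64_t sum = 0;
    for (size_t row : build_index) sum += col.get_int(row);
    return sum;
}

// Visitor-style path: resolve the concrete column type once per segment,
// then run a tight non-virtual loop the compiler can inline and vectorize.
int64_t sum_typed(const Int64Column& col, const std::vector<size_t>& build_index) {
    int64_t sum = 0;
    for (size_t row : build_index) sum += col.data[row];
    return sum;
}
```

Both paths produce the same result; the typed path just moves the type dispatch from per-row to per-segment, which matters when build_index is large.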

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

Signed-off-by: Murphy <[email protected]>
const std::vector<ChunkPtr>& SegmentedChunk::segments() const {
return _segments;
}

} // namespace starrocks
The most risky bug in this code is:
Dereferencing a null pointer when appending chunks, as _segments might contain null values.

You can modify the code like this:

void SegmentedChunk::append_chunk(const ChunkPtr& chunk, const std::vector<SlotId>& slots) {
    // Guard against an empty segment list or a null first segment before
    // dereferencing. NOTE: a real implementation must create the segment
    // with the same column layout as the other segments; a
    // default-constructed Chunk has no columns to append into.
    if (_segments.empty() || !_segments[0]) {
        _segments.assign(1, std::make_shared<Chunk>());
    }

    ChunkPtr open_segment = _segments.back();
    size_t append_rows = chunk->num_rows();
    size_t append_index = 0;
    while (append_rows > 0) {
        // Fill the open segment up to _segment_size rows.
        size_t open_segment_append_rows = std::min(_segment_size - open_segment->num_rows(), append_rows);
        for (size_t i = 0; i < slots.size(); i++) {
            ColumnPtr column = chunk->get_column_by_slot_id(slots[i]);
            open_segment->columns()[i]->append(*column, append_index, open_segment_append_rows);
        }
        append_index += open_segment_append_rows;
        append_rows -= open_segment_append_rows;
        if (open_segment->num_rows() == _segment_size) {
            // Start a fresh segment with the same column layout, and make
            // sure open_segment points to the new segment.
            _segments.emplace_back(open_segment->clone_empty());
            open_segment = _segments.back();
        }
    }
}

Signed-off-by: Murphy <[email protected]>
Signed-off-by: Murphy <[email protected]>
Signed-off-by: Murphy <[email protected]>
@murphyatwork force-pushed the murphy_opt_split_build_chunk branch 3 times, most recently from 8afc5c2 to dcd84b6 on September 21, 2024 at 02:40
Signed-off-by: Murphy <[email protected]>
Signed-off-by: Murphy <[email protected]>

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)


[FE Incremental Coverage Report]

pass : 0 / 0 (0%)


[BE Incremental Coverage Report]

fail : 111 / 223 (49.78%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/exec/pipeline/hashjoin/spillable_hash_join_build_operator.cpp 0 1 00.00% [262]
🔵 be/src/exec/join_hash_map.h 0 1 00.00% [845]
🔵 be/src/column/column_helper.cpp 1 18 05.56% [488, 489, 491, 492, 493, 495, 496, 497, 500, 501, 502, 503, 506, 507, 508, 510, 511]
🔵 be/src/exec/join_hash_map.tpp 2 8 25.00% [689, 692, 708, 709, 714, 716]
🔵 be/src/storage/chunk_helper.cpp 96 179 53.63% [662, 663, 664, 665, 666, 667, 668, 669, 671, 672, 693, 698, 699, 700, 701, 702, 740, 741, 742, 747, 748, 749, 750, 752, 755, 759, 760, 761, 763, 790, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 811, 813, 814, 815, 816, 817, 819, 820, 821, 822, 823, 824, 825, 826, 829, 838, 839, 840, 841, 843, 846, 847, 848, 849, 851, 854, 855, 869, 870, 871, 873, 885, 886, 889, 890, 896, 897, 898, 900]
🔵 be/src/exec/join_hash_map.cpp 9 13 69.23% [644, 645, 650, 653]
🔵 be/src/exec/spill/mem_table.cpp 1 1 100.00% []
🔵 be/src/storage/chunk_helper.h 2 2 100.00% []
