fix: Fix sync segments failed due to OOM check in segment loader #41985

Closed

Conversation

weiliu1031
Contributor

issue: #41984
Fix unnecessary segment-load failures caused by incorrect memory and disk usage checks. Changes include:

  • Run the OOM check only when segment loading actually incurs memory or disk cost
  • Skip the check when setting segments on the delegator consumes no resources
  • Rename variables for readability (maxSegmentSize -> maxSegmentMemSize)
  • Extend the test cases to cover more memory and disk usage settings

@sre-ci-robot sre-ci-robot added the size/M Denotes a PR that changes 30-99 lines. label May 21, 2025
@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: weiliu1031
To complete the pull request process, please assign jiaoew1991 after the PR has been reviewed.
You can assign the PR to them by writing /assign @jiaoew1991 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot requested review from bigsheeper and yah01 May 21, 2025 08:56
@mergify mergify bot added dco-passed DCO check passed. kind/bug Issues or changes related a bug labels May 21, 2025
Contributor

mergify bot commented May 21, 2025

@weiliu1031 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

codecov bot commented May 21, 2025

Codecov Report

Attention: Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.45%. Comparing base (252d49d) to head (83920e2).
Report is 43 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/querynodev2/segments/segment_loader.go | 90.90% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #41985      +/-   ##
==========================================
- Coverage   81.67%   80.45%   -1.22%     
==========================================
  Files        1202     1537     +335     
  Lines      185973   216703   +30730     
==========================================
+ Hits       151886   174358   +22472     
- Misses      27803    36053    +8250     
- Partials     6284     6292       +8     

| Components | Coverage Δ | |
|---|---|---|
| Client | 79.36% <ø> (ø) | |
| Core | 73.10% <ø> (∅) | |
| Go | 81.90% <90.90%> (+<0.01%) | ⬆️ |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| internal/querynodev2/segments/segment_loader.go | 69.51% <90.90%> (-0.09%) | ⬇️ |

... and 370 files with indirect coverage changes

@weiliu1031 weiliu1031 force-pushed the fix_sync_failed_due_to_oom branch from f9a3a6f to 9a61f66 Compare May 21, 2025 14:21
Contributor

mergify bot commented May 21, 2025

@weiliu1031 E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@weiliu1031 weiliu1031 force-pushed the fix_sync_failed_due_to_oom branch from 9a61f66 to 5540dc5 Compare May 22, 2025 02:18
Contributor

mergify bot commented May 22, 2025

@weiliu1031 cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Fix the issue where segment loading fails unnecessarily due to incorrect memory and disk usage checks.
Changes include:
- Only perform OOM check when there's actual memory/disk cost during segment loading
- Skip checks when setting segments to delegator doesn't consume resources
- Rename variables for better readability (maxSegmentSize -> maxSegmentMemSize)
- Enhance test cases to cover more scenarios with memory and disk usage settings

Signed-off-by: Wei Liu <[email protected]>
@weiliu1031 weiliu1031 force-pushed the fix_sync_failed_due_to_oom branch from 5540dc5 to 83920e2 Compare May 23, 2025 06:23
@mergify mergify bot added the ci-passed label May 23, 2025
@weiliu1031
Contributor Author

Syncing a segment may still consume some memory for delta logs, so this PR won't fix the issue.

@weiliu1031 weiliu1031 closed this May 29, 2025
Labels
ci-passed dco-passed DCO check passed. kind/bug Issues or changes related a bug size/M Denotes a PR that changes 30-99 lines.
2 participants