Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove optimization on blk free operation #645

Merged
merged 1 commit into from
Feb 14, 2025

Conversation

Besroy
Copy link
Contributor

@Besroy Besroy commented Feb 11, 2025

The optimization in blk free may cause the following issue, just remove it and wait for GC handling:

  • T1: blob1 is written with LSN 1 — [blkid=10, chunk=1, cnt=5].
  • T2: blob1 is deleted with LSN 10, causing the last_append_offset to revert to 10.
  • T3: blob2 is written with LSN 11 — [blkid=10, chunk=1, cnt=5].
  • T4: The SM is terminated and restarted.
  • T5: LSN 1 is replayed, committing block [blkid=10, chunk=1, cnt=5].
  • T6: LSN 11 is replayed, committing block [blkid=10, chunk=1, cnt=5].
  • T7: LSN 10 is committed, freeing block [blkid=10, chunk=1, cnt=5].
  • T8: LSN 11 is committed again, but since the blocks have already been freed, they are not available for LSN 11.

@codecov-commenter
Copy link

codecov-commenter commented Feb 11, 2025

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 66.10%. Comparing base (1a0cef8) to head (ab44065).
Report is 135 commits behind head on master.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #645      +/-   ##
==========================================
+ Coverage   56.51%   66.10%   +9.58%     
==========================================
  Files         108      109       +1     
  Lines       10300    10989     +689     
  Branches     1402     1505     +103     
==========================================
+ Hits         5821     7264    +1443     
+ Misses       3894     3012     -882     
- Partials      585      713     +128     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

JacksonYao287
JacksonYao287 previously approved these changes Feb 12, 2025
Copy link
Contributor

@JacksonYao287 JacksonYao287 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let gc reclaim the space.

@xiaoxichen xiaoxichen requested a review from yamingk February 12, 2025 09:49
@xiaoxichen
Copy link
Collaborator

LGTM, @yamingk pls take a look.

It is very unlikely we delete/free on the tail so I am fine to leave it to GC.

@yamingk
Copy link
Contributor

yamingk commented Feb 12, 2025

@xiaoxichen , can we pull out @hkadayam 's PR (maybe you guys have already done it?) to make sure the original purpose of this change is not broken and if is broken, it is expected with GC?
I remember this change in free is critical to avoid corruption during crash restart.

@Besroy
Copy link
Contributor Author

Besroy commented Feb 13, 2025

@xiaoxichen , can we pull out @hkadayam 's PR (maybe you guys have already done it?) to make sure the original purpose of this change is not broken and if is broken, it is expected with GC? I remember this change in free is critical to avoid corruption during crash restart.

@yamingk I think the purpose not broken. PR #403 addresses the crash recovery process by introducing reserve_on_cache and reserve_on_disk. The former only happens during recovery, while the latter introduces the concept of m_commit_offset. The intent to rewind m_last_append_offset for the last block is just an optimiaztion for special case (free at the tail), which has been in place for some time. It seems that PR #403 just optimizes that part of the code . FYI please see this comment.

The optimization in blk free may cause the following issue, just remove it and wait for GC handling:
- T1: blob1 is written with LSN 1 — [blkid=10, chunk=1, cnt=5].
- T2: blob1 is deleted with LSN 10, causing the last_append_offset to revert to 10.
- T3: blob2 is written with LSN 11 — [blkid=10, chunk=1, cnt=5].
- T4: The SM is terminated and restarted.
- T5: LSN 1 is replayed, committing block [blkid=10, chunk=1, cnt=5].
- T6: LSN 11 is replayed, committing block [blkid=10, chunk=1, cnt=5].
- T7: LSN 10 is committed, freeing block [blkid=10, chunk=1, cnt=5].
- T8: LSN 11 is committed again, but since the blocks have already been freed, they are not available for LSN 11.
@Besroy Besroy merged commit fb28db4 into eBay:master Feb 14, 2025
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants