feat: gc for zombie piece & metaTask & bucket migration #1190
modular/executor/execute_task.go
}

func (e *ExecuteModular) HandleGCBucketMigrationBucket(ctx context.Context, task coretask.GCBucketMigrationTask) {
	// TODO gc progress persist in db
TODO.
Added a TODO here; it will be resolved in the next PR.
If the BucketMigrationBucket fails or crashes, the data GC tasks cannot proceed. This will be addressed later by reusing the state in the bucketMigrateTable to drive the process and record the GC status.
LGTM
	}
}()

if includePrivate {
Maybe we don't have to distinguish private from public objects here, since the SP needs to GC all zombie pieces.
Yes, I will remove the handling logic related to includePrivate=false.
LGTM
Description
GCZombiePiece
GCZombiePieceTask is an abstract interface that records the information needed to reclaim piece store space by deleting zombie piece data, that is, pieces that were written to the piece store but whose metadata is not on chain because of some exception.
GCMeta
GCMetaTask is an abstract interface that records the information needed to reclaim SP meta store space by deleting expired data.
GC for Bucket Migration
When bucket migration is completed or has failed, we need to delete redundant data to free up space:
Implementation of the GCZombie
- `GCConfig::EnableGCZombie`: Enables or disables the GCZombie feature.
- `GCConfig::GCZombiePieceTimeInterval`: Time interval for generating GCZombie tasks (default: `DefaultGlobalBatchGcZombiePieceTimeInterval` = 10*60 seconds).
- `ParallelConfig::GlobalGCZombieParallel`: Maximum allowed parallel GCZombie tasks (default: 1).
- `GCZombieSafeObjectIDDistance`: A reserve of object IDs during GCZombie deletion. If the scanned object ID range plus `DefaultGlobalGcZombieSafeObjectIDDistance` exceeds the current maximum system ID, the scan resumes from 0.
- `GCZombiePieceObjectIDInterval`: Interval between generated object IDs for each GCZombie task. For example, task 1 might handle IDs 0-100, task 2 101-200, and so on (default: `DefaultGlobalGcZombiePieceObjectIDInterval` = 100).

Core Processes
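The object ID windowing described by `GCZombiePieceObjectIDInterval` and `GCZombieSafeObjectIDDistance` can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `zombieWindow` and `nextZombieWindow` are hypothetical names, and the choice to restart at the window 0..interval after wrapping is an assumption based on the "0-100, 101-200" example.

```go
package main

// zombieWindow is a hypothetical stand-in for the (StartObjectId, EndObjectId)
// pair carried by a GC zombie piece task.
type zombieWindow struct {
	Start uint64
	End   uint64
}

// nextZombieWindow returns the object ID window that follows a window ending
// at prevEnd. interval plays the role of GCZombiePieceObjectIDInterval,
// safeDistance the role of GCZombieSafeObjectIDDistance, and maxObjectID is
// the current maximum object ID known to the system (all assumed inputs).
func nextZombieWindow(prevEnd, interval, safeDistance, maxObjectID uint64) zombieWindow {
	start := prevEnd + 1
	end := prevEnd + interval
	// Keep a reserve of safeDistance IDs below the newest object. Once the
	// next window would enter that reserve, resume scanning from 0.
	if end+safeDistance > maxObjectID {
		return zombieWindow{Start: 0, End: interval}
	}
	return zombieWindow{Start: start, End: end}
}
```

With an interval of 100, a window ending at 100 is followed by 101-200, which matches the example in the configuration list. When the next window plus the safe distance would pass the maximum ID, the sketch wraps back to 0.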
- `GCZombiePieceTask`: Triggered by `gcZombiePieceTicker`, resulting in the creation of GCZombiePieceTask.
- `GCZombiePieceTask`: `ExecuteModular::HandleGCZombiePieceTask`.
  - `gcZombiePieceFromIntegrityMeta`: Determines whether a piece is a ZombiePiece based on the `IntegrityMeta` table. Scans all `IntegrityMeta` within the current object ID range specified in GCZombiePieceTask (`StartObjectId`, `EndObjectId`).
  - `gcZombiePieceFromPieceHash`: Determines whether a piece is a ZombiePiece based on the `PieceHash` table. Scans all `PieceHash` within the current object ID range specified in GCZombiePieceTask (`StartObjectId`, `EndObjectId`).
- `GCZombiePieceTask`: `ManageModular::HandleGCZombiePieceTask`.
- `gcZombieQueue`.

Implementation of the Meta GC
Meta GC handles garbage collection of the different kinds of SP metadata. It currently deletes expired data periodically from two tables: `bucketTraffic` and `readRecord`.

New Configurations:
- `bucketTrafficKeepLatestDay`: Configured in `ExecutorConfig::BucketTrafficKeepTimeDay`, defaults to `DefaultExecutorBucketTrafficKeepTimeDay` (retaining the latest 180 days).
- `readRecordKeepLatestDay`: Configured in `ExecutorConfig::ReadRecordKeepTimeDay`, defaults to `DefaultExecutorReadRecordKeepTimeDay` (retaining the latest 30 days).
- `DefaultGlobalGCMetaTimeInterval`: Time interval for generating GC Meta tasks, set to 10 * 60 seconds (10 minutes).
- `ReadRecordDeleteLimit`: Maximum number of records deleted from the read record table in each GC meta task (default: 100).
- `GCConfig::EnableGCMeta`: Controls whether GCMeta is enabled.
- `SQLDBConfig::EnableTracePutEvent`: Controls whether Trace Put Event is enabled.

Core Process:
![image](https://private-user-images.githubusercontent.com/11239387/281995133-be253b56-e3b6-4584-8a8d-ad98f7192e41.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkwNDA2NjIsIm5iZiI6MTczOTA0MDM2MiwicGF0aCI6Ii8xMTIzOTM4Ny8yODE5OTUxMzMtYmUyNTNiNTYtZTNiNi00NTg0LThhOGQtYWQ5OGY3MTkyZTQxLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMDglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjA4VDE4NDYwMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTA0MWQ2YmUwZDAxOWJhMTgyZjRmYjE0NGYzNzAwMWM4YTlhZjQ1Y2ExOTc3ZGRkMjZiMmJiYzI2YjAzNzI4M2QmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.rgE0s1AOwFROkxR4LbVj44KYx68459mJFbfht2mRjpI)
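To make the retention settings concrete, here is a minimal sketch of how a "keep the latest N days" cutoff and a per-task delete limit might combine. It is an illustration under stated assumptions, not the PR's implementation: `expiryCutoff` and `expiredBatch` are hypothetical helpers, timestamps are assumed to be Unix seconds, and the real deletion happens in the SP database layer rather than in memory.

```go
package main

// secondsPerDay converts a retention period in days to seconds.
const secondsPerDay int64 = 24 * 60 * 60

// expiryCutoff returns the Unix timestamp before which a row counts as
// expired when only the latest keepDays days are retained.
func expiryCutoff(nowUnix, keepDays int64) int64 {
	return nowUnix - keepDays*secondsPerDay
}

// expiredBatch returns at most limit timestamps that are older than cutoff,
// in input order. limit mirrors the idea behind ReadRecordDeleteLimit: one
// GC meta task removes a bounded number of rows.
func expiredBatch(createdAt []int64, cutoff int64, limit int) []int64 {
	var expired []int64
	for _, ts := range createdAt {
		if len(expired) >= limit {
			break
		}
		if ts < cutoff {
			expired = append(expired, ts)
		}
	}
	return expired
}
```

Bounding each batch keeps a single task short; rows that remain expired are picked up by the next task the ticker generates.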
- Triggered by `gcMetaTicker`, resulting in the creation of `GCMetaTask`.
- Handled in `ExecuteModular::HandleGCMetaTask`:
  - `gcMetaBucketTraffic`: Deletes entries from `BucketTraffic` using `SpDBImpl::DeleteAllBucketTrafficExpired` for expired `BucketTrafficTable`.
  - `gcMetaReadRecord`: Deletes entries from `ReadRecord` using `SpDBImpl::DeleteAllReadRecordExpired` for expired `ReadRecord` table.

Implementation of the BucketMigration GC
Purpose:
New Configurations:
- `gcBucketMigrationTimeout` (`MinGCBucketMigrationTime` 0.5-1 hour): Timeout duration for the task.
- `gcBucketMigrationRetry` (`MinGCBucketMigrationRetry` 3-5 times): Retry attempts for the task.
- `GfSpGCBucketMigrationTask`.
- `TypeTaskGCZombiePiece`. Adjusted priority:
  - `TypeTaskGCMeta` → `DefaultSmallerPriority / 4`
  - `TypeTaskGCBucketMigration` → `DefaultSmallerPriority / 4`
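One possible reading of the bounds quoted for the timeout and retry settings is that a configured value is clamped into the stated range (0.5-1 hour, 3-5 attempts). The sketch below shows that reading only. The constant and function names are hypothetical, and clamping itself is an assumption; this description does not say how the PR applies the bounds.

```go
package main

// Hypothetical bounds taken from the ranges quoted in the description.
// Timeouts are expressed in seconds.
const (
	minTimeoutSec int64 = 30 * 60 // 0.5 hour
	maxTimeoutSec int64 = 60 * 60 // 1 hour
	minRetry      int64 = 3
	maxRetry      int64 = 5
)

// clamp limits v to the inclusive range [lo, hi].
func clamp(v, lo, hi int64) int64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// effectiveTimeoutSec returns the timeout a GC bucket migration task would
// use under the clamping assumption.
func effectiveTimeoutSec(configured int64) int64 {
	return clamp(configured, minTimeoutSec, maxTimeoutSec)
}

// effectiveRetry returns the retry count under the same assumption.
func effectiveRetry(configured int64) int64 {
	return clamp(configured, minRetry, maxRetry)
}
```

Under this reading, an unset (zero) timeout falls back to half an hour and an oversized retry count is capped at five.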
Core Workflow:
- `GfSpClient::NotifyPostMigrateBucket` → `GfSpNotifyPostMigrate` → `ManageModular::NotifyPostMigrateBucket`.
- `ManageModular::GenerateGCBucketMigrationTask` generates a `GCBucketMigrationTask`.
- `ManageModular::gcBucketMigrationQueue`.
- `BucketMigrateScheduler::PostMigrateBucket` generates a `GCBucketMigrationTask` on the destination node if migration fails.
- `ExecuteModular::HandleGCBucketMigrationBucket`.
- `ListObjectsByGVGAndBucketForGC` interface.
- `gcBucketMigrationQueue`.

Changes
Notable changes: