
E35 prune small batches #9088

Merged: 83 commits from e35_prune_small into e35 on Jan 16, 2024

Conversation

@awskii (Member) commented Dec 27, 2023

Follow-up to #9031 (comment).

Ordering

AggregatorV3Context pruning happens in the following order:

  1. Index pruning starts from the lowest txNum such that txFrom <= txn <= txTo and progresses towards bigger txNums.
  2. History pruning therefore goes in the same direction and happens along with key pruning via a callback.
  3. Domain pruning starts from the Latest() key, which is the biggest key available. We use inverted steps (^step) as a suffix for domain keys, which lets us prune the smallest steps first: starting from the largest available key and the smallest available step, we go backwards towards bigger steps and smaller keys. If for a given key we meet savedStep > pruneStep, we can safely move on to the PrevNoDup() key without scanning and skipping steps (see the encoding sketch right after this list).
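
A minimal sketch of that inverted-step encoding, assuming the step is appended as the big-endian bytes of ^step; the function and key names here are illustrative, not erigon's actual API:

package main

import (
	"encoding/binary"
	"fmt"
)

// encodeInvertedStep appends ^step as a big-endian 8-byte suffix to a domain key.
// Smaller steps produce larger suffixes, so within one key they sort last and
// are the first ones met by a cursor iterating backwards (PrevNoDup-style).
func encodeInvertedStep(key []byte, step uint64) []byte {
	out := make([]byte, len(key)+8)
	copy(out, key)
	binary.BigEndian.PutUint64(out[len(key):], ^step)
	return out
}

// decodeStep recovers the original step from the inverted suffix.
func decodeStep(suffixed []byte) uint64 {
	return ^binary.BigEndian.Uint64(suffixed[len(suffixed)-8:])
}

func main() {
	key := []byte("account1")
	for _, step := range []uint64{0, 1, 5} {
		k := encodeInvertedStep(key, step)
		fmt.Printf("step=%d suffix=%x decoded=%d\n", step, k[len(key):], decodeStep(k))
	}
	// step=5 gets the lexicographically smallest suffix, so a backward scan
	// over one key visits step 0 first, then 1, then 5.
}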

Limiting

Pruning changes state and therefore affects execution: if pruning is broken, execution may read obsolete values.
Pruning of indices and histories is coupled, since the history table is bound to index keys and txn entries. Because the index is a mapping txNum -> {key, key', ...}, it is easier to limit their pruning by txNums at once than to walk the whole list selecting up to limit keys.
AggregatorV3Context.PruneSmallBatches() always sets txFrom=0, since its purpose is to keep the db clean, one step at a time (a driver sketch follows below).
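
A minimal sketch of such a driver loop, assuming erigon-lib's state.AggregatorV3Context and kv.RwTx types and the ac.Prune(ctx, tx, limit) call shown in the review snippet further below; the time budget and the limit of 100 are illustrative only:

package prune

import (
	"context"
	"time"

	"github.com/ledgerwatch/erigon-lib/kv"
	"github.com/ledgerwatch/erigon-lib/state"
)

// pruneSmallBatches is a hypothetical driver: txFrom is implicitly 0, each
// call removes at most a small batch of prunable entries, and the loop stops
// once the time budget is spent, so the db is kept clean one step at a time.
func pruneSmallBatches(ctx context.Context, ac *state.AggregatorV3Context, tx kv.RwTx, budget time.Duration) error {
	started := time.Now()
	for time.Since(started) < budget {
		if err := ac.Prune(ctx, tx, 100); err != nil {
			return err
		}
		// A real loop would also stop early once nothing prunable is left.
	}
	return nil
}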

Domain pruning is limited by the number of keys removed at once. For slow disks and a big db (>150G) domain pruning can be very slow: the database keeps growing, which slows pruning down further, to about 100,000 kv pairs per 10-minute session, which is not enough to keep the db at a constant size. Using smaller values for --batchSize can solve the problem, because Prune is called more frequently and each call puts only small changes into the db.

A domain can be pruned if the savedPruneProgress key for its table is not nil, or if the smallest domain key has values with savedStep < pruneStep in the domain files. The downside of looking at the smallest step is that the smallest key is not guaranteed to change in every step, which can give an invalid estimate of the smallest step available. Saved prune progress indicates that we did not finish the latest cleanup, but it does not give us a step number. Meta tables containing such info (smallest step in the table?) could be used.
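
A hedged sketch of that check; the names are hypothetical, and the real decision lives in the domain pruning code, which reads savedPruneProgress and the step of the smallest key from the db:

// canPruneDomain illustrates the rule above: a domain is prunable either when
// a saved prune-progress key exists for its table (a previous small-batch
// prune did not finish), or when the step observed at the smallest domain key
// is already covered by domain files (savedStep < pruneStep).
func canPruneDomain(savedPruneProgress []byte, savedStep, pruneStep uint64) bool {
	if savedPruneProgress != nil {
		return true // unfinished cleanup from a previous batch
	}
	return savedStep < pruneStep
}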

Takeaways to keep in mind

  • --batchSize should be smaller on slower disks (even as small as 16-64M) to keep the db small. A well-balanced batchSize can increase throughput while preserving db size.
  • Some internal functions rely on this ordering, e.g. determining the available steps in the db.
  • When --batchSize is reached, commitment is evaluated and its updates are put into the same batch, which grows to about 1.4x-2x the configured size.

defer cancel()

for {
	if err := ac.Prune(context.Background(), tx, 100); err != nil {

Collaborator commented:

Need to check that prune with limit is “consistent”: for example, Index.Prune doesn't exit before collector.Load (if the limit is reached).

@awskii (Member Author) replied:

Thanks for pointing it out.

@awskii (Member Author) commented Dec 28, 2023

This pulled up a few problems with context management and limits as well.

@awskii (Member Author) commented Dec 28, 2023

Since the limit is decreased after each hit of a prunable key, we sometimes end up with no prune progress over history at all (when the limit is a small number). For now I split the variables, so domains and history each prune up to limit keys (as sketched below).
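
A minimal sketch of that split; pruneDomainKey and pruneHistoryKey are hypothetical callbacks standing in for the real per-key prune steps, each returning whether a prunable entry was found:

// pruneWithSplitLimits gives domains and history their own budgets instead of
// one shared, mutated counter, so history still makes progress even when
// domains alone could exhaust the limit.
func pruneWithSplitLimits(limit int, pruneDomainKey, pruneHistoryKey func() bool) {
	domainsLimit, historyLimit := limit, limit
	for domainsLimit > 0 && pruneDomainKey() {
		domainsLimit--
	}
	for historyLimit > 0 && pruneHistoryKey() {
		historyLimit--
	}
}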

I have not decided yet whether prune continuation still makes sense or whether it is better to remove it right now; I need to make a few more comparisons.

This PR also raises the question: do we want to prune domain index tables as well (which looks meaningful), or only prune them during the Unwind process?

@AskAlexSharov (Collaborator) commented:

  1. limit - I think we just need to change the meaning of this variable: avoid mutating it and pass the same limit to all domains/histories/indices.
  2. "do we want to prune domain index tables" - I don't understand this question; we want to prune everything, I guess.

break

txNum := binary.BigEndian.Uint64(txnm)
if txNum < stat.MinTxNum {

Collaborator commented:

Don't understand: does this code lose the [from, to) semantics or not?

@awskii changed the title from "E35 [wip] prune small" to "E35 prune small batches" on Jan 16, 2024
@awskii merged commit 40f8b12 into e35 on Jan 16, 2024
7 checks passed
@awskii deleted the e35_prune_small branch on January 16, 2024 at 19:09