Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Garbage-collect batch DB #502

Merged

Conversation

matejpavlovic
Copy link
Contributor

The Batch DB also needs to be garbage-collected.
The current implementation stores all batches ever delivered in memory and thus runs out of memory rather quickly under high load.

Copy link
Contributor

@ranchalp ranchalp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

state.batchStore[batchID] = &b
state.batchesByRetIdx[retIdx] = append(state.batchesByRetIdx[retIdx], batchID)
}

batchdbpbdsl.BatchStored(m, origin.Module, origin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems weird that if too old we do not store it but we say we have stored it. Have you considered whether this could break somewhere else? Probably not because if then it is not stored we just ask around for it. However, intuitively, this should never happen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, very good point, I was actually thinking about it the very same way and wanted to write an explanatory comment in the code, but I forgot do do that.
I agree that it's a bit counter-intuitive. The thing is that if the BatchDB's retention index exceeded the retention index of the batch, it's equivalent to the batch having been stored and already garbage-collected. In fact, this might even really happen, when the batch is concurrently being stored by the availability module's dissemination and retrieval functions (the dissemination to this particular node only needs to be slightly late). Both of these functions need to believe to have stored the batch in order to properly continue (although it's likely that nobody cares about their output any more, as they are being garbage-collected too).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the explanatory comment, thanks for pointing it out.

@matejpavlovic matejpavlovic merged commit 201930e into consensus-shipyard:main Jul 26, 2023
4 checks passed
@matejpavlovic matejpavlovic deleted the fakebatchdb-gc-rebased branch July 31, 2023 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants