A local Overflow Pool to cache transactions during high traffic #30567

Open · wants to merge 2 commits into base: master

Conversation

emailtovamos

Description

  • During high traffic, transactions are lost because the current legacy pool (Main Pool) has a capacity limit, and propagating them under load also adds network congestion.
  • This PR introduces an Overflow Pool that buffers such transactions in memory and periodically transfers them to the Main Pool once it has room. A transaction is neither propagated nor announced while it sits in the Overflow Pool; when it is finally transferred, it is treated as a transaction that just arrived at the Main Pool and is subject to the usual checks and propagation logic.
  • The Overflow Pool's size is configurable, so each node can choose it according to its hardware capacity.

Details

a. Main Pool
The Main Pool is not a new structure; it is simply the current transaction pool (Legacy Pool). While it has room, newly received transactions are broadcast to all the relevant peers, exactly as today.

b. Overflow Pool: the local buffer (LRU or heap)
When the Main Pool overflows during high traffic, continuing to broadcast new transactions would put a lot of stress on the network. Instead, any new transaction goes into the Overflow Pool and is neither broadcast nor announced.
The Overflow Pool can be sized very large to absorb a traffic burst, e.g. 500K transactions. At an average transaction size of 500B, that is roughly 250MB of memory, which is acceptable. A minimal sketch of such a buffer follows.
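
For reference, here is a minimal sketch of such a buffer (a FIFO variant rather than an LRU or heap, and not the PR's actual code): the `Transaction` stand-in, the `OverflowPool` type, and the method names are all illustrative; the real codebase would use `*types.Transaction`.

```go
package overflowpool

import (
	"container/list"
	"sync"
)

// Transaction is a stand-in for the pool's transaction type; the real
// codebase would use *types.Transaction here.
type Transaction struct {
	Hash [32]byte
}

// OverflowPool is a bounded FIFO buffer of transactions that are held
// locally and never announced to peers.
type OverflowPool struct {
	mu    sync.Mutex
	limit int                        // max number of buffered transactions
	order *list.List                 // arrival order, front = oldest
	index map[[32]byte]*list.Element // hash -> entry, for deduplication
}

func New(limit int) *OverflowPool {
	return &OverflowPool{
		limit: limit,
		order: list.New(),
		index: make(map[[32]byte]*list.Element),
	}
}

// Add buffers tx if it is new and there is room, reporting whether it was kept.
func (p *OverflowPool) Add(tx *Transaction) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, ok := p.index[tx.Hash]; ok {
		return false // already buffered
	}
	if p.order.Len() >= p.limit {
		return false // overflow buffer itself is full: drop, as today
	}
	p.index[tx.Hash] = p.order.PushBack(tx)
	return true
}

// Pop removes and returns the oldest buffered transaction, or nil if empty.
func (p *OverflowPool) Pop() *Transaction {
	p.mu.Lock()
	defer p.mu.Unlock()
	front := p.order.Front()
	if front == nil {
		return nil
	}
	tx := p.order.Remove(front).(*Transaction)
	delete(p.index, tx.Hash)
	return tx
}
```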

How to flush transactions from the Overflow Pool to the Main Pool:

  • To preserve the first-come-first-served policy, when a new transaction arrives and the Main Pool is full, we push it into the Overflow Pool, provided the Overflow Pool itself is not full.
  • A transaction in the Overflow Pool is never announced or propagated; its existence is entirely local to the node.
  • We regularly try to pop transactions from the Overflow Pool into the Main Pool whenever the latter has space. The trigger can be time-based (e.g. once every 10 seconds) or block-based (e.g. on every new block, or every 2 blocks); see the sketch after this list.
  • When a transaction moves from the Overflow Pool to the Main Pool, it is treated exactly as if it had arrived at the Main Pool directly.
  • When a new block is imported, we can additionally try to flush the Overflow Pool into the Main Pool.
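
A minimal sketch of the time-based drain is below, reusing the `OverflowPool` sketch above. The `MainPool` interface and its `Full`/`Add` methods are illustrative stand-ins for the legacy pool's real API, not actual geth code.

```go
import "time"

// MainPool captures just what the flusher needs from the legacy pool;
// these method names are illustrative, not the real txpool API.
type MainPool interface {
	Full() bool
	Add(tx *Transaction) error
}

// flushLoop drains the Overflow Pool oldest-first whenever the Main Pool
// has room. The same body could instead run on new-block import events
// (the block-based variant described above).
func flushLoop(overflow *OverflowPool, main MainPool, quit <-chan struct{}) {
	ticker := time.NewTicker(10 * time.Second) // time-based trigger
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			for !main.Full() {
				tx := overflow.Pop()
				if tx == nil {
					break // buffer drained
				}
				// Entering the Main Pool re-runs the usual validation and
				// propagation, as if the tx had just arrived from the network.
				_ = main.Add(tx)
			}
		case <-quit:
			return
		}
	}
}
```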

@holiman
Contributor

holiman commented Oct 10, 2024

This doesn't make sense to me. If the pool is full, then arguably it's because we've reached the constraints that are set upon it: the amount of memory we were willing to dedicate to transactions is filled up.

You're saying that we should add a second standalone pool, 256MB of memory? Why not just add more memory to the first pool?

OR, maybe you're saying that during high traffic, our pool is not fast enough to keep up with the pace, and transactions are dropped which would otherwise be accepted? And that this would be solved by adding a secondary, faster pool? If this is the case, then I think you are wrong: a slow pool should act as backpressure. We have a number of goroutines that fetch transactions from peers, and if the pool is slow, then they simply have to wait longer to deliver their loads.

@karalabe
Member

I think the PR is actually a stab at your local pool that piles up local transactions and drips them into the live pool.

@emailtovamos If your goal is to accumulate local transactions above the pool limits, that's something @holiman has a PR about and we actively want to address. If your goal is to overflow network transactions, that doesn't make sense; they should just be handled by the pool.

@holiman
Contributor

holiman commented Oct 10, 2024

I think the PR is actually a stab at your local pool that piles up local transactions and drips them into the live pool.

I think not -- it doesn't differentiate between local and non-local transactions; even though addToOverflowPool takes the local/non-local distinction as a parameter, it's just discarded.

@emailtovamos
Author

You're saying that we should add a second standalone pool, 256MB of memory? Why not just add more memory to the first pool?

The 256MB of extra memory (or more) is opt-in: each node sets the Overflow Pool's limit itself, so nodes with less hardware capacity don't have to support it, while high-capacity nodes can afford to raise it. It is configurable.
The current GlobalSlots & GlobalQueue are configurable as well, but simply setting them higher also means having to support more networking, which isn't good during high traffic, as opposed to waiting a few more seconds/blocks and then letting the buffered transactions enter the Main Pool. A sketch of the extra knob follows.
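
For illustration only, the extra knob could be a single field next to the existing limits, sketched below. GlobalSlots and GlobalQueue mirror real fields in the legacy pool's Config; OverflowPoolSlots is a hypothetical name for the new limit.

```go
// Sketch of the legacy pool's Config with the new knob added. GlobalSlots
// and GlobalQueue mirror real fields; OverflowPoolSlots is hypothetical.
type Config struct {
	GlobalSlots       uint64 // executable tx slots for all accounts (real)
	GlobalQueue       uint64 // non-executable tx slots for all accounts (real)
	OverflowPoolSlots uint64 // new: locally buffered txs, e.g. 500_000 (~250MB at 500B/tx)
}
```

A low-capacity node would set OverflowPoolSlots to 0 and opt out of the buffer entirely.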

E.g. currently, if a user sends a transaction while the pool is full, the transaction will likely fail (assuming its gas price wasn't high enough), and they will probably send it again after some time. Under the above concept, they send it once and perhaps wait about the same time overall, but they don't have to send it twice.

@emailtovamos
Author

emailtovamos commented Oct 10, 2024

If your goal is to accumulate local transactions above the pool limits, that's something @holiman has a PR about and we actively want to address.

@karalabe Yes, that is in line with my goal for this PR. Good to know this is being addressed; I will check out the existing PR.

@zzzckck

zzzckck commented Oct 10, 2024

I think the PR is actually a stab at your local pool that piles up local transactions and drips them into the live pool.

@emailtovamos If your goal is to accumulate local transactions above the pool limits, that's something @holiman has a PR about and we actively want to address. If your goal is to overflow network transactions, that doesn't make sense; they should just be handled by the pool.

This idea is mainly to avoid transaction loss when the current TxPool overflows, as losing transactions is very bad UX, and overflow is not uncommon during network traffic bursts (like inscriptions).

But simply increasing the size of the current TxPool has side effects, mainly because it takes more resources to handle those transactions: CPU, memory & network. Currently, BSC sets the TxPool's size to around 10K to 15K.

This new overflow pool is meant to be simple and to take very limited resources while caching a huge number of transactions during burst traffic; the OverflowPool is expected to cache more than 100K transactions when the current TxPool is full.
