Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chainHead/fix: Report bestBlock events only for newBlock reports #5527

Merged
merged 16 commits into from
Sep 3, 2024

Conversation

lexnv
Copy link
Contributor

@lexnv lexnv commented Aug 29, 2024

The #5512 has surfaced that we reported a BestBlock event for a block not previously reported via NewBlock.

This is because of a race between:

  • the stream of events that announces new blocks
  • self.client.info().best_block

It is possible that client.info() contains newer information than the information polled from the block stream (that may be lagging).

To mitigate this, instead of relying on the client's info use the last finalized block to emit a new event.

There are two cases when a new best block event is emitted:

  • The best block is in the pruned list and is reported immediately
  • The best block is not a descendant of the last finalized block

Closes: #5512

Thanks @jsdw and @josepot for helping debug this 🙏

cc @paritytech/subxt-team

@lexnv lexnv added I2-bug The node fails to follow expected behavior. T3-RPC_API This PR/Issue is related to RPC APIs. D0-easy Can be fixed primarily by duplicating and adapting code by an intermediate coder. labels Aug 29, 2024
@lexnv lexnv self-assigned this Aug 29, 2024
@josepot
Copy link

josepot commented Aug 29, 2024

First of all, thank you for your quick response and for addressing this so promptly!

I’d like to clarify a few points regarding the generation of multiple BestBlock events for the same block after this PR:

After this PR, multiple BestBlock events may be generated for the same block. The RPC-v2 spec does not mention that the BestBlock must be a descendant of the previously generated BestBlock, because it can simply be on a different fork. Therefore, it is ok to generate multiple BestBlock events for the same block.
However, the spec mentions that a BestBlock event must be generated after a NewBlock event.

I’m not entirely sure I fully understood your intent here, but I have some concerns. It's important to note that the actual name of the event is bestBlockChanged. In my opinion, the exact same event should only be emitted multiple times if there is a valid reason -specifically, if the best block has genuinely changed, such as due to a switch between different branches.

While the spec doesn't explicitly prohibit emitting the same event multiple times for an unchanged best block, I believe it's not in the spirit of the spec to do so. The event’s name, bestBlockChanged, suggests that it should only be triggered when there is an actual change. Emitting redundant or noisy events when the best block hasn’t changed could be misleading and I think it's something we should avoid.

That said, I believe this behavior wouldn’t break anything in PAPI, but it could break things for other libraries, as the spec implies that the event is meant to signal an actual change in the best block:

The bestBlockChanged event indicates that the given block is now considered to be the best block.

bestBlockHash is guaranteed to be equal either to the current finalized block hash, or to a block reported in an earlier newBlock event.

Also, the spec states:

This list of notifications makes it very easy for a JSON-RPC client to follow just the best block updates (listening to just bestBlockChanged events).

Given these points, it seems reasonable to expect that bestBlockChanged events are only emitted when there is a meaningful change. Generating this event without a genuine change could potentially create unnecessary noise for clients monitoring these updates.

cc @tomaka

@tomaka
Copy link
Contributor

tomaka commented Aug 30, 2024

It's important to note that the actual name of the event is bestBlockChanged. In my opinion, the exact same event should only be emitted multiple times if there is a valid reason -specifically, if the best block has genuinely changed, such as due to a switch between different branches.

I don't think we should focus too much on the name. All APIs always have small prints that aren't conveyed by the name of the thing itself.
Yes, emitting multiple identical events in a row contradicts the word "changed" and technically it is misleading, but I really don't think that this "misleading-ness" a big deal.

Whether it is allowed to emit multiple identical events in a row should be explicitly written in the spec.
Given that the best block can change at any time and in unpredictable ways, I can't manage to find a situation where it would be troublesome for the JSON-RPC client if the best block changed to what it already is, and so it should IMO simply be allowed but slightly discouraged.

@lexnv
Copy link
Contributor Author

lexnv commented Aug 30, 2024

Have modified the code logic to take care of the following edge-cases:

  • The best block is in the pruned list and is reported immediately
  • The best block is not a descendant of the last finalized block

We now should not emit extra BestBlock events if the current best block is a descendant of the finalized block. Thanks for the review everyone 🙏

substrate/client/rpc-spec-v2/src/chain_head/tests.rs Outdated Show resolved Hide resolved
prdoc/pr_5527.prdoc Outdated Show resolved Hide resolved
prdoc/pr_5527.prdoc Outdated Show resolved Hide resolved
@niklasad1
Copy link
Member

bot fmt

@command-bot
Copy link

command-bot bot commented Sep 3, 2024

@niklasad1 https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7233685 was started for your command "$PIPELINE_SCRIPTS_DIR/commands/fmt/fmt.sh". Check out https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/pipelines?page=1&scope=all&username=group_605_bot to know what else is being executed currently.

Comment bot cancel 4-87d08829-fd10-4af6-94e0-3048bd7c6e0e to cancel this command or bot cancel to cancel all commands in this pull request.

@command-bot
Copy link

command-bot bot commented Sep 3, 2024

@niklasad1 Command "$PIPELINE_SCRIPTS_DIR/commands/fmt/fmt.sh" has finished. Result: https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7233685 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/polkadot-sdk/-/jobs/7233685/artifacts/download.

@lexnv lexnv enabled auto-merge September 3, 2024 11:49
@lexnv lexnv added this pull request to the merge queue Sep 3, 2024
Merged via the queue into master with commit 325df54 Sep 3, 2024
186 of 187 checks passed
@lexnv lexnv deleted the lenxv/fix-chainhead-best-block branch September 3, 2024 16:48
//
// However, we double check if the best block is a descendant of the last finalized
// block to ensure we don't miss any events.
let ancestor = sp_blockchain::lowest_common_ancestor(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why we need this. If the best block was not pruned, it needs to be a descendant of the finalized block.

lexnv added a commit that referenced this pull request Sep 4, 2024
The #5512 has surfaced
that we reported a `BestBlock` event for a block not previously reported
via `NewBlock`.

This is because of a race between:
- the stream of events that announces new blocks
- `self.client.info().best_block`

It is possible that `client.info()` contains newer information than the
information polled from the block stream (that may be lagging).

To mitigate this, instead of relying on the client's info use the last
finalized block to emit a new event.

There are two cases when a new best block event is emitted:
- The best block is in the pruned list and is reported immediately
- The best block is not a descendant of the last finalized block 

Closes: #5512 

Thanks @jsdw and @josepot for helping debug this 🙏 

cc @paritytech/subxt-team

---------

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: Sebastian Kunert <[email protected]>
Co-authored-by: command-bot <>
x3c41a pushed a commit that referenced this pull request Sep 4, 2024
The #5512 has surfaced
that we reported a `BestBlock` event for a block not previously reported
via `NewBlock`.

This is because of a race between:
- the stream of events that announces new blocks
- `self.client.info().best_block`

It is possible that `client.info()` contains newer information than the
information polled from the block stream (that may be lagging).

To mitigate this, instead of relying on the client's info use the last
finalized block to emit a new event.

There are two cases when a new best block event is emitted:
- The best block is in the pruned list and is reported immediately
- The best block is not a descendant of the last finalized block 

Closes: #5512 

Thanks @jsdw and @josepot for helping debug this 🙏 

cc @paritytech/subxt-team

---------

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: Sebastian Kunert <[email protected]>
Co-authored-by: command-bot <>
EgorPopelyaev pushed a commit that referenced this pull request Sep 4, 2024
…eports (#5527) (#5582)

This backports original PR:
#5527 to the release
branch


```

The #5512 has surfaced that we reported a `BestBlock` event for a block not previously reported via `NewBlock`.

This is because of a race between:
- the stream of events that announces new blocks
- `self.client.info().best_block`

It is possible that `client.info()` contains newer information than the information polled from the block stream (that may be lagging).

To mitigate this, instead of relying on the client's info use the last finalized block to emit a new event.

There are two cases when a new best block event is emitted:
- The best block is in the pruned list and is reported immediately
- The best block is not a descendant of the last finalized block 

Closes: #5512 

Thanks @jsdw and @josepot for helping debug this 🙏 

cc @paritytech/subxt-team
```

Signed-off-by: Alexandru Vasile <[email protected]>
Co-authored-by: Sebastian Kunert <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
D0-easy Can be fixed primarily by duplicating and adapting code by an intermediate coder. I2-bug The node fails to follow expected behavior. T3-RPC_API This PR/Issue is related to RPC APIs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JSON-RPC messages arrive out of order
7 participants