Gossipsub extension for Epidemic Meshes (v1.2.0) #413
# Episub - Gossipsub Extension

# Overview

This document aims to provide a minimal extension to the [gossipsub v1.1](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.1.md) protocol that supersedes the previous [episub](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/episub.md) proposal.

The proposed extensions are backwards-compatible and aim to enhance the efficiency of gossip mesh networks (minimizing amplification/duplicates and decreasing message latency) by dynamically adjusting the messages sent by mesh peers based on a local view of message duplication and latency.

In more specific terms, two new control messages are introduced, `CHOKE` and `UNCHOKE`. When a Gossipsub router is receiving many duplicates on a particular mesh, it can send a `CHOKE` message to those of its mesh peers that are sending duplicates more slowly than its fellow mesh peers. Upon receiving a `CHOKE` message, a peer must no longer propagate mesh messages to the sender of the `CHOKE` message, and instead lazily send its gossip (in every heartbeat).

A Gossipsub router may notice that it is receiving messages via gossip from its `CHOKE`'d peers faster than it receives them via the mesh. In this case it may send an `UNCHOKE` message to inform the peer to resume propagating messages in the mesh. The router may also notice that it is receiving messages via gossip from peers not in the mesh (fanout) faster than it receives messages in the mesh. It may then add these peers into the mesh.

The modifications outlined above intend to optimize the Gossipsub mesh to receive minimal duplicates from the peers with the lowest latency.
# Specification

## Protocol Id

Nodes that support this Gossipsub extension should additionally advertise the version number `2.0.0`. Gossipsub nodes can advertise their own protocol-id prefix; by default this is `meshsub`, giving the default protocol id:

- `/meshsub/2.0.0`
## Parameters

This section lists the configuration parameters that control the behaviour of the mesh optimisation.

| Parameter | Description | Reasonable Default |
| -------- | -------- | -------- |
| `D_non_choke` | The minimum number of peers in a mesh that must remain unchoked. | `D_lo` |
| `choke_heartbeat_interval` | The number of heartbeats before assessing and applying `CHOKE`/`UNCHOKE` control messages and adding up to `D_max_add` fanout peers. | 20 |
| `D_max_add` | The maximum number of peers to add into the mesh (from fanout) per `choke_heartbeat_interval` if they are performing well. | 1 |
| `choke_duplicates_threshold` | The minimum number of duplicates, as a percentage of received messages, that a peer must send before being eligible to be `CHOKE`'d. | 60 |
| `choke_churn` | The maximum number of peers that can be `CHOKE`'d or `UNCHOKE`'d in any `choke_heartbeat_interval`. | 2 |
| `unchoke_threshold` | Determines how aggressively we unchoke peers: the percentage of messages received in the `choke_heartbeat_interval` that were received via gossip from a choked peer. | 50 |
| `fanout_addition_threshold` | How aggressively we add peers from the fanout into the mesh: the percentage of messages received in the `choke_heartbeat_interval` that were received from a fanout peer. | 10 |

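The parameters above can be collected into a single configuration object. The sketch below is illustrative only (the name `EpisubConfig` and the field names are invented here, not part of any libp2p implementation); it fills in the table's reasonable defaults, and leaves `D_non_choke` without a fixed default since it tracks the implementation's `D_lo`.

```python
from dataclasses import dataclass

@dataclass
class EpisubConfig:
    # Hypothetical configuration mirroring the parameter table; not normative.
    d_non_choke: int                          # minimum unchoked mesh peers (suggested: D_lo)
    choke_heartbeat_interval: int = 20        # heartbeats between choke assessments
    d_max_add: int = 1                        # max fanout peers promoted per interval
    choke_duplicates_threshold: float = 60.0  # % duplicates before a peer is choke-eligible
    choke_churn: int = 2                      # max CHOKE/UNCHOKE messages per interval
    unchoke_threshold: float = 50.0           # % gossip-delivered messages before unchoking
    fanout_addition_threshold: float = 10.0   # % fanout-delivered messages before promotion
```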
## Choking

Every `choke_heartbeat_interval`, for each non-`CHOKE`'d peer in each mesh, the router counts the number of valid (or not invalid) duplicate messages received (note that the first message of its kind received is not a duplicate) and the time it took to receive each duplicate. The router then filters for the peers that have sent duplicates over the `choke_duplicates_threshold` and sends `CHOKE` messages to at most `choke_churn` of them, ordered by largest average latency. A router should ensure that at least `D_non_choke` peers remain unchoked in the mesh (and should not send `CHOKE` messages if this limit would be violated), and should perform this check every heartbeat alongside mesh maintenance.
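A minimal sketch of this selection logic, assuming the router has already aggregated per-peer duplicate statistics for the interval (the function name and the shape of `stats` are hypothetical, not from the spec):

```python
def select_peers_to_choke(stats, choke_duplicates_threshold, choke_churn,
                          d_non_choke, unchoked_mesh_size):
    """stats maps peer_id -> (duplicate_pct, avg_duplicate_latency_ms)
    for the currently unchoked mesh peers of one topic."""
    # Never let the unchoked portion of the mesh shrink below D_non_choke.
    budget = min(choke_churn, max(0, unchoked_mesh_size - d_non_choke))
    # Only peers whose duplicate share crosses the threshold are eligible.
    eligible = [(peer, latency) for peer, (dup_pct, latency) in stats.items()
                if dup_pct >= choke_duplicates_threshold]
    # Choke the slowest duplicate senders first (largest average latency).
    eligible.sort(key=lambda entry: entry[1], reverse=True)
    return [peer for peer, _ in eligible[:budget]]
```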
## UnChoking

Every `choke_heartbeat_interval` the router counts the number of valid messages received via `IWANT` (and hence gossip) from each `CHOKE`'d peer. If the percentage of received valid messages is greater than `unchoke_threshold`, we send an `UNCHOKE` message to randomly selected such peers, up to the `choke_churn` limit.
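The same logic as a sketch, again with hypothetical names; `gossip_pct` is assumed to be the per-peer percentage of this interval's valid messages that arrived via `IWANT` from that choked peer:

```python
import random

def select_peers_to_unchoke(gossip_pct, unchoke_threshold, choke_churn, rng=random):
    """gossip_pct maps choked peer_id -> percentage of this interval's valid
    messages received from that peer via IWANT (gossip)."""
    candidates = [peer for peer, pct in gossip_pct.items()
                  if pct > unchoke_threshold]
    rng.shuffle(candidates)        # a random selection, capped by choke_churn
    return candidates[:choke_churn]
```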
## Fanout Addition

Every `choke_heartbeat_interval` the router counts the number of valid messages received via `IWANT` (and hence gossip) from `fanout` peers. If the percentage of received messages is greater than `fanout_addition_threshold`, a random selection of these peers, up to `D_max_add`, is added to the mesh (provided the mesh bounds, i.e. `D_high`, remain valid).
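A corresponding sketch for fanout promotion, under the same assumptions (hypothetical names; `fanout_pct` is the per-peer percentage of this interval's messages first received from that fanout peer via gossip):

```python
import random

def select_fanout_additions(fanout_pct, fanout_addition_threshold, d_max_add,
                            mesh_size, d_high, rng=random):
    """fanout_pct maps fanout peer_id -> percentage of this interval's
    messages received from that peer via IWANT (gossip)."""
    room = max(0, d_high - mesh_size)  # keep the mesh within its upper bound
    candidates = [peer for peer, pct in fanout_pct.items()
                  if pct > fanout_addition_threshold]
    rng.shuffle(candidates)            # random selection, capped by D_max_add
    return candidates[:min(d_max_add, room)]
```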
## Handling Gossipsub Scoring For Choked Peers

TODO
## Protobuf Extension

The protobuf messages are identical to those specified in the [gossipsub v1.0.0 specification](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md) with the following control message modifications:

```protobuf
message RPC {
	// ... see definition in the gossipsub specification
}

message ControlMessage {
	repeated ControlIHave ihave = 1;
	repeated ControlIWant iwant = 2;
	repeated ControlGraft graft = 3;
	repeated ControlPrune prune = 4;
	repeated ControlChoke choke = 5;
	repeated ControlUnChoke unchoke = 6;
}

message ControlChoke {
	optional string topicID = 1;
}

message ControlUnChoke {
	optional string topicID = 1;
}
```
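On the receiving side, a router that gets a `CHOKE` stops eagerly forwarding mesh messages to that peer for the given topic, and resumes on `UNCHOKE`. A minimal sketch of that bookkeeping, using a hypothetical in-memory representation of the decoded control messages (dicts rather than real protobuf objects):

```python
def handle_control(peer_id, control, choked_by):
    """choked_by maps peer_id -> set of topics on which that peer has choked us.
    control is a decoded ControlMessage, represented here as a plain dict."""
    for choke in control.get("choke", []):
        # Stop eagerly forwarding on this topic; fall back to gossip (IHAVE).
        choked_by.setdefault(peer_id, set()).add(choke["topicID"])
    for unchoke in control.get("unchoke", []):
        # Resume eager mesh propagation to this peer on this topic.
        choked_by.get(peer_id, set()).discard(unchoke["topicID"])

def should_forward_eagerly(peer_id, topic, choked_by):
    # Forward on the mesh only if the peer has not choked us for this topic.
    return topic not in choked_by.get(peer_id, set())
```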