doc: specs #116

Open
wants to merge 13 commits into
base: main
160 changes: 160 additions & 0 deletions p2p/p2p.md
@@ -0,0 +1,160 @@
# P2P

The P2P package mainly contains two services:

1) [Subscriber](#subscriber)
2) [Exchange](#exchange)

## Subscriber
Review comment (Member):

We must clearly mention somewhere that the gossiped message is only a serialized header with user-defined serialization and that protocol does not add any metadata on top.


Subscriber is a service that manages the gossip of headers among the nodes in the P2P network using [libp2p][libp2p] and its [pubsub][pubsub] module. The pubsub topic used for gossip (`/<networkID>/header-sub/v0.0.1`) is derived from the `networkID` parameter used to initialize the subscriber service.

The Subscriber encompasses the behavior necessary to subscribe/unsubscribe from new Header events from the network. The Subscriber interface consists of:

|Method|Input|Output|Description|
|--|--|--|--|
| Subscribe | | Subscription[H], error | Subscribe creates long-living Subscription for validated Headers. Multiple Subscriptions can be created. |
| SetVerifier | func(context.Context, H) error | error | SetVerifier registers verification func for all Subscriptions. Registered func screens incoming headers before they are forwarded to Subscriptions. Only one func can be set.|

The `Subscribe()` method allows listening to any new headers that are published to the P2P network. The `SetVerifier()` method sets a custom verifier that is executed upon receiving any new header from the P2P network, which lets consumers of the go-header library plug their own validation logic into the pubsub flow. While multiple simultaneous subscriptions are possible via `Subscribe()`, only a single verifier can be set via `SetVerifier()`.
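
The interplay between a single verifier and multiple subscriptions can be sketched with a small in-memory model. This is only an illustration: the `Header`, `Subscriber`, `deliver`, and `rejectZeroHeight` names are hypothetical, there is no libp2p involved, and returning an error on a second `SetVerifier` call is an assumption about how "only one func can be set" is enforced.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Header is a hypothetical stand-in for the library's generic header type H.
type Header struct{ Height uint64 }

// Verifier mirrors the func(context.Context, H) error shape from the table above.
type Verifier func(context.Context, Header) error

// Subscriber is an illustrative in-memory model, not the libp2p-backed service.
type Subscriber struct {
	verifier Verifier
	subs     []chan Header
}

// SetVerifier registers the single verification func.
// Rejecting a second registration is an assumption of this sketch.
func (s *Subscriber) SetVerifier(v Verifier) error {
	if s.verifier != nil {
		return errors.New("verifier already set")
	}
	s.verifier = v
	return nil
}

// Subscribe returns a new buffered subscription; several may coexist.
func (s *Subscriber) Subscribe() <-chan Header {
	ch := make(chan Header, 8)
	s.subs = append(s.subs, ch)
	return ch
}

// deliver screens an incoming header with the verifier and, if it passes,
// forwards it to every subscription. A background context is used for brevity.
func (s *Subscriber) deliver(h Header) error {
	if s.verifier != nil {
		if err := s.verifier(context.Background(), h); err != nil {
			return err
		}
	}
	for _, ch := range s.subs {
		ch <- h
	}
	return nil
}

// rejectZeroHeight is an example of custom verification logic.
func rejectZeroHeight(_ context.Context, h Header) error {
	if h.Height == 0 {
		return errors.New("zero height")
	}
	return nil
}

func main() {
	s := &Subscriber{}
	_ = s.SetVerifier(rejectZeroHeight)
	a, b := s.Subscribe(), s.Subscribe()
	fmt.Println(s.deliver(Header{Height: 0})) // rejected by the verifier
	fmt.Println(s.deliver(Header{Height: 5})) // forwarded to both subscriptions
	fmt.Println((<-a).Height, (<-b).Height)
}
```

Headers that fail verification never reach any subscription, which is the screening behavior described in the table.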

## Exchange

An exchange is a combination of:

* [Exchange](#exchange-client): a client for requesting headers from the P2P network (outbound)
* [ExchangeServer](#exchange-server): a P2P server for handling inbound header requests

### Exchange Client

Exchange defines a client for requesting headers from the P2P network. An exchange client is initialized with the node's own [host.Host][host], a list of peers as a [peer.IDSlice][peer], and a [connection gater][gater] for blocking and allowing nodes. Optional parameters such as `ChainID` and `NetworkID` can also be passed. The exchange client also maintains a list of trusted peers via a peer tracker. The peer tracker continues to discover peers as long as:

* the total number of peers (connected and disconnected) does not exceed [`maxPeerTrackerSize`][maxPeerTrackerSize], or
* the connected peer count does not exceed the disconnected peer count.

Review comment (Member):

Really? Connected peer count should always be less than disconnected peer count?

Feels like this needs some explanation.


A set of client parameters (shown in the table below) can be passed while initializing an exchange client.

|Parameter|Type|Description|Default|
|--|--|--|--|
| MaxHeadersPerRangeRequest | uint64 | MaxHeadersPerRangeRequest defines the maximum number of headers that can be requested in a single request. | 64 |
| RangeRequestTimeout | time.Duration | RangeRequestTimeout defines a timeout after which the session will try to re-request headers from another peer. | 8s |
| chainID | string | chainID is an identifier of the chain. | "" |

#### Peer Tracker

The three main functionalities of the peer tracker are:

* bootstrap
* track
* garbage collection (gc)

When the exchange client is started, it bootstraps the peer tracker using the set of trusted peers used to initialize the exchange client.

New peers are tracked by subscribing to `event.EvtPeerConnectednessChanged{}`.

The peer tracker also runs a garbage collector (gc) once every [gcCycle][gcCycle]. It removes two kinds of peers from the tracked peers list: peers that have been disconnected for more than [maxAwaitingTime][maxAwaitingTime], and connected peers whose score is less than or equal to [defaultScore][defaultScore].

The peer tracker also provides peer-blocking functionality, which is used to block peers that send invalid network headers. An invalid header is one that fails when the `Verify` method of the header interface is invoked.
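
A minimal sketch of the garbage-collection rule follows. The constant values, the `peerStat` type, and the `demoGC` helper are assumptions made for illustration; the real constants are defined in `p2p/peer_tracker.go` and the real tracker keeps separate structures guarded by locks.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Hypothetical values for illustration only.
const (
	maxAwaitingTime = time.Hour
	defaultScore    = 1.0
)

// peerStat is a simplified record of a tracked peer.
type peerStat struct {
	score          float64
	disconnectedAt time.Time // zero while the peer is connected
}

// gc removes peers that have been disconnected for longer than
// maxAwaitingTime and connected peers whose score is <= defaultScore.
func gc(tracked map[string]peerStat, now time.Time) {
	for id, st := range tracked {
		disconnected := !st.disconnectedAt.IsZero()
		switch {
		case disconnected && now.Sub(st.disconnectedAt) > maxAwaitingTime:
			delete(tracked, id)
		case !disconnected && st.score <= defaultScore:
			delete(tracked, id)
		}
	}
}

// demoGC runs one gc pass over four example peers and returns the survivors.
func demoGC() []string {
	now := time.Unix(1_700_000_000, 0)
	tracked := map[string]peerStat{
		"stale":  {score: 5, disconnectedAt: now.Add(-2 * time.Hour)},
		"recent": {score: 5, disconnectedAt: now.Add(-10 * time.Minute)},
		"low":    {score: 1},
		"good":   {score: 5},
	}
	gc(tracked, now)
	ids := make([]string, 0, len(tracked))
	for id := range tracked {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	fmt.Println(demoGC()) // prints [good recent]
}
```

In the library this pass would run on a ticker every `gcCycle`; the sketch shows a single pass.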

#### Getter Interface

The exchange client implements the following `Getter` interface which contains the behavior necessary for a component to retrieve headers that have been processed during header sync. The `Getter` interface consists of:

|Method|Input|Output|Description|
|--|--|--|--|
| Head | context.Context, ...HeadOption[H] | H, error | Head returns the latest known chain header. Note that "chain head" is subjective to the component reporting it. |
| Get | context.Context, Hash | H, error | Get returns the Header corresponding to the given hash. |
| GetByHeight | context.Context, uint64 | H, error | GetByHeight returns the Header corresponding to the given block height. |
| GetRangeByHeight | ctx context.Context, from H, to uint64 | []H, error | GetRangeByHeight requests the header range from the provided Header and verifies that the returned headers are adjacent to each other. Expected to return the range [from.Height()+1:to).|

The `Head()` method requests the latest header from trusted or tracked peers. The `Head()` call also accepts an optional `TrustedHead`, a trusted head against which the untrusted head is verified. By default, `Head()` queries only trusted peers; if `TrustedHead` is provided, untrusted tracked peers are queried as well, limited to [maxUntrustedHeadRequests][maxUntrustedHeadRequests]. `Head()` spends 90% of the deadline set on the context on the requests themselves and the remainder on determining the best head from the gathered responses. Upon receiving headers from peers (either trusted or tracked), the best head is determined as the head:

* with the max height among those received
* which is received from at least [minHeadResponses][minHeadResponses] peers
* if neither or both conditions are met, the head with the highest height is used
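
One way to read the selection rules above is sketched below: prefer the highest head reported by at least `minHeadResponses` peers, and otherwise fall back to the highest head seen. The `head` type, the value of `minHeadResponses`, and this reading of the rules are assumptions of the sketch; `p2p/exchange.go` is the authority.

```go
package main

import "fmt"

// minHeadResponses is a hypothetical value used for illustration.
const minHeadResponses = 2

// head is a simplified header identified by height and hash.
type head struct {
	Height uint64
	Hash   string
}

// bestHead picks the highest head reported by at least minHeadResponses
// peers. If no head reaches that count, it falls back to the highest head.
// The boolean is false only when there are no responses at all.
func bestHead(responses []head) (head, bool) {
	if len(responses) == 0 {
		return head{}, false
	}
	counts := make(map[head]int)
	for _, h := range responses {
		counts[h]++
	}
	highest := responses[0]
	var best head
	found := false
	for h, n := range counts {
		if h.Height > highest.Height {
			highest = h
		}
		if n >= minHeadResponses && (!found || h.Height > best.Height) {
			best, found = h, true
		}
	}
	if found {
		return best, true
	}
	return highest, true
}

func main() {
	// Height 12 is reported once, height 10 twice: 10 wins.
	fmt.Println(bestHead([]head{{10, "a"}, {10, "a"}, {12, "b"}}))
	// No head is confirmed by two peers: the highest one wins.
	fmt.Println(bestHead([]head{{12, "b"}, {10, "a"}}))
}
```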

Apart from the latest header, arbitrary headers can be requested from trusted peers (with 3 retries) by height (`GetByHeight`), hash (`Get`), or range (`GetRangeByHeight`). All three kinds of header requests are encapsulated in a single request proto message:

```
message HeaderRequest {
  oneof data {
    uint64 origin = 1;
    bytes hash = 2;
  }
  uint64 amount = 3;
}
```
Review comment (Member) on lines +78 to +86:

[blocking]
Let's separate the protocol from implementation. I suggest adding ### Protocol paragraph, which contains all the messages and explains the basic flow. The method description of components can then link to it.


Note that `GetRangeByHeight` verifies the returned headers against the header at the beginning of the range (`from`), ensuring that they are adjacent to each other.
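
The adjacency check can be sketched as follows. The `header` struct and `verifyRange` function are hypothetical simplifications: the library performs this check through the header's own `Verify` method, while the sketch only compares heights and parent hashes.

```go
package main

import "fmt"

// header is a simplified header with just the fields needed for adjacency.
type header struct {
	Height   uint64
	Hash     string
	LastHash string
}

// verifyRange checks that got continues the chain that starts at from:
// every height increases by one and every LastHash matches the previous Hash.
func verifyRange(from header, got []header) error {
	prev := from
	for _, h := range got {
		if h.Height != prev.Height+1 {
			return fmt.Errorf("height %d does not follow %d", h.Height, prev.Height)
		}
		if h.LastHash != prev.Hash {
			return fmt.Errorf("header %d does not link to header %d", h.Height, prev.Height)
		}
		prev = h
	}
	return nil
}

func main() {
	from := header{Height: 10, Hash: "h10", LastHash: "h9"}
	good := []header{{11, "h11", "h10"}, {12, "h12", "h11"}}
	bad := []header{{11, "h11", "h10"}, {13, "h13", "h12"}}
	fmt.Println(verifyRange(from, good)) // prints <nil>
	fmt.Println(verifyRange(from, bad))  // prints height 13 does not follow 11
}
```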

### Exchange Server

ExchangeServer represents the server-side component of the exchange (a P2P server) that responds to inbound header requests. The exchange server is initialized with the node's own [host.Host][host] and a [store][store]. Optional `ServerParameters`, shown below, can be set during server initialization.

|Parameter|Type|Description|Default|
|--|--|--|--|
| WriteDeadline | time.Duration | WriteDeadline sets the timeout for sending messages to the stream | 8s |
| ReadDeadline | time.Duration | ReadDeadline sets the timeout for reading messages from the stream | 60s |
| RangeRequestTimeout | time.Duration | RangeRequestTimeout defines a timeout after which the session will try to re-request headers from another peer | 10s |
| networkID | string | networkID is a network that will be used to create a protocol.ID | "" |

When the server starts, a request handler is set up for the `protocolID` (`/<networkID>/header-ex/v0.0.3`), which is derived from the configurable `networkID` parameter, to serve inbound header requests.
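
As a small illustration, both network-scoped identifiers mentioned in this document can be derived from `networkID` with plain string formatting. The function names are hypothetical, and the real code wraps the results in libp2p types rather than returning strings.

```go
package main

import "fmt"

// pubsubTopic builds the gossip topic used by the Subscriber.
func pubsubTopic(networkID string) string {
	return fmt.Sprintf("/%s/header-sub/v0.0.1", networkID)
}

// exchangeProtocolID builds the protocol ID served by the ExchangeServer.
func exchangeProtocolID(networkID string) string {
	return fmt.Sprintf("/%s/header-ex/v0.0.3", networkID)
}

func main() {
	fmt.Println(pubsubTopic("private"))        // prints /private/header-sub/v0.0.1
	fmt.Println(exchangeProtocolID("private")) // prints /private/header-ex/v0.0.3
}
```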

The request handler returns a response which contains bytes of the requested header(s) and a status code as shown below.

```
message HeaderResponse {
  bytes body = 1;
  StatusCode statusCode = 2;
}
```

The status code is `OK` on success, `NOT_FOUND` when the requested headers are not found, and `INVALID` on error (the default).

```
enum StatusCode {
  INVALID = 0;
  OK = 1;
  NOT_FOUND = 2;
}
```

The request handler serves header requests from its local [store][store]. When requesting headers by range, at most [MaxRangeRequestSize][MaxRangeRequestSize] (512) headers can be requested. If the requested range is not fully available, the range is reset to whatever is available.
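
The size limit and the "reset to whatever is available" rule can be sketched as a pure function over heights. The function name, the error values, and the choice to reject oversized requests outright are assumptions of this sketch and may differ from the server's actual handling.

```go
package main

import (
	"errors"
	"fmt"
)

// maxRangeRequestSize matches the 512-header limit stated above.
const maxRangeRequestSize = 512

// Hypothetical errors standing in for the INVALID and NOT_FOUND status codes.
var (
	errInvalidRequest = errors.New("invalid range request")
	errNotFound       = errors.New("headers not found")
)

// serveRange returns the half-open height range [from, to) that would be
// served for a request of amount headers starting at origin, given the
// height of the local store head.
func serveRange(origin, amount, storeHeight uint64) (from, to uint64, err error) {
	if origin == 0 || amount == 0 || amount > maxRangeRequestSize {
		return 0, 0, errInvalidRequest
	}
	if origin > storeHeight {
		return 0, 0, errNotFound
	}
	to = origin + amount
	if to > storeHeight+1 {
		to = storeHeight + 1 // reset the range to what is available
	}
	return origin, to, nil
}

func main() {
	fmt.Println(serveRange(100, 512, 1000)) // prints 100 612 <nil>
	fmt.Println(serveRange(990, 64, 1000))  // prints 990 1001 <nil>
	fmt.Println(serveRange(1, 513, 1000))   // prints 0 0 invalid range request
}
```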

### Session

Session divides a header range request into several smaller requests distributed among different peers. The exchange client uses this service to serve `GetRangeByHeight` calls.
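
A minimal sketch of the splitting step, assuming chunks of at most `MaxHeadersPerRangeRequest` headers handed out to peers round-robin. The `rangeRequest` type, the `splitRange` function, and the round-robin assignment are hypothetical; the real session may choose peers differently, for example based on peer scores.

```go
package main

import "fmt"

// rangeRequest is one sub-request assigned to a single peer.
type rangeRequest struct {
	Peer   string
	Origin uint64
	Amount uint64
}

// splitRange divides amount headers starting at origin into requests of at
// most maxPerRequest headers and assigns them to peers in round-robin order.
func splitRange(origin, amount, maxPerRequest uint64, peers []string) []rangeRequest {
	if maxPerRequest == 0 || len(peers) == 0 {
		return nil
	}
	var reqs []rangeRequest
	for i := 0; amount > 0; i++ {
		n := amount
		if n > maxPerRequest {
			n = maxPerRequest
		}
		reqs = append(reqs, rangeRequest{Peer: peers[i%len(peers)], Origin: origin, Amount: n})
		origin += n
		amount -= n
	}
	return reqs
}

func main() {
	// 150 headers with a limit of 64 per request become 64 + 64 + 22.
	for _, r := range splitRange(1, 150, 64, []string{"peerA", "peerB"}) {
		fmt.Printf("%s: origin=%d amount=%d\n", r.Peer, r.Origin, r.Amount)
	}
}
```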

## Metrics

Currently, only the following metrics are collected:
Review comment (Member):

Note that the list is already extended for Subscriber, and more will be added soon.

Review comment (Member):

I'll take care of updating the spec


* P2P header exchange response size
* Duration of the get headers request in seconds
* Total synced headers

# References

[1] [libp2p][libp2p]

[2] [pubsub][pubsub]

[3] [host.Host][host]

[4] [peer.IDSlice][peer]

[5] [connection gater][gater]

[libp2p]: https://github.com/libp2p/go-libp2p
[pubsub]: https://github.com/libp2p/go-libp2p-pubsub
[host]: https://github.com/libp2p/go-libp2p/core/host
[peer]: https://github.com/libp2p/go-libp2p/core/peer
[gater]: https://github.com/libp2p/go-libp2p/p2p/net/conngater
[store]: https://github.com/celestiaorg/go-header/blob/main/store/store.md
[maxPeerTrackerSize]: https://github.com/celestiaorg/go-header/blob/main/p2p/peer_tracker.go#L19
[maxAwaitingTime]: https://github.com/celestiaorg/go-header/blob/main/p2p/peer_tracker.go#L25
[defaultScore]: https://github.com/celestiaorg/go-header/blob/main/p2p/peer_tracker.go#L17
[gcCycle]: https://github.com/celestiaorg/go-header/blob/main/p2p/peer_tracker.go#L27
[maxUntrustedHeadRequests]: https://github.com/celestiaorg/go-header/blob/main/p2p/exchange.go#L32
[minHeadResponses]: https://github.com/celestiaorg/go-header/blob/main/p2p/exchange.go#L28
[MaxRangeRequestSize]: https://github.com/celestiaorg/go-header/blob/main/interface.go#L13
34 changes: 33 additions & 1 deletion specs/src/README.md
@@ -1,3 +1,35 @@
# Welcome

Welcome to the go-header Specifications.
Welcome to the go-header specifications.

go-header is a library for syncing blockchain data, such as block headers, over the P2P network in a trust-minimized way. It contains services for requesting and receiving headers from the P2P network, serving header requests from other nodes in the P2P network, storing headers, and syncing historical headers in case of fallbacks.

|Component|Description|
|---|---|
|[Subscriber](specs/p2p.md#subscriber)|listens for new headers from the P2P network|
|[ExchangeServer](specs/p2p.md#exchange-server)|serves header requests from other nodes in the P2P network|
|[Exchange](specs/p2p.md#exchange-client)|requests headers from other nodes in the P2P network|
|[Store](specs/store.md)|stores headers and makes them available to other services such as exchange and syncer|
|[Syncer](specs/sync.md)|syncs historical and new headers from the P2P network|

The go-header library defines a clear interface (described in the table below) for consumption. Any blockchain data implementing this interface can utilize go-header's P2P services. An example is defined in [headertest/dummy_header.go][dummy header].

|Method|Input|Output|Description|
|--|--|--|--|
| New | | H | New creates new instance of a header. |
| IsZero | | bool | IsZero reports whether Header is a zero value of its concrete type. |
| ChainID | | string | ChainID returns identifier of the chain. |
| Hash | | Hash | Hash returns hash of a header. |
| Height | | uint64 | Height returns the height of a header. |
| LastHeader | | Hash | LastHeader returns the hash of the last header before this header (a.k.a. the previous header hash). |
| Time | | time.Time | Time returns time when header was created. |
| Verify | H | error | Verify validates given untrusted Header against trusted Header. |
| Validate | | error | Validate performs stateless validation to check for missed/incorrect fields. |
| MarshalBinary | | []byte, error| MarshalBinary encodes the receiver into a binary form and returns the result. |
| UnmarshalBinary | []byte | error | UnmarshalBinary must be able to decode the form generated by MarshalBinary. UnmarshalBinary must copy the data if it wishes to retain the data after returning.|
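
The method table above can be written as a generic Go interface with a toy implementation, sketched below. `ToyHeader`, its fields, the JSON encoding, and the adjacency-only `Verify` are all assumptions made for illustration; the library's own example lives in `headertest/dummy_header.go`.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding"
	"encoding/json"
	"errors"
	"fmt"
	"time"
)

// Hash is a header hash.
type Hash []byte

// Header mirrors the method table above; H is the concrete header type.
type Header[H any] interface {
	New() H
	IsZero() bool
	ChainID() string
	Hash() Hash
	Height() uint64
	LastHeader() Hash
	Time() time.Time
	Verify(H) error
	Validate() error
	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler
}

// ToyHeader is a hypothetical header used only for illustration.
type ToyHeader struct {
	Chain     string
	Num       uint64
	Prev      Hash
	Timestamp time.Time
}

// wire has ToyHeader's fields but none of its methods, for plain encoding.
type wire ToyHeader

func (h *ToyHeader) New() *ToyHeader  { return new(ToyHeader) }
func (h *ToyHeader) IsZero() bool     { return h == nil }
func (h *ToyHeader) ChainID() string  { return h.Chain }
func (h *ToyHeader) Height() uint64   { return h.Num }
func (h *ToyHeader) LastHeader() Hash { return h.Prev }
func (h *ToyHeader) Time() time.Time  { return h.Timestamp }

func (h *ToyHeader) MarshalBinary() ([]byte, error) { return json.Marshal((*wire)(h)) }
func (h *ToyHeader) UnmarshalBinary(b []byte) error { return json.Unmarshal(b, (*wire)(h)) }

// Hash is the SHA-256 of the encoded header.
func (h *ToyHeader) Hash() Hash {
	b, _ := h.MarshalBinary() // encoding this plain struct does not fail
	sum := sha256.Sum256(b)
	return sum[:]
}

// Validate performs stateless checks on the header's own fields.
func (h *ToyHeader) Validate() error {
	if h.Chain == "" || h.Num == 0 {
		return errors.New("missing chain ID or height")
	}
	return nil
}

// Verify treats the receiver as trusted and checks that untrusted extends it.
func (h *ToyHeader) Verify(untrusted *ToyHeader) error {
	if untrusted.Num != h.Num+1 || !bytes.Equal(untrusted.Prev, h.Hash()) {
		return errors.New("header is not adjacent to the trusted header")
	}
	return nil
}

// Compile-time check that *ToyHeader satisfies the interface.
var _ Header[*ToyHeader] = (*ToyHeader)(nil)

func main() {
	parent := &ToyHeader{Chain: "toy", Num: 1, Timestamp: time.Unix(0, 0).UTC()}
	child := &ToyHeader{Chain: "toy", Num: 2, Prev: parent.Hash(), Timestamp: time.Unix(12, 0).UTC()}
	fmt.Println(parent.Verify(child)) // prints <nil>
}
```

`MarshalBinary` is what the Subscriber would gossip and what the exchange would put into a response body, so the round trip through `UnmarshalBinary` must preserve the hash.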

# References

[1] [Dummy Header][dummy header]

[dummy header]: https://github.com/celestiaorg/go-header/blob/main/headertest/dummy_header.go
40 changes: 40 additions & 0 deletions store/store.md
@@ -0,0 +1,40 @@
# Store

Store implements the Store interface (shown below) for headers.

Store encompasses the behavior necessary to store and retrieve headers from a node's local storage ([datastore][go-datastore]). The Store interface includes checker and append methods on top of [Getter](../p2p/p2p.md#getter-interface) methods as shown in the table below.

|Method|Input|Output|Description|
|--|--|--|--|
| Init | context.Context, H | error | Init initializes Store with the given head, meaning it is initialized with the genesis header. |
| Height | | uint64 | Height reports current height of the chain head. |
| Has | context.Context, Hash | bool, error | Has checks whether Header is already stored. |
| HasAt | context.Context, uint64 | bool | HasAt checks whether Header at the given height is already stored. |
| Append | context.Context, ...H | error | Append stores and verifies the given Header(s). It requires them to be adjacent and in ascending order, as it applies them contiguously on top of the current head height. It returns the amount of successfully applied headers, so caller can understand what given header was invalid, if any. |

A new store is created by passing a [datastore][go-datastore] instance and an optional head. If the head is not passed when creating the store, the `Init` method can be used later to initialize the store with a head. The store must have a head before it is started. The head is considered a trusted header and is generally the genesis header. A custom store prefix and a set of parameters, described below, can also be passed during store initialization to configure the store.

|Parameter|Type|Description|Default|
|--|--|--|--|
| StoreCacheSize | int | StoreCacheSize defines the maximum amount of entries in the Header Store cache. | 4096 |
| IndexCacheSize | int | IndexCacheSize defines the maximum amount of entries in the Height to Hash index cache. | 16384 |
| WriteBatchSize | int | WriteBatchSize defines the size of the batched header write. Headers are written in batches not to thrash the underlying Datastore with writes. | 2048 |
| storePrefix | datastore.Key | storePrefix defines the prefix used to wrap the store | nil |

On start, the store runs a flush loop that writes to the underlying datastore in a separate routine. This keeps writes controlled and manageable from one place, so that:

* `Append` calls are not blocked on long disk IO writes and underlying DB compactions
* header writes can be batched

`Append` appends a list of headers to the store head. It requires all headers to be adjacent to each other (sequential). `Append` also performs adjacency verification by calling the `Verify` method of the header interface, ensuring that only verified headers are appended. As described above, `Append` does not write directly to the underlying datastore; that is handled by the flush loop.

The `Has` method checks whether a header with the given hash exists in the store. The check is performed on a cache ([lru.ARCCache][lru.ARCCache]) first, followed by the pending queue, which contains headers that are not yet flushed (written to disk), and finally the datastore. The `Get` method works similarly to `Has`: retrieval first checks the cache, then the pending queue, and finally the datastore (disk access).
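
The lookup order can be illustrated with three maps standing in for the layers. The `layeredStore` type and the returned layer name are hypothetical; the real store uses an ARC cache, a pending batch, and a datastore.

```go
package main

import "fmt"

// layeredStore models the three places a header can live.
type layeredStore struct {
	cache   map[string]bool // stands in for the ARC cache
	pending map[string]bool // appended but not yet flushed to disk
	disk    map[string]bool // stands in for the datastore
}

// has reports whether hash is known and which layer answered,
// checking the cheapest layer first.
func (s *layeredStore) has(hash string) (bool, string) {
	switch {
	case s.cache[hash]:
		return true, "cache"
	case s.pending[hash]:
		return true, "pending"
	case s.disk[hash]:
		return true, "datastore"
	}
	return false, ""
}

func main() {
	s := &layeredStore{
		cache:   map[string]bool{"h3": true},
		pending: map[string]bool{"h4": true},
		disk:    map[string]bool{"h1": true, "h3": true},
	}
	fmt.Println(s.has("h3")) // prints true cache
	fmt.Println(s.has("h4")) // prints true pending
	fmt.Println(s.has("h1")) // prints true datastore
	fmt.Println(s.has("h9")) // prints false
}
```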

# References

[1] [datastore][go-datastore]

[2] [lru.ARCCache][lru.ARCCache]

[go-datastore]: https://github.com/ipfs/go-datastore
[lru.ARCCache]: https://github.com/hashicorp/golang-lru
67 changes: 67 additions & 0 deletions sync/sync.md
@@ -0,0 +1,67 @@
# Sync

Syncer implements efficient synchronization for headers.

There are two main processes running in Syncer:

* Main syncing loop (`syncLoop`)
  * Performs syncing from the latest stored header up to the latest known Subjective Head
    * Subjective head: the latest known local valid header and a sync target.
    * Network head: the latest valid network-wide header. Becomes subjective once applied locally.
  * Syncs by requesting missing headers from Exchange or
  * By accessing cache of pending headers
* Receives every new Network Head from PubSub gossip subnetwork (`incomingNetworkHead`)
  * Validates against the latest known Subjective Head and, if valid,
  * Sets it as the new Subjective Head, which,
  * If there is a gap between the previous and the new Subjective Head,
  * Triggers `syncLoop` and saves the Subjective Head in the pending cache so `syncLoop` can access it

Creating a new instance of the Syncer requires the following components:

* A getter, e.g., [Exchange][exchange]
* A [Store][store]
* A [Subscriber][subscriber]
* Additional options such as block time. More options as described below.

Options for configuring the syncer:

|Parameter|Type|Description|Default|
|--|--|--|--|
| TrustingPeriod | time.Duration | TrustingPeriod is period through which we can trust a header's validators set. Should be significantly less than the unbonding period (e.g. unbonding period = 3 weeks, trusting period = 2 weeks). More specifically, trusting period + time needed to check headers + time needed to report and punish misbehavior should be less than the unbonding period. | 336 hours (tendermint's default trusting period) |
| blockTime | time.Duration | blockTime provides a reference point for the Syncer to determine whether its subjective head is outdated. Keeping it private to disable serialization for it. | 0 (reason: syncer will constantly request networking head.) |
| recencyThreshold | time.Duration | recencyThreshold describes the time period for which a header is considered "recent". | blockTime + 5 seconds |

When the syncer is started:

* `incomingNetworkHead` is set as the verifier for any incoming header over the subscriber
* The latest head is retrieved to kick off syncing. Note that the syncer cannot be started without a head.
* `syncLoop` is started, which listens to sync trigger

## Fetching the Head to Start Syncing

The known subjective head is considered the network head if it is recent (`now - timestamp <= blocktime`). Otherwise, a head is requested from a trusted peer and set as the new subjective head, assuming that the trusted peer is always fully synced.

* In the event that the network head cannot be retrieved and the subjective head is not recent, the subjective head is used as the head as a fallback.
* The retrieved network head is subjected to validation (via `incomingNetworkHead`) before it is set as the new subjective head.
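
The decision described above can be sketched as a single function. The `head` type, the `chooseHead` and `demo` names, and the `fetch` and `verify` callbacks (standing in for the trusted-peer request and the `incomingNetworkHead` validation) are hypothetical; keeping the subjective head when validation fails is also an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// head is a simplified header with only the fields the decision needs.
type head struct {
	Height uint64
	Time   time.Time
}

// chooseHead returns the head to start syncing from.
func chooseHead(subjective head, now time.Time, blockTime time.Duration,
	fetch func() (head, error), verify func(head) error) head {
	if now.Sub(subjective.Time) <= blockTime {
		return subjective // recent: treat it as the network head
	}
	network, err := fetch()
	if err != nil {
		return subjective // fallback: the network head is unavailable
	}
	if err := verify(network); err != nil {
		return subjective // assumption: an invalid network head is ignored
	}
	return network
}

// demo exercises the recent, stale, and fallback paths.
func demo() (recent, stale, fallback uint64) {
	now := time.Unix(1_700_000_000, 0)
	blockTime := 10 * time.Second
	ok := func() (head, error) { return head{Height: 120, Time: now}, nil }
	fail := func() (head, error) { return head{}, errors.New("no trusted peer reachable") }
	accept := func(head) error { return nil }

	fresh := head{Height: 100, Time: now.Add(-5 * time.Second)}
	old := head{Height: 100, Time: now.Add(-time.Minute)}

	recent = chooseHead(fresh, now, blockTime, ok, accept).Height
	stale = chooseHead(old, now, blockTime, ok, accept).Height
	fallback = chooseHead(old, now, blockTime, fail, accept).Height
	return recent, stale, fallback
}

func main() {
	fmt.Println(demo()) // prints 100 120 100
}
```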

## Verify

The header interface defines a `Verify` method which gets invoked when any new header is received via `incomingNetworkHead`.

|Method|Input|Output|Description|
|--|--|--|--|
| Verify | H | error | Verify validates given untrusted Header against trusted Header. |

## syncLoop

When a new network head is received, validated, and set as the subjective head, it triggers the `syncLoop`, which tries to sync headers from the old subjective head up to the new network head (the sync target) using the `getter` (the `Exchange` client).

# References

[1] [Exchange][exchange]

[2] [Subscriber][subscriber]

[exchange]: https://github.com/celestiaorg/go-header/blob/main/p2p/exchange.go
[subscriber]: https://github.com/celestiaorg/go-header/blob/main/p2p/subscriber.go
[store]: https://github.com/celestiaorg/go-header/blob/main/store/store.md