From eba0199283c8d719e73437d47ca4924084547fce Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Mon, 20 Jan 2025 12:30:53 +0100 Subject: [PATCH 01/10] reconstruction draft --- docs/adr/reconstruction.md | 289 +++++++++++++++++++++++++++++++++++++ 1 file changed, 289 insertions(+) create mode 100644 docs/adr/reconstruction.md diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md new file mode 100644 index 0000000000..cdc37fa308 --- /dev/null +++ b/docs/adr/reconstruction.md @@ -0,0 +1,289 @@ +# DRAFT + +This document is wip reconstruction protocol descrtipion. It is high level overview of proposal to spark the first iteration of discussion. The document will be split into CIP and ADR later. +## Abstract + +This document proposes a new block reconstruction protocol that addresses key bottlenecks in the current implementation, specifically targeting duplicate request reduction and bandwidth optimization. The protocol introduces a structured approach to data retrieval with explicit focus on network resource efficiency and scalability. + +## Motivation + +The current block reconstruction protocol faces several limitations: +1. High frequency of duplicate requests leading to network inefficiency +2. Suboptimal bandwidth utilization +3. Limited scalability with increasing block sizes and node count + +Key improvements include: +- Structured bitmap sharing +- Optimized proof packaging +- Efficient state management +- Robust failure handling + +This proposal aims to implement an efficient reconstruction protocol that: +- Minimizes duplicate data requests +- Optimizes bandwidth usage through smart data packaging +- Supports network scaling across multiple dimensions +- Maintains stability under varying network conditions + +Engineering time. + +Initial draft aims to be optimised in terms of engineering efforts required for first iteration of implementation. 
Optimizations marked as optional can be implemented in subsequent updates based on network performance metrics. + +## Specification + +### Performance Requirements + +1. Base Scenario Support: + - 32MB block size + - Minimum required light nodes for 32MB blocks + - Network of 50+ full nodes + +2. Performance Metrics (in order of priority): + - System stability + - Reconstruction throughput (blocks/second) + - Per-block reconstruction time + +3. Scalability Dimensions: + - Block size scaling + - Node count scaling + + + +### Protocol Flow + +``` +1. Connection Establishment + Client Server + |---- Handshake (node type) ----->| + |<---- Handshake (node type) -----| + +2. Bitmap Subscription + Client Server + |---- Subscribe to bitmap ------------->| + |<---- Initial bitmap ------------------| + |<---- Updates ------------------------| + |<---- End updates(full eds/max samples-| + +3. Data Request + Client Server + |---- Request(bitmap) ----------->| + |<---- [Samples + Proof] parts ---| +``` + +### Bitmap Implementation + +The protocol utilizes Roaring Bitmaps for efficient bitmap operations and storage. Roaring Bitmaps provide several advantages for the reconstruction protocol: + +1. Efficient Operations + - Fast logical operations (AND, OR, XOR) + - Optimized for sparse and dense data + - Memory-efficient storage + +2. Implementation Benefits + - Native support for common bitmap operations + - Optimized serialization + - Efficient iteration over set bits + - Support for rank/select operations + +3. Performance Characteristics + - O(1) for most common operations + - Compressed storage format + - Efficient memory utilization + - Fast bitmap comparisons + + +### Core Protocol Components + +#### 1. Connection Management + +Handshakes, Node Type Identification +- Node type must be declared during connection handshake by each of connection nodes +- Active connection state maintenance on full nodes. 
List should be maintained by node type +- Connection state should allow subscription for updated + +#### 2. Sample Bitmap Protocol + +2.1. Bitmap Subscription Flow + +- Requested by block height +- Implements one-way stream for bitmap updates +- Full nodes: Stream until complete EDS bitmap +- Light nodes: Stream until max sampling amount reached +- Include end-of-subscription flag for easier debugging + +2.2. Update Requirements +- Minimum update frequency: every 10 seconds +- Server-side update triggers: + - Time-based: Every 5 seconds. Default in base implementation. + - Optional: Change-based with threshold +- Penalty system for delayed updates + +#### 3. Reconstruction State Management + +3.1. Global State Components +```go +type RemoteState struct { + coords [][]peers // Peer lists by coordinates + rows []peers // Peers with row data + cols []peers // Peers with column data + available bitmap // Available samples bitmap +} + +// Basic peer list structure.Structure might be replaced later to implement better +//peer scoring mechanics +type peers []peer.ID +``` + +3.2. State Query Interface + +Remote state: +- GetPeersWithSample(Coords) -> []peer.ID +- GetPeersWithAxis(axisIdx, axisType) -> []peer.ID +- Available() -> bitmap + +Local state: +- InProgress() -> bitmap. Tracks ongoing fetch sessions to prevent requesting duplicates +- Have() -> bitmap. Tracks + + + +#### 4. Data Request +There should be global per block coordinator process that will be responsible for managing the data request process. + +4.1. Request Initiation may have multiple strategies: +- Immediate request of all missing samples bitmap receipt +- Optional: Delayed start for bandwidth optimization + - Wait for X% peer responses + - Fixed time delay + - Complete EDS availability + - Pre-confirmation threshold + - Combination of conditions + +4.1.1 what and were (to request) biggest question. 
+ +First iteration of decision engine can be done as simple as possible to allow easier testing of other components and prove the concept. The base properties should be: +- Eliminate requests for duplicate data +- Do not request data, that can be derived from other data. Request just enough data for successful reconstruction + + +Following potential optimization trategies +- Skip encoded derivable data in response by requesting from peers that have all shares from the same rows/columns +- Optimize proof sizes through range requests +- Optimize proof sizes through subtree proofs, if adjacent subroots are stored +- Parallel request distribution to reduce network load on single peers +- Request from peers with smaller latency + + +4.2. Sample Request Protocol + +4.2.1. Request + +- Update InProgress bitmap before request initiation +- Use shrex for data retrieval. Send bitmap for data request. Shrex has built-in support for requesting rows and samplies, however bitmap-based data retrieval is more general and would be easier to support in future, because it allows requesting multiples pieces of data with single request instead of multiple requests +```go +type SampleRequest struct { + Samples *roaring.Bitmap +} +``` + + +4.2.2. Response +- (base version) Server should respond with samples with proofs + +Optiomisations: +- Server may split response into multiple parts +Each part contains packed samples in Range format. +- Each part of shoold have specified message prefix. It would allow client to identify sample container type, which would help to maintain backwards compatibility in case of breaking changes in packing algorithm +- Adjacent samples can be packed together to share common proofs +- If both Row and column are requested, intersections share can be sent once. + + +4.3 Client response handling +- If response is timeout, retry request from other peers. Potentially penalize slow peer. +- Client should verify proofs. 
If proof is invalid, peer should be penalized +- Verified samples should stored and added to local state `Have` +- Clean up InProgress bitmap + - Clean up information from remote state to free up memory +- If reconstruction is complete clean up local state, shut down reconstruction process and all subscriptions + +### Server-Side Implementation + +#### 1. Storage Interface +New storage format needs to be implemented for efficient storage of sampels proofs. The format will be used +initially used for storing ongoing reconstruction process and later can be used for light node storage. + +1.1. Core Requirements +- Sample storage with proofs +- Allow purge of proofs on successful reconstruction +- Bitmap query support +- Row/column access implementation +- Accessor interface compliance + +1.2. Optional Optimizations +- Bitmap subscription support for callbacks +- Efficient proof generation to reduce proofs size overhead + +#### 2. Request Processing + +2.1. Sample Request Handling +- Process bitmap-based requests +- Support multi-part responses +- Implement range-based packaging +- Optimize proof generation + +2.2. Response Optimization +- Common proof sharing +- Interval-based response packaging +- Share deduplication for intersections + + + +### Optimization Details + +1. Bandwidth Optimization + - Share common proofs across samples + - Package adjacent samples + - Optimize proof sizes for known data + - Implement efficient encoding schemes + +2. Request Distribution + - Load balancing across peers + - Geographic optimization + - Parallel request handling + - Adaptive timeout management + +3. State Management Optimization + - Efficient bitmap operations + - Memory-optimized state tracking + - Progressive cleanup of completed data + - Optimized proof verification + +## Rationale + +The design decisions in this proposal are driven by: +1. Need for efficient bandwidth utilization +2. Importance of stable reconstruction under varying conditions +3. 
Support for network scaling +4. Maintenance of security guarantees + + + +## Backwards Compatibility + +In lifespan of protocol it may requires a coordinated network upgrade. Implementation should allow for: +1. Version negotiation +2. Transition period support +3. Fallback mechanisms + +## List of core components + + +1. Handshake protocol +2. Active connections management +3. Bitmap subscription protocol +4. Decision engine and state management +5. Client to request set of samples +6. Samples store +7. Samples server with packaging + + + From c5ceeeca3c8818d8c9a6446afd85d8c0d715d02e Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Mon, 20 Jan 2025 12:40:26 +0100 Subject: [PATCH 02/10] reconstruction draft --- docs/adr/reconstruction.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index cdc37fa308..cc525f0041 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -1,6 +1,3 @@ -# DRAFT - -This document is wip reconstruction protocol descrtipion. It is high level overview of proposal to spark the first iteration of discussion. The document will be split into CIP and ADR later. ## Abstract This document proposes a new block reconstruction protocol that addresses key bottlenecks in the current implementation, specifically targeting duplicate request reduction and bandwidth optimization. The protocol introduces a structured approach to data retrieval with explicit focus on network resource efficiency and scalability. 
From f085b228c6af58cfb12878c8b06d6ff54af0d334 Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Mon, 27 Jan 2025 18:07:27 +0100 Subject: [PATCH 03/10] refactor --- docs/adr/reconstruction.md | 427 ++++++++++++++++++++----------------- 1 file changed, 228 insertions(+), 199 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index cc525f0041..95bbc1a199 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -1,6 +1,6 @@ ## Abstract -This document proposes a new block reconstruction protocol that addresses key bottlenecks in the current implementation, specifically targeting duplicate request reduction and bandwidth optimization. The protocol introduces a structured approach to data retrieval with explicit focus on network resource efficiency and scalability. +This document proposes a new block reconstruction protocol that addresses key bottlenecks in the current implementation, specifically targeting duplicate request reduction and bandwidth optimization. The protocol introduces a structured approach to data retrieval with an explicit focus on network resource efficiency and scalability. ## Motivation @@ -21,266 +21,295 @@ This proposal aims to implement an efficient reconstruction protocol that: - Supports network scaling across multiple dimensions - Maintains stability under varying network conditions -Engineering time. +#### Engineering Time -Initial draft aims to be optimised in terms of engineering efforts required for first iteration of implementation. Optimizations marked as optional can be implemented in subsequent updates based on network performance metrics. +The initial draft aims to be optimized in terms of engineering efforts required for the first iteration of implementation. Optimizations marked as optional can be implemented in subsequent updates based on network performance metrics. 
## Specification -### Performance Requirements +### General Performance Requirements 1. Base Scenario Support: - - 32MB block size - - Minimum required light nodes for 32MB blocks - - Network of 50+ full nodes + - 32MB block size + - Minimum required light nodes for 32MB blocks + - Network of 50+ full nodes 2. Performance Metrics (in order of priority): - - System stability - - Reconstruction throughput (blocks/second) - - Per-block reconstruction time + - System stability + - Reconstruction throughput (blocks/second) + - Per-block reconstruction time 3. Scalability Dimensions: - - Block size scaling - - Node count scaling + - Block size scaling + - Node count scaling +### Protocol Flow - -### Protocol Flow - +Diagram below outlines the high-level flow of the proposed protocols. The detailed specifications are provided in the subsequent sections. Full flow diiagrams are available in the end of this document. ``` -1. Connection Establishment - Client Server - |---- Handshake (node type) ----->| - |<---- Handshake (node type) -----| - -2. Bitmap Subscription +1. Bitmap Subscription Client Server - |---- Subscribe to bitmap ------------->| - |<---- Initial bitmap ------------------| - |<---- Updates ------------------------| - |<---- End updates(full eds/max samples-| + |---- Subscribe to bitmap -------------->| + |<---- Initial bitmap -------------------| + |<---- Updates -------------------------| + |<---- End updates(full eds/max samples)-| -3. Data Request +2. Data Request Client Server |---- Request(bitmap) ----------->| |<---- [Samples + Proof] parts ---| ``` -### Bitmap Implementation - -The protocol utilizes Roaring Bitmaps for efficient bitmap operations and storage. Roaring Bitmaps provide several advantages for the reconstruction protocol: - -1. Efficient Operations - - Fast logical operations (AND, OR, XOR) - - Optimized for sparse and dense data - - Memory-efficient storage - -2. 
Implementation Benefits - - Native support for common bitmap operations - - Optimized serialization - - Efficient iteration over set bits - - Support for rank/select operations - -3. Performance Characteristics - - O(1) for most common operations - - Compressed storage format - - Efficient memory utilization - - Fast bitmap comparisons - - -### Core Protocol Components +## Core Components -#### 1. Connection Management +### 1. Reconstruction Process +There should be a global per-block coordinator process that will be responsible for managing the data request process. -Handshakes, Node Type Identification -- Node type must be declared during connection handshake by each of connection nodes -- Active connection state maintenance on full nodes. List should be maintained by node type -- Connection state should allow subscription for updated +1. Request Initiation may have multiple strategies: + - Immediate request of all missing samples upon bitmap receipt + - (Optional): Delayed start for bandwidth optimization + - Wait for X% peer responses + - Fixed time delay + - Complete EDS availability + - Pre-confirmation threshold + - Combination of conditions -#### 2. Sample Bitmap Protocol +2. Select which samples to request and from which peers -2.1. Bitmap Subscription Flow - -- Requested by block height -- Implements one-way stream for bitmap updates -- Full nodes: Stream until complete EDS bitmap -- Light nodes: Stream until max sampling amount reached -- Include end-of-subscription flag for easier debugging - -2.2. Update Requirements -- Minimum update frequency: every 10 seconds -- Server-side update triggers: - - Time-based: Every 5 seconds. Default in base implementation. - - Optional: Change-based with threshold -- Penalty system for delayed updates - -#### 3. Reconstruction State Management +The first iteration of the decision engine can be implemented as simply as possible to allow easier testing of other components and prove the concept. 
The base properties should be: +- Eliminate requests for duplicate data +- Do not request data that can be derived from other data. Request just enough data for successful reconstruction + +#### First Implementation: +1. Subscribe to bitmap updates +2. Handle bitmap updates. If any sample is not stored and not in progress, request it from a peer + - Keep track of in-progress requests in local state +3. Handle sample responses + - Verify proofs. If a proof is invalid, the peer should be penalized + - Store samples in local store and update local Have state + - Remove samples from in-progress bitmap + - Clean up information about sample coordinates from remote state to free up memory +4. If reconstruction is complete, clean up local state and shut down the reconstruction process + +#### Potential Optimizations +- Skip encoded derivable data in response by requesting from peers that have all shares from the same rows/columns +- Optimize proof sizes through range requests +- Optimize proof sizes through subtree proofs if adjacent subroots are stored +- Parallel request distribution to reduce network load on single peers +- Request from peers with smaller latency -3.1. Global State Components +### 2. State Management +1. Remote State will store information about peers that have samples for given coordinates. If it has full rows/columns, it will be stored in a separate list. ```go type RemoteState struct { coords [][]peers // Peer lists by coordinates - rows []peers // Peers with row data - cols []peers // Peers with column data + rows []peers // Peers with full row data + cols []peers // Peers with full column data available bitmap // Available samples bitmap } -// Basic peer list structure.Structure might be replaced later to implement better -//peer scoring mechanics +// Basic peer list structure. Structure might be replaced later to implement better +// peer scoring mechanics type peers []peer.ID ``` -3.2. 
State Query Interface +Query Interface Remote state: -- GetPeersWithSample(Coords) -> []peer.ID -- GetPeersWithAxis(axisIdx, axisType) -> []peer.ID -- Available() -> bitmap - -Local state: -- InProgress() -> bitmap. Tracks ongoing fetch sessions to prevent requesting duplicates -- Have() -> bitmap. Tracks - - - -#### 4. Data Request -There should be global per block coordinator process that will be responsible for managing the data request process. - -4.1. Request Initiation may have multiple strategies: -- Immediate request of all missing samples bitmap receipt -- Optional: Delayed start for bandwidth optimization - - Wait for X% peer responses - - Fixed time delay - - Complete EDS availability - - Pre-confirmation threshold - - Combination of conditions - -4.1.1 what and were (to request) biggest question. - -First iteration of decision engine can be done as simple as possible to allow easier testing of other components and prove the concept. The base properties should be: -- Eliminate requests for duplicate data -- Do not request data, that can be derived from other data. 
Request just enough data for successful reconstruction - - -Following potential optimization trategies -- Skip encoded derivable data in response by requesting from peers that have all shares from the same rows/columns -- Optimize proof sizes through range requests -- Optimize proof sizes through subtree proofs, if adjacent subroots are stored -- Parallel request distribution to reduce network load on single peers -- Request from peers with smaller latency +```go +func (s *RemoteState) GetPeersWithSample(coords []Coord) []peer.ID +func (s *RemoteState) GetPeersWithAxis(axisIdx int, axisType AxisType) []peer.ID +func (s *RemoteState) Available() bitmap +``` +Progress state: +```go +// Tracks ongoing fetch sessions to prevent requesting duplicates +func (s *ProgressState) InProgress() bitmap +// Tracks samples that are already stored locally to notify peers about it +func (s *ProgressState) Have() bitmap +``` -4.2. Sample Request Protocol +## Bitmap Protocol +### Client +- Client should send a request to subscribe to bitmap updates +- If the subscription gets closed or interrupted, client should re-subscribe -4.2.1. Request +#### Request +```protobuf +message SubscribeBitmapRequest { + uint64 height = 1; +} +``` -- Update InProgress bitmap before request initiation -- Use shrex for data retrieval. Send bitmap for data request. 
Shrex has built-in support for requesting rows and samplies, however bitmap-based data retrieval is more general and would be easier to support in future, because it allows requesting multiples pieces of data with single request instead of multiple requests -```go -type SampleRequest struct { - Samples *roaring.Bitmap +### Server +- Server implements a one-way stream for bitmap updates +- Server should send the first bitmap update immediately +- Next updates should be sent every 5 seconds + - (Optional): Server can send updates more frequently if there is a significant change in the bitmap +- Server should send an end-of-subscription flag when no more updates are expected + - Full nodes: stream until EDS is available on server + - Light nodes: stream until max sampling amount is reached + - (Optional): Light node can send a single response with bitmap and end flag upon successful sampling + +#### Response +``` +message BitmapUpdate { + Bitmap bitmap = 1; + bool is_end = 2; } ``` +The protocol utilizes Roaring Bitmaps for efficient bitmap operations and storage. Roaring Bitmaps provide several advantages for the reconstruction protocol: -4.2.2. Response -- (base version) Server should respond with samples with proofs +1. Efficient Operations + - Fast logical operations (AND, OR, XOR) + - Optimized for sparse and dense data + - Memory-efficient storage -Optiomisations: -- Server may split response into multiple parts -Each part contains packed samples in Range format. -- Each part of shoold have specified message prefix. It would allow client to identify sample container type, which would help to maintain backwards compatibility in case of breaking changes in packing algorithm -- Adjacent samples can be packed together to share common proofs -- If both Row and column are requested, intersections share can be sent once. +2. 
Implementation Benefits + - Native support for common bitmap operations + - Optimized serialization + - Efficient iteration over set bits + - Support for rank/select operations +3. Performance Characteristics + - O(1) for most common operations + - Compressed storage format + - Efficient memory utilization + - Fast bitmap comparisons + +The protocol will use 32-bit encoding for bitmaps to have greater multi-language support. Implementation can use one of the encoding-compatible libraries: +- Go: https://github.com/RoaringBitmap/roaring +- Rust: https://github.com/RoaringBitmap/roaring-rs +- C++: https://github.com/RoaringBitmap/CRoaring +- Java: https://github.com/RoaringBitmap/RoaringBitmap + +## Samples Request Protocol + +### Request + +- Use shrex for data retrieval +- Send bitmap for data request. Bitmap should contain coordinates for requested samples +```protobuf +message SampleRequest { + uint64 height = 1; + Bitmap bitmap = 2; +} +``` -4.3 Client response handling -- If response is timeout, retry request from other peers. Potentially penalize slow peer. -- Client should verify proofs. If proof is invalid, peer should be penalized -- Verified samples should stored and added to local state `Have` -- Clean up InProgress bitmap - - Clean up information from remote state to free up memory -- If reconstruction is complete clean up local state, shut down reconstruction process and all subscriptions +### Response +Server should respond with samples with proofs defined in shwap CIP [past link]. +```protobuf +message SamplesResponse { + repeated Sample samples = 1; +} +``` -### Server-Side Implementation +#### Optimizations: +- Adjacent samples can have common proofs. Server would need to send shares with common proof in a single response. + Each part contains packed samples in Range format +- If both Row and column are requested, intersection share can be sent once -#### 1. 
Storage Interface -New storage format needs to be implemented for efficient storage of sampels proofs. The format will be used +## Storage Backend +A new storage format needs to be implemented for efficient storage of sample proofs. The format will be initially used for storing ongoing reconstruction process and later can be used for light node storage. -1.1. Core Requirements +1. Core Requirements - Sample storage with proofs - Allow purge of proofs on successful reconstruction - Bitmap query support - Row/column access implementation - Accessor interface compliance -1.2. Optional Optimizations +2. Optional Optimizations - Bitmap subscription support for callbacks -- Efficient proof generation to reduce proofs size overhead - -#### 2. Request Processing - -2.1. Sample Request Handling -- Process bitmap-based requests -- Support multi-part responses -- Implement range-based packaging -- Optimize proof generation - -2.2. Response Optimization -- Common proof sharing -- Interval-based response packaging -- Share deduplication for intersections - - - -### Optimization Details - -1. Bandwidth Optimization - - Share common proofs across samples - - Package adjacent samples - - Optimize proof sizes for known data - - Implement efficient encoding schemes - -2. Request Distribution - - Load balancing across peers - - Geographic optimization - - Parallel request handling - - Adaptive timeout management - -3. State Management Optimization - - Efficient bitmap operations - - Memory-optimized state tracking - - Progressive cleanup of completed data - - Optimized proof verification - -## Rationale - -The design decisions in this proposal are driven by: -1. Need for efficient bandwidth utilization -2. Importance of stable reconstruction under varying conditions -3. Support for network scaling -4. 
Maintenance of security guarantees - - +- Efficient proof generation to reduce proof size overhead ## Backwards Compatibility -In lifespan of protocol it may requires a coordinated network upgrade. Implementation should allow for: +In the lifespan of the protocol, it may require a coordinated network upgrade. Implementation should allow for: 1. Version negotiation 2. Transition period support 3. Fallback mechanisms -## List of core components - - -1. Handshake protocol -2. Active connections management -3. Bitmap subscription protocol -4. Decision engine and state management -5. Client to request set of samples -6. Samples store -7. Samples server with packaging - - - +## List of Core Components + +1. Bitmap subscription protocol +2. Decision engine and state management +3. Client to request set of samples +4. Samples store +5. Samples server with packaging + + +## Full reconstruction process diagram +```mermaid +sequenceDiagram + participant N as New Node + participant RS as Remote State + participant RP as Reconstruction Processor + participant PS as Progress State + participant SS as Samples Store + participant P1 as Peer 1 + participant P2 as Peer 2 + participant P3 as Peer 3 + + Note over N,P3: Phase 1: Bitmap Discovery + N->>+P1: Subscribe to bitmap updates + N->>+P2: Subscribe to bitmap updates + N->>+P3: Subscribe to bitmap updates + + P1-->>-N: Initial bitmap + P2-->>-N: Initial bitmap + P3-->>-N: Initial bitmap + + N->>RS: Update remote state + + Note over N,P3: Phase 2: Reconstruction Process Start + N->>RP: Initialize reconstruction + activate RP + + loop Process Bitmaps + RP->>PS: Check progress state + RP->>RS: Query available samples + + Note over RP: Select optimal samples & peers + + par Request Samples + RP->>P1: GetSamples(bitmap subset 1) + RP->>P2: GetSamples(bitmap subset 2) + RP->>P3: GetSamples(bitmap subset 3) + end + + PS->>PS: Mark requests as in-progress + end + + Note over N,P3: Phase 3: Sample Processing + par Process Responses + 
P1-->>RP: Samples + Proofs 1 + P2-->>RP: Samples + Proofs 2 + P3-->>RP: Samples + Proofs 3 + end + + loop For each response + RP->>SS: Verify & store samples + RP->>PS: Update progress + RP->>RS: Update available samples + end + + Note over N,P3: Phase 4: Completion + opt Reconstruction Complete + RP->>SS: Finalize reconstruction + RP->>PS: Clear progress state + RP->>RS: Clear remote state + deactivate RP + end + + Note over N,P3: Phase 5: Continuous Updates + loop Until Complete + P1-->>N: Bitmap updates + P2-->>N: Bitmap updates + P3-->>N: Bitmap updates + N->>RS: Update remote state + end +``` \ No newline at end of file From dc4c620b5375f6516dacda2749c8d9e043e70bb3 Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Mon, 27 Jan 2025 18:58:04 +0100 Subject: [PATCH 04/10] refactor 2 --- docs/adr/reconstruction.md | 26 ++++++++++++++------------ 1 file changed, 14 insertions(+), 12 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index 95bbc1a199..608f22aa17 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -43,7 +43,7 @@ The initial draft aims to be optimized in terms of engineering efforts required - Block size scaling - Node count scaling -### Protocol Flow +## Protocol Flow Diagram below outlines the high-level flow of the proposed protocols. The detailed specifications are provided in the subsequent sections. Full flow diiagrams are available in the end of this document. ``` @@ -62,6 +62,14 @@ Diagram below outlines the high-level flow of the proposed protocols. The detail ## Core Components +#### List of Core Components + +1. Decision engine +2. State management +3. Bitmap subscription protocol +4. Samples retrieval protocol +5. Samples store (new file format) + ### 1. Reconstruction Process There should be a global per-block coordinator process that will be responsible for managing the data request process. 
@@ -130,7 +138,7 @@ func (s *ProgressState) InProgress() bitmap func (s *ProgressState) Have() bitmap ``` -## Bitmap Protocol +### 3. Bitmap Protocol ### Client - Client should send a request to subscribe to bitmap updates - If the subscription gets closed or interrupted, client should re-subscribe @@ -185,9 +193,9 @@ The protocol will use 32-bit encoding for bitmaps to have greater multi-language - C++: https://github.com/RoaringBitmap/CRoaring - Java: https://github.com/RoaringBitmap/RoaringBitmap -## Samples Request Protocol +### 4. Samples Request Protocol -### Request +#### Request - Use shrex for data retrieval - Send bitmap for data request. Bitmap should contain coordinates for requested samples @@ -198,7 +206,7 @@ message SampleRequest { } ``` -### Response +#### Response Server should respond with samples with proofs defined in shwap CIP [past link]. ```protobuf message SamplesResponse { @@ -211,7 +219,7 @@ message SamplesResponse { Each part contains packed samples in Range format - If both Row and column are requested, intersection share can be sent once -## Storage Backend +### 5. Storage Backend A new storage format needs to be implemented for efficient storage of sample proofs. The format will be initially used for storing ongoing reconstruction process and later can be used for light node storage. @@ -233,13 +241,7 @@ In the lifespan of the protocol, it may require a coordinated network upgrade. I 2. Transition period support 3. Fallback mechanisms -## List of Core Components -1. Bitmap subscription protocol -2. Decision engine and state management -3. Client to request set of samples -4. Samples store -5. Samples server with packaging ## Full reconstruction process diagram From 052a31f15e751a31d14c14e2af9fc4adb272fdfd Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Mon, 3 Feb 2025 12:42:29 +0100 Subject: [PATCH 05/10] added: - return handshake based on user agent. 
- Do not subscribe for samples from LN - Request from LN in batches --- docs/adr/reconstruction.md | 79 ++++++++++++++++++++++++++++++++------ 1 file changed, 67 insertions(+), 12 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index 608f22aa17..099ea0c852 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -43,23 +43,62 @@ The initial draft aims to be optimized in terms of engineering efforts required - Block size scaling - Node count scaling -## Protocol Flow - -Diagram below outlines the high-level flow of the proposed protocols. The detailed specifications are provided in the subsequent sections. Full flow diiagrams are available in the end of this document. +## Reconstruction Flow + +Once the Full node identifies it cannot fetch the block using shrex protocol, it will start the reconstruction process. +Full node would need to collect samples from connected Light nodes and Full nodes. It should also allow efficient relay of samples to other Full nodes. In order to +achieve this, it will need to implement the following reconstruction flow: +1. **Get samples from LN**. Start process of collecting samples from connected Light nodes + 1. Use GetSamples protocol to get samples from connected Light nodes + 2. To prevent congestion of returned samples, use batching. Request samples from connected Light nodes in batches with fixed size (e.g. 100 LN at a time). +2. **Subscribe to bitmap updates**. Subscribe to bitmap updates from connected Full nodes + 1. Use SubscribeBitmap protocol to subscribe to bitmap updates from connected Full nodes + 2. If returned bitmap has samples that are not present in the node store, request samples from Full node using GetSamples protocol + +#### No bitmap subscription from Light nodes. Why? +Bitmap subscription from Light nodes would allow reconstructing node (subscriber of bitmaps) to be fully in control +of deduplication of requested samples. 
It will allow to make decision on what samples to request and to not have any duplicates being simultaniously requested However, it would also introduce additional complexity and overhead from round trips. +The fact, that each LN has only few samples from the same block and the probability of overlap is low, alternative solution can be to not use bitmaps and request samples without prior knowledge of what samples LN has. FN would send inverse have bitmap in request indicating what it want. +So the tradeoff would be +- Pros: + - No additional round trips between LN and FN + - LN don't need to maintain subscriptions from FN + - LN does not need implement bitmap subscription protocol +- Cons: + - Some samples might be requested multiple times + +To determine which approach to use, we need to know what is duplicates overhead. Monte carlo simulation can be used to estimate the number of duplicates. +Here is summary of the results: + +| Block Size | % Overhead (ln = 256) | % Overhead (ln = 128) | +|------------|-----------------------|-----------------------| +| 16 | 21 | 24 | +| 32 | 22 | 17 | +| 64 | 9 | 4.7 | +| 128 | 2.4 | 1.12 | +| 256 | 0.57 | 0.28 | +| 512 | 0.14 | 0.07 | + +Results show, that overhead is negligible on large block sizes. Given that overhead is negligible on larger blocks, we can use simpler approach of not using SubscribeBitmap protocol for LN and requesting samples without prior knowledge of what samples LN has. + +#### Protocol diagrams +Diagram below outlines protocols proposed above. The detailed specifications of protocol are provided in the subsequent sections. Full flow diagrams are available in the end of this document. ``` -1. Bitmap Subscription - Client Server +1. Bitmap Subscription + Client (FN) Server (FN) |---- Subscribe to bitmap -------------->| |<---- Initial bitmap -------------------| |<---- Updates -------------------------| |<---- End updates(full eds/max samples)-| -2. Data Request - Client Server +2. 
GetSamples + Client (FN) Server (FN/LN) |---- Request(bitmap) ----------->| |<---- [Samples + Proof] parts ---| ``` + + ## Core Components #### List of Core Components @@ -69,6 +108,7 @@ Diagram below outlines the high-level flow of the proposed protocols. The detail 3. Bitmap subscription protocol 4. Samples retrieval protocol 5. Samples store (new file format) +6. Peer identification ### 1. Reconstruction Process There should be a global per-block coordinator process that will be responsible for managing the data request process. @@ -89,7 +129,7 @@ The first iteration of the decision engine can be implemented as simply as possi - Do not request data that can be derived from other data. Request just enough data for successful reconstruction #### First Implementation: -1. Subscribe to bitmap updates +1. Subscribe to bitmap updates from FN 2. Handle bitmap updates. If any sample is not stored and not in progress, request it from a peer - Keep track of in-progress requests in local state 3. Handle sample responses @@ -139,6 +179,8 @@ func (s *ProgressState) Have() bitmap ``` ### 3. Bitmap Protocol +Bitmap protocol should be implemented by Full Nodes to allow efficient retranslation of samples. It uses bitmaps to sent representation the state of samples stored on Server to allow client to not request samples that it already has. 
+ ### Client - Client should send a request to subscribe to bitmap updates - If the subscription gets closed or interrupted, client should re-subscribe @@ -157,14 +199,12 @@ message SubscribeBitmapRequest { - (Optional): Server can send updates more frequently if there is a significant change in the bitmap - Server should send an end-of-subscription flag when no more updates are expected - Full nodes: stream until EDS is available on server - - Light nodes: stream until max sampling amount is reached - - (Optional): Light node can send a single response with bitmap and end flag upon successful sampling #### Response ``` message BitmapUpdate { Bitmap bitmap = 1; - bool is_end = 2; + bool completed = 2; } ``` @@ -206,8 +246,14 @@ message SampleRequest { } ``` +#### Client +- Client should validate returned samples matches requested bitmap. +- Client should verify proofs. If a proof is invalid, the peer should be penalized + #### Response -Server should respond with samples with proofs defined in shwap CIP [past link]. +- Server should respond with samples with proofs defined in shwap CIP [past link]. +- Server should send samples in a single response. +- If server does not have all requested samples, it should send a partial response with available samples. ```protobuf message SamplesResponse { repeated Sample samples = 1; @@ -234,6 +280,15 @@ initially used for storing ongoing reconstruction process and later can be used - Bitmap subscription support for callbacks - Efficient proof generation to reduce proof size overhead +### 6. Peer Identification + +Peer identification is required by FN to distinguish Full Nodes from Light Nodes, because they will be communicated with different protocols: +- Full Nodes: SubscribeBitmap, GetSamples +- Light Nodes: GetSamples + +Information about the peer can be obtained from the host using user agent or by `libp2p.Identity` protocol. 
+ + ## Backwards Compatibility In the lifespan of the protocol, it may require a coordinated network upgrade. Implementation should allow for: From e506dc679b1a420075afb34ce25e24ca1b731030 Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Mon, 3 Feb 2025 13:01:15 +0100 Subject: [PATCH 06/10] added: - return handshake based on user agent. - Do not subscribe for samples from LN - Request from LN in batches --- docs/adr/reconstruction.md | 354 +++++++++++++++++++------------------ 1 file changed, 183 insertions(+), 171 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index 099ea0c852..3dea095f40 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -1,95 +1,101 @@ ## Abstract -This document proposes a new block reconstruction protocol that addresses key bottlenecks in the current implementation, specifically targeting duplicate request reduction and bandwidth optimization. The protocol introduces a structured approach to data retrieval with an explicit focus on network resource efficiency and scalability. +This document proposes a new block reconstruction protocol that addresses key bottlenecks in the current implementation, particularly focusing on reducing duplicate requests and optimizing bandwidth usage. The protocol provides a structured approach to data retrieval with a clear emphasis on network resource efficiency and scalability. ## Motivation The current block reconstruction protocol faces several limitations: -1. High frequency of duplicate requests leading to network inefficiency + +1. Frequent duplicate requests, resulting in network inefficiency 2. Suboptimal bandwidth utilization -3. Limited scalability with increasing block sizes and node count +3. 
Limited scalability as block sizes and node counts increase Key improvements include: + - Structured bitmap sharing - Optimized proof packaging - Efficient state management - Robust failure handling -This proposal aims to implement an efficient reconstruction protocol that: +This proposal aims to implement a more efficient reconstruction protocol that: + - Minimizes duplicate data requests - Optimizes bandwidth usage through smart data packaging -- Supports network scaling across multiple dimensions -- Maintains stability under varying network conditions +- Scales effectively across multiple dimensions +- Remains stable under varying network conditions -#### Engineering Time +### Engineering Time -The initial draft aims to be optimized in terms of engineering efforts required for the first iteration of implementation. Optimizations marked as optional can be implemented in subsequent updates based on network performance metrics. +The initial draft targets minimal engineering effort for the first implementation iteration. Optimizations marked as optional can be added later based on observed network performance metrics. ## Specification ### General Performance Requirements -1. Base Scenario Support: - - 32MB block size - - Minimum required light nodes for 32MB blocks - - Network of 50+ full nodes - -2. Performance Metrics (in order of priority): - - System stability - - Reconstruction throughput (blocks/second) - - Per-block reconstruction time - -3. Scalability Dimensions: - - Block size scaling - - Node count scaling - -## Reconstruction Flow - -Once the Full node identifies it cannot fetch the block using shrex protocol, it will start the reconstruction process. -Full node would need to collect samples from connected Light nodes and Full nodes. It should also allow efficient relay of samples to other Full nodes. In order to -achieve this, it will need to implement the following reconstruction flow: -1. **Get samples from LN**. 
Start process of collecting samples from connected Light nodes - 1. Use GetSamples protocol to get samples from connected Light nodes - 2. To prevent congestion of returned samples, use batching. Request samples from connected Light nodes in batches with fixed size (e.g. 100 LN at a time). -2. **Subscribe to bitmap updates**. Subscribe to bitmap updates from connected Full nodes - 1. Use SubscribeBitmap protocol to subscribe to bitmap updates from connected Full nodes - 2. If returned bitmap has samples that are not present in the node store, request samples from Full node using GetSamples protocol - -#### No bitmap subscription from Light nodes. Why? -Bitmap subscription from Light nodes would allow reconstructing node (subscriber of bitmaps) to be fully in control -of deduplication of requested samples. It will allow to make decision on what samples to request and to not have any duplicates being simultaniously requested However, it would also introduce additional complexity and overhead from round trips. -The fact, that each LN has only few samples from the same block and the probability of overlap is low, alternative solution can be to not use bitmaps and request samples without prior knowledge of what samples LN has. FN would send inverse have bitmap in request indicating what it want. -So the tradeoff would be -- Pros: - - No additional round trips between LN and FN - - LN don't need to maintain subscriptions from FN - - LN does not need implement bitmap subscription protocol -- Cons: - - Some samples might be requested multiple times - -To determine which approach to use, we need to know what is duplicates overhead. Monte carlo simulation can be used to estimate the number of duplicates. -Here is summary of the results: - -| Block Size | % Overhead (ln = 256) | % Overhead (ln = 128) | +1. **Base Scenario Support** + - 32 MB block size + - Minimum required Light Nodes for 32 MB blocks + - Network of 50+ Full Nodes + +2. 
**Performance Metrics (in order of priority)** + - System stability + - Reconstruction throughput (blocks/second) + - Per-block reconstruction time + +3. **Scalability Dimensions** + - Block size scaling + - Node count scaling + +## Reconstruction Flow + +If a Full Node cannot retrieve a block via the shrex protocol, it initiates the reconstruction process. During this process, the Full Node collects samples from both connected Light Nodes and Full Nodes, and enables efficient sample relaying to other Full Nodes. The following steps outline the reconstruction flow: + +1. **Get samples from LNs.** + - Use the GetSamples protocol to retrieve samples from connected Light Nodes. + - To avoid congestion, request samples in batches (e.g., up to 100 Light Nodes at a time). + +2. **Subscribe to bitmap updates.** + - Use the SubscribeBitmap protocol to receive bitmap updates from connected Full Nodes. + - If the returned bitmap contains samples not present locally, request them from the Full Node using GetSamples. + +### No Bitmap Subscription From Light Nodes: Why? + +Subscribing to bitmaps from Light Nodes would allow the reconstructing node to fully control deduplication of requested samples, preventing simultaneous duplicate requests. However, it also introduces extra complexity and round-trip overhead. Since each Light Node holds only a few samples from any given block and the probability of overlap is low, a simpler approach is to skip bitmap subscriptions for LNs. Instead, the Full Node can send an inverse “have” bitmap in the request to indicate which samples it still needs. + +**Tradeoff:** + +- **Pros** + - No additional round trips between LN and FN + - Light Nodes do not need to maintain subscriptions + - Light Nodes do not need to implement the bitmap subscription protocol + +- **Cons** + - Some samples may be requested multiple times + +To quantify the overhead of duplicate requests, a Monte Carlo simulation was conducted. 
Below is a summary of the results: + +| Block Size | % Overhead (LN = 256) | % Overhead (LN = 128) | |------------|-----------------------|-----------------------| -| 16 | 21 | 24 | -| 32 | 22 | 17 | -| 64 | 9 | 4.7 | -| 128 | 2.4 | 1.12 | -| 256 | 0.57 | 0.28 | -| 512 | 0.14 | 0.07 | +| 16 | 21 | 24 | +| 32 | 22 | 17 | +| 64 | 9 | 4.7 | +| 128 | 2.4 | 1.12 | +| 256 | 0.57 | 0.28 | +| 512 | 0.14 | 0.07 | -Results show, that overhead is negligible on large block sizes. Given that overhead is negligible on larger blocks, we can use simpler approach of not using SubscribeBitmap protocol for LN and requesting samples without prior knowledge of what samples LN has. +These results show that overhead is negligible for larger block sizes. Therefore, to keep the protocol simpler, we will not use SubscribeBitmap for Light Nodes and will request samples directly. + +### Protocol Diagrams + +Below is an outline of the proposed protocols. Detailed specifications are provided in subsequent sections. Full flow diagrams are at the end of this document. -#### Protocol diagrams -Diagram below outlines protocols proposed above. The detailed specifications of protocol are provided in the subsequent sections. Full flow diagrams are available in the end of this document. ``` 1. Bitmap Subscription Client (FN) Server (FN) |---- Subscribe to bitmap -------------->| |<---- Initial bitmap -------------------| |<---- Updates -------------------------| - |<---- End updates(full eds/max samples)-| + |<---- End updates ----------------------| 2. GetSamples Client (FN) Server (FN/LN) @@ -97,57 +103,61 @@ Diagram below outlines protocols proposed above. The detailed specifications of |<---- [Samples + Proof] parts ---| ``` - - ## Core Components -#### List of Core Components +### List of Core Components -1. Decision engine -2. State management +1. Decision engine +2. State management 3. Bitmap subscription protocol -4. Samples retrieval protocol +4. Samples retrieval protocol 5. 
Samples store (new file format) 6. Peer identification ### 1. Reconstruction Process -There should be a global per-block coordinator process that will be responsible for managing the data request process. -1. Request Initiation may have multiple strategies: - - Immediate request of all missing samples upon bitmap receipt - - (Optional): Delayed start for bandwidth optimization - - Wait for X% peer responses - - Fixed time delay - - Complete EDS availability - - Pre-confirmation threshold - - Combination of conditions +A global, per-block coordinator should manage the data request process. + +1. **Request initiation** can follow multiple strategies: + - Immediate request for all missing samples upon receiving a bitmap + - (Optional) Delayed request start for bandwidth optimization: + - Wait for X% of peer responses + - Wait for a fixed time delay + - Wait for complete EDS availability + - Wait for a pre-confirmation threshold + - Combination of any conditions + +2. **Select which samples to request** and from which peers. -2. Select which samples to request and from which peers +For the first iteration, keep the decision engine simple to test other components and validate the overall concept. The main goals are: -The first iteration of the decision engine can be implemented as simply as possible to allow easier testing of other components and prove the concept. The base properties should be: - Eliminate requests for duplicate data -- Do not request data that can be derived from other data. Request just enough data for successful reconstruction - -#### First Implementation: -1. Subscribe to bitmap updates from FN -2. Handle bitmap updates. If any sample is not stored and not in progress, request it from a peer - - Keep track of in-progress requests in local state -3. Handle sample responses - - Verify proofs. 
If a proof is invalid, the peer should be penalized - - Store samples in local store and update local Have state - - Remove samples from in-progress bitmap - - Clean up information about sample coordinates from remote state to free up memory -4. If reconstruction is complete, clean up local state and shut down the reconstruction process +- Avoid requesting data that can be derived from existing data + +#### First Implementation + +1. **Subscribe** to bitmap updates from Full Nodes. +2. **Handle bitmap updates**: if a sample is neither stored locally nor already in-progress, request it from a peer. + - Maintain an in-progress list to avoid duplicate requests. +3. **Handle sample responses**: + - Verify proofs. Penalize peers if proofs are invalid. + - Store samples in the local store and update the local “Have” state. + - Remove the samples from the in-progress bitmap. + - Clean up coordinate information from remote state to free memory. +4. **When reconstruction completes**, clear local state and terminate the reconstruction process. #### Potential Optimizations -- Skip encoded derivable data in response by requesting from peers that have all shares from the same rows/columns -- Optimize proof sizes through range requests -- Optimize proof sizes through subtree proofs if adjacent subroots are stored -- Parallel request distribution to reduce network load on single peers -- Request from peers with smaller latency + +- Skip encoding derivable data by requesting from peers that hold complete rows/columns. +- Optimize proof sizes via range requests. +- Use subtree proofs if adjacent subroots are already stored. +- Distribute requests in parallel to minimize load on individual peers. +- Prefer peers with lower latency. ### 2. State Management -1. Remote State will store information about peers that have samples for given coordinates. If it has full rows/columns, it will be stored in a separate list. 
+ +**Remote State** stores information about which peers hold samples for specific coordinates. If a peer has complete row or column data, it is tracked separately. + ```go type RemoteState struct { coords [][]peers // Peer lists by coordinates @@ -156,150 +166,152 @@ type RemoteState struct { available bitmap // Available samples bitmap } -// Basic peer list structure. Structure might be replaced later to implement better -// peer scoring mechanics +// Basic peer list structure. May be replaced to implement peer scoring. type peers []peer.ID ``` -Query Interface +**Query Interface** -Remote state: ```go func (s *RemoteState) GetPeersWithSample(coords []Coord) []peer.ID func (s *RemoteState) GetPeersWithAxis(axisIdx int, axisType AxisType) []peer.ID func (s *RemoteState) Available() bitmap ``` -Progress state: +**Progress state** tracks ongoing fetch sessions and locally stored samples: + ```go -// Tracks ongoing fetch sessions to prevent requesting duplicates +// Tracks ongoing fetch sessions to prevent duplicate requests func (s *ProgressState) InProgress() bitmap -// Tracks samples that are already stored locally to notify peers about it + +// Tracks samples already stored locally; used to inform peers about local availability func (s *ProgressState) Have() bitmap ``` ### 3. Bitmap Protocol -Bitmap protocol should be implemented by Full Nodes to allow efficient retranslation of samples. It uses bitmaps to sent representation the state of samples stored on Server to allow client to not request samples that it already has. -### Client -- Client should send a request to subscribe to bitmap updates -- If the subscription gets closed or interrupted, client should re-subscribe +Full Nodes implement this protocol to efficiently retransmit samples. It uses bitmaps to indicate which samples the server holds, enabling clients to avoid requesting duplicates. + +#### Client + +- The client sends a request to subscribe to bitmap updates. 
+- If the subscription is closed or interrupted, the client should re-subscribe. + +**Request** -#### Request ```protobuf message SubscribeBitmapRequest { uint64 height = 1; } ``` -### Server -- Server implements a one-way stream for bitmap updates -- Server should send the first bitmap update immediately -- Next updates should be sent every 5 seconds - - (Optional): Server can send updates more frequently if there is a significant change in the bitmap -- Server should send an end-of-subscription flag when no more updates are expected - - Full nodes: stream until EDS is available on server +#### Server -#### Response -``` +- Implements a one-way stream for bitmap updates. +- Sends the first bitmap update immediately. +- Sends subsequent updates every 5 seconds (or more frequently if significant changes occur). +- Sends an end-of-subscription flag when no more updates are expected. + +**Response** + +```protobuf message BitmapUpdate { Bitmap bitmap = 1; bool completed = 2; } ``` -The protocol utilizes Roaring Bitmaps for efficient bitmap operations and storage. Roaring Bitmaps provide several advantages for the reconstruction protocol: +The protocol uses **Roaring Bitmaps** for efficient operations and storage: + +1. **Efficient Operations** + - Fast logical operations (AND, OR, XOR) + - Optimized for both sparse and dense data + - Memory-efficient storage -1. Efficient Operations - - Fast logical operations (AND, OR, XOR) - - Optimized for sparse and dense data - - Memory-efficient storage +2. **Implementation Benefits** + - Native support for common bitmap operations + - Optimized serialization + - Efficient iteration over set bits + - Rank/select operations available -2. Implementation Benefits - - Native support for common bitmap operations - - Optimized serialization - - Efficient iteration over set bits - - Support for rank/select operations +3. 
**Performance Characteristics** + - O(1) complexity for most operations + - Compressed storage format + - Efficient memory usage -3. Performance Characteristics - - O(1) for most common operations - - Compressed storage format - - Efficient memory utilization - - Fast bitmap comparisons +The protocol uses 32-bit encoding for broad compatibility. Libraries include: -The protocol will use 32-bit encoding for bitmaps to have greater multi-language support. Implementation can use one of the encoding-compatible libraries: -- Go: https://github.com/RoaringBitmap/roaring -- Rust: https://github.com/RoaringBitmap/roaring-rs -- C++: https://github.com/RoaringBitmap/CRoaring -- Java: https://github.com/RoaringBitmap/RoaringBitmap +- Go: [roaring](https://github.com/RoaringBitmap/roaring) +- Rust: [roaring-rs](https://github.com/RoaringBitmap/roaring-rs) +- C++: [CRoaring](https://github.com/RoaringBitmap/CRoaring) +- Java: [RoaringBitmap](https://github.com/RoaringBitmap/RoaringBitmap) ### 4. Samples Request Protocol -#### Request +Data retrieval uses the **shrex** protocol. The client sends a bitmap indicating which samples it needs. -- Use shrex for data retrieval -- Send bitmap for data request. Bitmap should contain coordinates for requested samples -```protobuf +```protobuf message SampleRequest { - height uint64 = 1; - Bitmap bitmap = 2; + uint64 height = 1; + Bitmap bitmap = 2; } ``` #### Client -- Client should validate returned samples matches requested bitmap. -- Client should verify proofs. If a proof is invalid, the peer should be penalized + +- Validates that returned samples match the requested bitmap. +- Verifies proofs and penalizes peers that provide invalid proofs. #### Response -- Server should respond with samples with proofs defined in shwap CIP [past link]. -- Server should send samples in a single response. -- If server does not have all requested samples, it should send a partial response with available samples. 
+ +- The server responds with samples and proofs, as defined in shwap CIP. +- The server sends all available samples in a single response. If it lacks certain samples, it sends a partial response. + ```protobuf message SamplesResponse { repeated Sample samples = 1; } ``` -#### Optimizations: -- Adjacent samples can have common proofs. Server would need to send shares with common proof in a single response. - Each part contains packed samples in Range format -- If both Row and column are requested, intersection share can be sent once +#### Optimizations + +- Adjacent samples can share common proofs. The server can package shares with common proofs in a single response. +- If both a row and a column are requested, their intersection share need only be sent once. ### 5. Storage Backend -A new storage format needs to be implemented for efficient storage of sample proofs. The format will be -initially used for storing ongoing reconstruction process and later can be used for light node storage. -1. Core Requirements -- Sample storage with proofs -- Allow purge of proofs on successful reconstruction -- Bitmap query support -- Row/column access implementation -- Accessor interface compliance +A new storage format is required for efficient sample proof storage. It will first be used for the ongoing reconstruction process but can later be adapted for Light Node storage. + +1. **Core Requirements** + - Sample storage with proofs + - Ability to purge proofs on successful reconstruction + - Bitmap query support + - Row/column access implementation + - Must comply with the accessor interface -2. Optional Optimizations -- Bitmap subscription support for callbacks -- Efficient proof generation to reduce proof size overhead +2. **Optional Optimizations** + - Bitmap subscription support with callbacks + - Efficient proof generation to reduce proof overhead ### 6. 
Peer Identification -Peer identification is required by FN to distinguish Full Nodes from Light Nodes, because they will be communicated with different protocols: +Peer identification enables Full Nodes (FNs) to distinguish Full Nodes from Light Nodes (LNs) because each is communicated with via different protocols: + - Full Nodes: SubscribeBitmap, GetSamples - Light Nodes: GetSamples -Information about the peer can be obtained from the host using user agent or by `libp2p.Identity` protocol. - +Peer information can be obtained through the host (e.g., user agent or `libp2p.Identity` protocol). ## Backwards Compatibility -In the lifespan of the protocol, it may require a coordinated network upgrade. Implementation should allow for: -1. Version negotiation -2. Transition period support -3. Fallback mechanisms - +During this protocol’s lifecycle, it may require a coordinated network upgrade. The implementation should support: +1. **Version negotiation** +2. **Transition period support** +3. **Fallback mechanisms** +## Full Reconstruction Process Diagram -## Full reconstruction process diagram ```mermaid sequenceDiagram participant N as New Node From 80377937e7c5a22a399a9211ffcd270d8d4c5665 Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Tue, 11 Feb 2025 22:54:19 +0100 Subject: [PATCH 07/10] Add reconstruction step by step diagram --- docs/adr/reconstruction.md | 72 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 70 insertions(+), 2 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index 3dea095f40..ba7bc5a73e 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -46,7 +46,7 @@ The initial draft targets minimal engineering effort for the first implementatio - Block size scaling - Node count scaling -## Reconstruction Flow +## Reconstruction Overview If a Full Node cannot retrieve a block via the shrex protocol, it initiates the reconstruction process. 
During this process, the Full Node collects samples from both connected Light Nodes and Full Nodes, and enables efficient sample relaying to other Full Nodes. The following steps outline the reconstruction flow: @@ -54,9 +54,74 @@ If a Full Node cannot retrieve a block via the shrex protocol, it initiates the - Use the GetSamples protocol to retrieve samples from connected Light Nodes. - To avoid congestion, request samples in batches (e.g., up to 100 Light Nodes at a time). -2. **Subscribe to bitmap updates.** +2. **Get samples from FNs. Relay samples to other FNs.** - Use the SubscribeBitmap protocol to receive bitmap updates from connected Full Nodes. - If the returned bitmap contains samples not present locally, request them from the Full Node using GetSamples. + - Allow other Full Nodes to subscribe to the bitmap. Keep remote nodes updated with the latest state of the collected-samples bitmap. + - Serve collected samples to other Full Nodes if requested. + +## Reconstruction Flow + +1. **Initial network topology** + 1. An attacker node joins the network but keeps connections only to Light Nodes. + 2. Full Nodes are interconnected and have connections to Light Nodes. + 3. Light Node 5 is isolated from the attacker node. + + ![0](https://github.com/user-attachments/assets/f28edc73-324e-4760-9a22-cd9aa2126018) + +2. **Header propagation and LN sample requests** + 1. The attacker tries to fool the network by sending a block header and allowing LNs to request samples. + 2. Light Nodes propagate the block header further into the network: first to the FNs, which then relay it to the isolated LN5. + + ![01](https://github.com/user-attachments/assets/a6086ffd-3f9e-4f60-af6c-eb2ea7e1cb68) + +3. **Start of reconstruction** + 1. FNs try to download the block using shrex but fail due to a timeout, which triggers the reconstruction process. + 2. FNs start collecting samples from connected LNs. + 3. FNs subscribe to bitmap updates from connected FNs.
+ + ![1](https://github.com/user-attachments/assets/b5763a82-e73c-4393-9ea7-f4dc1afb2864) + +4. **Bitmap notification** + - FNs collects samples from LNs. + - FNs sends bitmap update to connected FNs. + + ![2](https://github.com/user-attachments/assets/984bedff-8bfc-49df-a35e-be70a05a230a) + +5. **Sample exchange** + - FNs collect samples available in the network. + - FN do not collect samples available locally. + + ![3](https://github.com/user-attachments/assets/1cb1a1a1-7bb1-4f46-afb8-1e2b494e8d7f) + +6. **Sample relaying. Exchange round 2,3...** + - After collecting samples from first round of exchange, FNs continues to send bitmap updates of collected samples to other FNs. + - FN2 collected some samples that are not available on FN1 and FN3. FN1 and FN3 collect samples from FN2. + - It could be more rounds of exchange, if needed. + + ![4](https://github.com/user-attachments/assets/afa3e44d-c1c7-4163-9ea4-18c4040273f9) + +7. **New LN joins the network** + - There is not enough samples to reconstruct the block on each FN. + - New LN joins the network with new set of samples and starts sending samples to FN2. + + ![5](https://github.com/user-attachments/assets/2849d1a1-07bd-4284-b1ce-bcaebf713acd) + +8. **Reconstruct whole block** + - FN2 can now reconstruct the block using erasure decoding. + + ![51](https://github.com/user-attachments/assets/51e81ccb-2107-45d7-b553-f7af1f1cada3) + +9. **Completed reconstruction notification** + - FN2 notifies other FNs that reconstruction it completed reconstruction and all samples are available. + - FN1 and FN3 can now download any missing samples from FN2. In shown case they can request any. For example they can request sample with coord: (X:1Y:0) + + ![6](https://github.com/user-attachments/assets/8a0ef334-95fb-4180-aab2-63bb50ebf1da) + +10. **Reconstruction complete on all FN** + - All FNs can now reconstruct the block. 
+ + ![7](https://github.com/user-attachments/assets/25c5279b-e75d-415a-8c33-5f14c9c9082c) ### No Bitmap Subscription From Light Nodes: Why? @@ -173,8 +238,11 @@ type peers []peer.ID **Query Interface** ```go +// Returns peers that hold the given sample. Necessary for first implementation. func (s *RemoteState) GetPeersWithSample(coords []Coord) []peer.ID +// Returns peers that hold the given axis. Will be used for bandwidth optimization. func (s *RemoteState) GetPeersWithAxis(axisIdx int, axisType AxisType) []peer.ID +// Returns combination of all samples that are available in connected peers func (s *RemoteState) Available() bitmap ``` From 9aaa8823b3366f7b3f24173802866807f1c0cd4c Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Tue, 11 Feb 2025 23:12:42 +0100 Subject: [PATCH 08/10] formatting improvements --- docs/adr/reconstruction.md | 426 ++++++++++++++++++------------------- 1 file changed, 209 insertions(+), 217 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index ba7bc5a73e..46c75849dc 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -1,143 +1,154 @@ +--- + ## Abstract -This document proposes a new block reconstruction protocol that addresses key bottlenecks in the current implementation, particularly focusing on reducing duplicate requests and optimizing bandwidth usage. The protocol provides a structured approach to data retrieval with a clear emphasis on network resource efficiency and scalability. +This document proposes a new block reconstruction protocol to address key bottlenecks in the current implementation, particularly those related to duplicate requests and suboptimal bandwidth usage. By adopting a structured approach to data retrieval, the protocol emphasizes network resource efficiency and scalability. 
## Motivation -The current block reconstruction protocol faces several limitations: +The current block reconstruction protocol exhibits the following limitations: -1. Frequent duplicate requests, resulting in network inefficiency -2. Suboptimal bandwidth utilization +1. Frequent duplicate requests, leading to network inefficiency +2. Suboptimal bandwidth usage 3. Limited scalability as block sizes and node counts increase -Key improvements include: +To overcome these challenges, this proposal introduces: - Structured bitmap sharing - Optimized proof packaging - Efficient state management - Robust failure handling -This proposal aims to implement a more efficient reconstruction protocol that: +In doing so, the new protocol aims to: -- Minimizes duplicate data requests -- Optimizes bandwidth usage through smart data packaging -- Scales effectively across multiple dimensions -- Remains stable under varying network conditions +- Minimize duplicate data requests +- Optimize bandwidth usage through smart data packaging +- Scale effectively across a range of block sizes and node counts +- Maintain stability under varying network conditions ### Engineering Time -The initial draft targets minimal engineering effort for the first implementation iteration. Optimizations marked as optional can be added later based on observed network performance metrics. +The initial implementation targets minimal engineering effort. Optimizations marked as optional can be added in subsequent iterations, guided by observed network performance metrics. + +--- ## Specification ### General Performance Requirements 1. **Base Scenario Support** - - 32 MB block size - - Minimum required Light Nodes for 32 MB blocks - - Network of 50+ Full Nodes + - 32 MB block size + - Minimum required Light Nodes for 32 MB blocks + - Network of 50+ Full Nodes 2. 
**Performance Metrics (in order of priority)** - - System stability - - Reconstruction throughput (blocks/second) - - Per-block reconstruction time + - System stability + - Reconstruction throughput (blocks/second) + - Per-block reconstruction time 3. **Scalability Dimensions** - - Block size scaling - - Node count scaling + - Block size + - Node count + +--- ## Reconstruction Overview -If a Full Node cannot retrieve a block via the shrex protocol, it initiates the reconstruction process. During this process, the Full Node collects samples from both connected Light Nodes and Full Nodes, and enables efficient sample relaying to other Full Nodes. The following steps outline the reconstruction flow: +If a Full Node (FN) cannot retrieve a block via the shrex protocol, it initiates a reconstruction process. During this process, the FN gathers samples from connected Light Nodes (LNs) and Full Nodes, while also relaying samples to other FNs. The following steps outline the reconstruction flow: 1. **Get samples from LNs.** - - Use the GetSamples protocol to retrieve samples from connected Light Nodes. - - To avoid congestion, request samples in batches (e.g., up to 100 Light Nodes at a time). + - Use the `GetSamples` protocol to retrieve samples from connected Light Nodes. + - To avoid congestion, request samples in batches (e.g., up to 100 Light Nodes at a time). + +2. **Get samples from FNs and relay samples to other FNs.** + - Use the `SubscribeBitmap` protocol to receive bitmap updates from connected Full Nodes. + - If the returned bitmap indicates samples not present locally, request them from the Full Node via `GetSamples`. + - Allow other Full Nodes to subscribe to the bitmap, sending them periodic updates. + - Serve collected samples to other Full Nodes when requested. -2. **Get samples from FN. Relay samples to other FN.** - - Use the SubscribeBitmap protocol to receive bitmap updates from connected Full Nodes. 
- - If the returned bitmap contains samples not present locally, request them from the Full Node using GetSamples. - - Allow other Full Nodes to subscribe to the bitmap. Keep remote nodes updated with the latest state of collected samples bitmap. - - Serve collected samples to other Full Nodes if requested. +--- ## Reconstruction Flow +Below is a step-by-step illustration of the reconstruction flow, with accompanying diagrams. + 1. **Initial network topology** - 1. Attacker node joins the network, but keeps connections only to Light nodes. - 2. Full nodes are interconnected, and have connections to Light nodes. - 3. Light node 5 is isolated from the attacker node. + 1. An attacker node joins the network but connects only to Light Nodes. + 2. Full Nodes are interconnected and also connect to Light Nodes. + 3. Light Node #5 (LN5) is isolated from the attacker node. - ![0](https://github.com/user-attachments/assets/f28edc73-324e-4760-9a22-cd9aa2126018) + 2. **Headers propagation and LN sample request** - 1. Attacker tries to fool the network by sending a block header and allows LN to request samples. - 2. Light nodes propagate the block header further to the network. First to FN and then FN relay it to isolated LN5. + 1. The attacker attempts to deceive the network by sending a block header, prompting LNs to request samples. + 2. Light Nodes propagate the block header further to the network—first to FNs, which then relay it to LN5. - ![01](https://github.com/user-attachments/assets/a6086ffd-3f9e-4f60-af6c-eb2ea7e1cb68) + -3. **Start of reconstruction** - 1. FNs tries to download the block using shrex, but fails due to timeout. It triggers reconstruction process. - 2. FNs starts collecting samples from connected LNs. - 3. FNs subscribes to bitmap updates from connected FNs. +3. **Start of reconstruction** + 1. FNs attempt to download the block via shrex but fail due to a timeout, triggering the reconstruction process. + 2. 
FNs begin collecting samples from connected LNs. + 3. FNs subscribe to bitmap updates from connected FNs. - ![1](https://github.com/user-attachments/assets/b5763a82-e73c-4393-9ea7-f4dc1afb2864) + 4. **Bitmap notification** - - FNs collects samples from LNs. - - FNs sends bitmap update to connected FNs. - - ![2](https://github.com/user-attachments/assets/984bedff-8bfc-49df-a35e-be70a05a230a) + - FNs collect samples from LNs. + - FNs send bitmap updates to connected FNs. + + -5. **Sample exchange** - - FNs collect samples available in the network. - - FN do not collect samples available locally. +5. **Sample exchange** + - FNs collect samples that are available in the network but do not request those already held locally. - ![3](https://github.com/user-attachments/assets/1cb1a1a1-7bb1-4f46-afb8-1e2b494e8d7f) + -6. **Sample relaying. Exchange round 2,3...** - - After collecting samples from first round of exchange, FNs continues to send bitmap updates of collected samples to other FNs. - - FN2 collected some samples that are not available on FN1 and FN3. FN1 and FN3 collect samples from FN2. - - It could be more rounds of exchange, if needed. - - ![4](https://github.com/user-attachments/assets/afa3e44d-c1c7-4163-9ea4-18c4040273f9) +6. **Sample relaying and subsequent exchange rounds** + - After receiving samples in the first round, FNs continue to send bitmap updates of newly collected samples to other FNs. + - FN2 acquires some samples not available on FN1 or FN3, which then request these samples from FN2. + - Additional exchange rounds may occur if needed. + + 7. **New LN joins the network** - - There is not enough samples to reconstruct the block on each FN. - - New LN joins the network with new set of samples and starts sending samples to FN2. + - Not enough samples are available for each FN to reconstruct the block. + - A new LN joins with a fresh set of samples and starts sending them to FN2. 
- ![5](https://github.com/user-attachments/assets/2849d1a1-07bd-4284-b1ce-bcaebf713acd) + -8. **Reconstruct whole block** - - FN2 can now reconstruct the block using erasure decoding. +8. **Reconstruction is possible now** + - FN2 can now reconstruct the block via erasure decoding. - ![51](https://github.com/user-attachments/assets/51e81ccb-2107-45d7-b553-f7af1f1cada3) + 9. **Completed reconstruction notification** - - FN2 notifies other FNs that reconstruction it completed reconstruction and all samples are available. - - FN1 and FN3 can now download any missing samples from FN2. In shown case they can request any. For example they can request sample with coord: (X:1Y:0) + - FN2 notifies other FNs that it has finished reconstruction and now holds all samples. + - FN1 and FN3 can request any missing samples from FN2. For instance, they might request the sample at (X:1, Y:0). + + - ![6](https://github.com/user-attachments/assets/8a0ef334-95fb-4180-aab2-63bb50ebf1da) +10. **Reconstruction complete on all FNs** +- All FNs successfully reconstruct the block. -10. **Reconstruction complete on all FN** - - All FNs can now reconstruct the block. + - ![7](https://github.com/user-attachments/assets/25c5279b-e75d-415a-8c33-5f14c9c9082c) +--- ### No Bitmap Subscription From Light Nodes: Why? -Subscribing to bitmaps from Light Nodes would allow the reconstructing node to fully control deduplication of requested samples, preventing simultaneous duplicate requests. However, it also introduces extra complexity and round-trip overhead. Since each Light Node holds only a few samples from any given block and the probability of overlap is low, a simpler approach is to skip bitmap subscriptions for LNs. Instead, the Full Node can send an inverse “have” bitmap in the request to indicate which samples it still needs. +Subscribing to bitmaps from Light Nodes would allow more precise deduplication of requested samples, but it also introduces additional complexity and round-trip overhead. 
Since each Light Node holds relatively few samples, the probability of overlap is low. Instead, the Full Node can send an inverse “have” bitmap in its request to indicate which samples are still needed. -**Tradeoff:** +**Tradeoff**: - **Pros** - - No additional round trips between LN and FN - - Light Nodes do not need to maintain subscriptions - - Light Nodes do not need to implement the bitmap subscription protocol + - Fewer round trips between LN and FN + - Light Nodes do not need to maintain subscriptions + - Light Nodes do not need to implement the bitmap subscription protocol - **Cons** - - Some samples may be requested multiple times + - Some samples may be requested multiple times -To quantify the overhead of duplicate requests, a Monte Carlo simulation was conducted. Below is a summary of the results: +A Monte Carlo simulation was conducted to quantify overhead from potential duplicate requests, summarized below: | Block Size | % Overhead (LN = 256) | % Overhead (LN = 128) | |------------|-----------------------|-----------------------| @@ -148,11 +159,13 @@ To quantify the overhead of duplicate requests, a Monte Carlo simulation was con | 256 | 0.57 | 0.28 | | 512 | 0.14 | 0.07 | -These results show that overhead is negligible for larger block sizes. Therefore, to keep the protocol simpler, we will not use SubscribeBitmap for Light Nodes and will request samples directly. +The data indicates that overhead is negligible for larger block sizes. To keep the protocol simpler, **SubscribeBitmap** is not used for Light Nodes; instead, FNs request samples directly. + +--- ### Protocol Diagrams -Below is an outline of the proposed protocols. Detailed specifications are provided in subsequent sections. Full flow diagrams are at the end of this document. +Below is an outline of the proposed protocols. Full flow diagrams appear at the end of this document. ``` 1. Bitmap Subscription @@ -168,6 +181,8 @@ Below is an outline of the proposed protocols. 
Detailed specifications are provi |<---- [Samples + Proof] parts ---| ``` +--- + ## Core Components ### List of Core Components @@ -179,49 +194,53 @@ Below is an outline of the proposed protocols. Detailed specifications are provi 5. Samples store (new file format) 6. Peer identification +--- + ### 1. Reconstruction Process -A global, per-block coordinator should manage the data request process. +A global, per-block coordinator should oversee the data request process. -1. **Request initiation** can follow multiple strategies: - - Immediate request for all missing samples upon receiving a bitmap - - (Optional) Delayed request start for bandwidth optimization: - - Wait for X% of peer responses - - Wait for a fixed time delay - - Wait for complete EDS availability - - Wait for a pre-confirmation threshold - - Combination of any conditions +1. **Request initiation** can follow various strategies: + - Immediate requests for all missing samples upon receiving a bitmap + - (Optional) Delayed requests for bandwidth optimization: + - Wait for X% of peer responses + - Wait for a fixed time delay + - Wait for complete EDS availability + - Wait for a pre-confirmation threshold + - Any combination of the above -2. **Select which samples to request** and from which peers. +2. **Select which samples to request**, and from which peers. -For the first iteration, keep the decision engine simple to test other components and validate the overall concept. The main goals are: +For the first iteration, keep the decision engine simple to validate other components. Primary goals: - Eliminate requests for duplicate data -- Avoid requesting data that can be derived from existing data +- Avoid requesting data already derivable from existing information -#### First Implementation +#### First Implementation Steps 1. **Subscribe** to bitmap updates from Full Nodes. -2. **Handle bitmap updates**: if a sample is neither stored locally nor already in-progress, request it from a peer. 
- - Maintain an in-progress list to avoid duplicate requests. +2. **Handle bitmap updates**: If a sample is neither stored locally nor already being fetched, request it from a peer. + - Keep an in-progress list to avoid duplicate requests. 3. **Handle sample responses**: - - Verify proofs. Penalize peers if proofs are invalid. - - Store samples in the local store and update the local “Have” state. - - Remove the samples from the in-progress bitmap. - - Clean up coordinate information from remote state to free memory. -4. **When reconstruction completes**, clear local state and terminate the reconstruction process. + - Verify proofs. Penalize peers if proofs are invalid. + - Store samples in the local store and update the “Have” state. + - Remove those samples from the in-progress list. + - Clean up coordinate metadata to free memory. +4. **On reconstruction completion**, clear the local state and end the process. #### Potential Optimizations -- Skip encoding derivable data by requesting from peers that hold complete rows/columns. -- Optimize proof sizes via range requests. +- Skip encoding derivable data by requesting complete rows/columns from peers. +- Reduce proof sizes with range requests. - Use subtree proofs if adjacent subroots are already stored. -- Distribute requests in parallel to minimize load on individual peers. +- Distribute requests in parallel to balance load. - Prefer peers with lower latency. +--- + ### 2. State Management -**Remote State** stores information about which peers hold samples for specific coordinates. If a peer has complete row or column data, it is tracked separately. +**Remote State** holds information about which peers have specific samples. If a peer stores a complete row or column, it is tracked separately: ```go type RemoteState struct { @@ -231,41 +250,48 @@ type RemoteState struct { available bitmap // Available samples bitmap } -// Basic peer list structure. May be replaced to implement peer scoring. 
+// Basic peer list structure. Can be extended to implement peer scoring. type peers []peer.ID ``` -**Query Interface** +**Query Interface**: ```go -// Returns peers that hold the given sample. Necessary for first implementation. +// Returns peers that hold the given samples. +// Required for the first implementation. func (s *RemoteState) GetPeersWithSample(coords []Coord) []peer.ID -// Returns peers that hold the given axis. Will be used for bandwidth optimization. + +// Returns peers that hold a particular row or column. +// Used for bandwidth optimization. func (s *RemoteState) GetPeersWithAxis(axisIdx int, axisType AxisType) []peer.ID -// Returns combination of all samples that are available in connected peers + +// Returns a bitmap representing all samples available among connected peers. func (s *RemoteState) Available() bitmap ``` -**Progress state** tracks ongoing fetch sessions and locally stored samples: +**Progress State** tracks ongoing fetches and locally stored samples: ```go -// Tracks ongoing fetch sessions to prevent duplicate requests +// Tracks samples currently being fetched, preventing duplicates. func (s *ProgressState) InProgress() bitmap -// Tracks samples already stored locally; used to inform peers about local availability +// Tracks samples already stored locally, +// used to inform peers about local availability. func (s *ProgressState) Have() bitmap ``` +--- + ### 3. Bitmap Protocol -Full Nodes implement this protocol to efficiently retransmit samples. It uses bitmaps to indicate which samples the server holds, enabling clients to avoid requesting duplicates. +Full Nodes use this protocol to efficiently signal available samples. It uses bitmaps to indicate which samples the server holds, enabling clients to avoid requesting duplicates. #### Client -- The client sends a request to subscribe to bitmap updates. -- If the subscription is closed or interrupted, the client should re-subscribe. 
+- Sends a request to subscribe to bitmap updates. +- If the subscription closes or is interrupted, re-subscribe. -**Request** +**Request**: ```protobuf message SubscribeBitmapRequest { @@ -275,12 +301,12 @@ message SubscribeBitmapRequest { #### Server -- Implements a one-way stream for bitmap updates. -- Sends the first bitmap update immediately. -- Sends subsequent updates every 5 seconds (or more frequently if significant changes occur). -- Sends an end-of-subscription flag when no more updates are expected. +- Implements a one-way stream for sending bitmap updates. +- Sends the initial bitmap immediately. +- Sends further updates periodically (e.g., every 5 seconds) or sooner when there are significant changes. +- Sends a final update indicating subscription completion when no more updates are expected. -**Response** +**Response**: ```protobuf message BitmapUpdate { @@ -289,34 +315,35 @@ message BitmapUpdate { } ``` -The protocol uses **Roaring Bitmaps** for efficient operations and storage: +**Roaring Bitmaps** are used for efficient operations and storage: 1. **Efficient Operations** - - Fast logical operations (AND, OR, XOR) - - Optimized for both sparse and dense data - - Memory-efficient storage + - Fast logical operations (AND, OR, XOR) + - Handles both sparse and dense data well + - Memory-efficient 2. **Implementation Benefits** - - Native support for common bitmap operations - - Optimized serialization - - Efficient iteration over set bits - - Rank/select operations available + - Common library support in Go, Rust, C++, Java + - Optimized serialization + - Rank/select operations for advanced queries 3. **Performance Characteristics** - - O(1) complexity for most operations - - Compressed storage format - - Efficient memory usage - -The protocol uses 32-bit encoding for broad compatibility. 
Libraries include: + - O(1) complexity for most operations + - Compressed storage format + - Efficient memory usage +The protocol uses a 32-bit encoding for broad compatibility. +Available implementations: - Go: [roaring](https://github.com/RoaringBitmap/roaring) - Rust: [roaring-rs](https://github.com/RoaringBitmap/roaring-rs) - C++: [CRoaring](https://github.com/RoaringBitmap/CRoaring) - Java: [RoaringBitmap](https://github.com/RoaringBitmap/RoaringBitmap) +--- + ### 4. Samples Request Protocol -Data retrieval uses the **shrex** protocol. The client sends a bitmap indicating which samples it needs. +Data retrieval uses the **shrex** protocol. The client sends a bitmap specifying needed samples: ```protobuf message SampleRequest { @@ -328,12 +355,11 @@ message SampleRequest { #### Client - Validates that returned samples match the requested bitmap. -- Verifies proofs and penalizes peers that provide invalid proofs. +- Verifies proofs and penalizes peers for invalid proofs. #### Response -- The server responds with samples and proofs, as defined in shwap CIP. -- The server sends all available samples in a single response. If it lacks certain samples, it sends a partial response. +The server returns the requested samples and their proofs (as specified in the shwap CIP). If it lacks certain samples, it sends a partial response. ```protobuf message SamplesResponse { @@ -341,112 +367,78 @@ message SamplesResponse { } ``` -#### Optimizations +#### Possible Optimizations -- Adjacent samples can share common proofs. The server can package shares with common proofs in a single response. -- If both a row and a column are requested, their intersection share need only be sent once. +- Combine adjacent samples that share proofs into a single response. +- If both a row and a column are requested, their intersection share should be sent only once. + +--- ### 5. Storage Backend -A new storage format is required for efficient sample proof storage. 
It will first be used for the ongoing reconstruction process but can later be adapted for Light Node storage. +A new storage format is required for efficient sample proof handling. Initially, it will be used only for reconstruction, but could be extended for Light Nodes in the future. 1. **Core Requirements** - - Sample storage with proofs - - Ability to purge proofs on successful reconstruction - - Bitmap query support - - Row/column access implementation - - Must comply with the accessor interface + - Store samples with proofs + - Purge proofs upon successful reconstruction + - Provide a bitmap query interface + - Support row/column data retrieval + - Comply with the accessor interface 2. **Optional Optimizations** - - Bitmap subscription support with callbacks - - Efficient proof generation to reduce proof overhead + - Integrate with a bitmap subscription mechanism + - Efficient proof generation to reduce overhead + +--- ### 6. Peer Identification -Peer identification enables Full Nodes (FNs) to distinguish Full Nodes from Light Nodes (LNs) because each is communicated with via different protocols: +Peer identification ensures Full Nodes can distinguish Full Nodes from Light Nodes, as each follows different protocols: -- Full Nodes: SubscribeBitmap, GetSamples -- Light Nodes: GetSamples +- Full Nodes: `SubscribeBitmap`, `GetSamples` +- Light Nodes: `GetSamples` -Peer information can be obtained through the host (e.g., user agent or `libp2p.Identity` protocol). +Peer information can be inferred from the host (e.g., user agent or a `libp2p.Identity` protocol). + +--- ## Backwards Compatibility -During this protocol’s lifecycle, it may require a coordinated network upgrade. The implementation should support: +A coordinated network upgrade may be required as this protocol evolves. The implementation should support: 1. **Version negotiation** 2. **Transition period support** 3. **Fallback mechanisms** +--- + ## Full Reconstruction Process Diagram +
```mermaid -sequenceDiagram - participant N as New Node - participant RS as Remote State - participant RP as Reconstruction Processor - participant PS as Progress State - participant SS as Samples Store - participant P1 as Peer 1 - participant P2 as Peer 2 - participant P3 as Peer 3 - - Note over N,P3: Phase 1: Bitmap Discovery - N->>+P1: Subscribe to bitmap updates - N->>+P2: Subscribe to bitmap updates - N->>+P3: Subscribe to bitmap updates - - P1-->>-N: Initial bitmap - P2-->>-N: Initial bitmap - P3-->>-N: Initial bitmap - - N->>RS: Update remote state - - Note over N,P3: Phase 2: Reconstruction Process Start - N->>RP: Initialize reconstruction - activate RP - - loop Process Bitmaps - RP->>PS: Check progress state - RP->>RS: Query available samples - - Note over RP: Select optimal samples & peers - - par Request Samples - RP->>P1: GetSamples(bitmap subset 1) - RP->>P2: GetSamples(bitmap subset 2) - RP->>P3: GetSamples(bitmap subset 3) - end - - PS->>PS: Mark requests as in-progress - end - - Note over N,P3: Phase 3: Sample Processing - par Process Responses - P1-->>RP: Samples + Proofs 1 - P2-->>RP: Samples + Proofs 2 - P3-->>RP: Samples + Proofs 3 - end - - loop For each response - RP->>SS: Verify & store samples - RP->>PS: Update progress - RP->>RS: Update available samples - end - - Note over N,P3: Phase 4: Completion - opt Reconstruction Complete - RP->>SS: Finalize reconstruction - RP->>PS: Clear progress state - RP->>RS: Clear remote state - deactivate RP - end - - Note over N,P3: Phase 5: Continuous Updates - loop Until Complete - P1-->>N: Bitmap updates - P2-->>N: Bitmap updates - P3-->>N: Bitmap updates - N->>RS: Update remote state - end -``` \ No newline at end of file +flowchart TB + A((Start)) --> B{Retrieve block\nvia shrex?} + B -- "Yes" --> C["Block fully retrieved\nvia shrex\n(Stop reconstruction)"] + B -- "No / Timeout" --> D["Initiate reconstruction"] + + D --> E["Subscribe to\nbitmap updates"] + D --> F["Request LN samples\n(in batches)"] + + 
E --> G["Receive FNs' bitmaps"] + F --> H["Receive LN samples"] + + G --> I["Update local storage\n(Have state)"] + H --> I + + I --> J["Publish local bitmap\nto FNs"] + + J --> K{"Sufficient samples\nto decode block?"} + K -- "No" --> L["Wait for more samples\n(LN/FN/new LN)"] + L --> I + + K -- "Yes" --> M["Decode block\n(erasure coding)"] + M --> N["Notify network\nof complete sample set"] + N --> O((Done\nReconstruction complete)) + +``` +
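The "First Implementation Steps" and the `ProgressState` interface in the patch above can be illustrated with a minimal Go sketch. It is not the proposed implementation: `sampleSet` is a hypothetical stand-in for a Roaring bitmap (kept stdlib-only here), and `progress`, `toRequest`, and `onSamples` are illustrative names that do not appear in the ADR. The sketch shows only the deduplication rule: request a sample when it is neither stored locally nor already in progress.

```go
package main

import (
	"fmt"
	"sort"
)

// sampleSet is a hypothetical stand-in for a Roaring bitmap: the set of
// sample indices (for example row*width + col) that are marked.
type sampleSet map[uint32]struct{}

func newSampleSet(idxs ...uint32) sampleSet {
	s := make(sampleSet, len(idxs))
	for _, i := range idxs {
		s[i] = struct{}{}
	}
	return s
}

// progress loosely mirrors the ADR's ProgressState: samples stored locally
// ("have") and samples currently being fetched ("inProgress").
type progress struct {
	have       sampleSet
	inProgress sampleSet
}

// toRequest returns, in ascending order, the indices advertised by a peer
// that are neither stored locally nor already being fetched, and marks
// them as in progress so a later bitmap update does not request them again.
func (p *progress) toRequest(remote sampleSet) []uint32 {
	var out []uint32
	for idx := range remote {
		if _, ok := p.have[idx]; ok {
			continue
		}
		if _, ok := p.inProgress[idx]; ok {
			continue
		}
		out = append(out, idx)
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	for _, idx := range out {
		p.inProgress[idx] = struct{}{}
	}
	return out
}

// onSamples records samples whose proofs were already verified by the
// caller: they leave the in-progress set and join the have set.
func (p *progress) onSamples(idxs []uint32) {
	for _, idx := range idxs {
		delete(p.inProgress, idx)
		p.have[idx] = struct{}{}
	}
}

func main() {
	p := &progress{have: newSampleSet(0, 1), inProgress: newSampleSet(2)}
	fmt.Println(p.toRequest(newSampleSet(1, 2, 3, 4))) // first bitmap update
	fmt.Println(p.toRequest(newSampleSet(3, 4, 5)))    // overlapping update from another peer
	p.onSamples([]uint32{3, 4})
	fmt.Println(len(p.have), len(p.inProgress))
}
```

With a real Roaring bitmap the same selection would presumably be expressed as a set difference between the remote bitmap and the union of the have and in-progress bitmaps; the map-based loop above is only meant to make the rule explicit.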
From 8d2a7aea8bfd19c4d19a146d8eaaa5e4e1099254 Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Wed, 12 Feb 2025 13:13:39 +0100 Subject: [PATCH 09/10] add sampling protocol motivation --- docs/adr/reconstruction.md | 62 ++++++++++++++++++++++++++------------ 1 file changed, 42 insertions(+), 20 deletions(-) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index 46c75849dc..de49c6881f 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -68,6 +68,21 @@ If a Full Node (FN) cannot retrieve a block via the shrex protocol, it initiates --- +## Sampling Protocol + +The current implementation of the sampling protocol relies on [bitswap](https://github.com/ipfs/go-bitswap) to request samples from Full (or Bridge) nodes. Under normal assumptions—when data is reliably available—bitswap works well. It provides a robust system for fetching data by content identifier (CID), along with advanced features for load distribution and prioritization. + +However, bitswap has a critical limitation: **it lacks content discovery**. When content discovery is absent, a node must attempt requests with each peer until it finds one that actually has the data. This becomes especially problematic during reconstruction. In that scenario, only the attacker nodes hold the block data initially; honest Full Nodes do not. As a result: + +1. Light Nodes (LNs) repeatedly attempt to fetch samples from honest Full Nodes, which cannot serve these requests because they don’t yet have the block data. +2. This generates excessive “spam” requests directed at nodes that are in the process of reconstructing the block, ultimately degrading performance. + +**Replace bitswap with shrex-based Samples protocol.** + +To resolve this issue, the protocol must incorporate a more effective content discovery mechanism. Fortunately, the **shrex-based Samples protocol**, already required for reconstruction, can also serve LN sampling needs. 
By adopting this unified solution, we can eventually phase out bitswap for **all** data retrieval tasks, removing it as a dependency and streamlining the system’s architecture. + +--- + ## Reconstruction Flow Below is a step-by-step illustration of the reconstruction flow, with accompanying diagrams. @@ -134,6 +149,26 @@ Below is a step-by-step illustration of the reconstruction flow, with accompanyi --- +### Protocol Diagrams + +Below is an outline of the proposed protocols. Full flow diagrams appear at the end of this document. + +``` +1. Bitmap Subscription + Client (FN) Server (FN) + |---- Subscribe to bitmap -------------->| + |<---- Initial bitmap -------------------| + |<---- Updates -------------------------| + |<---- End updates ----------------------| + +2. GetSamples + Client (FN) Server (FN/LN) + |---- Request(bitmap) ----------->| + |<---- [Samples + Proof] parts ---| +``` + +--- + ### No Bitmap Subscription From Light Nodes: Why? Subscribing to bitmaps from Light Nodes would allow more precise deduplication of requested samples, but it also introduces additional complexity and round-trip overhead. Since each Light Node holds relatively few samples, the probability of overlap is low. Instead, the Full Node can send an inverse “have” bitmap in its request to indicate which samples are still needed. @@ -163,26 +198,6 @@ The data indicates that overhead is negligible for larger block sizes. To keep t --- -### Protocol Diagrams - -Below is an outline of the proposed protocols. Full flow diagrams appear at the end of this document. - -``` -1. Bitmap Subscription - Client (FN) Server (FN) - |---- Subscribe to bitmap -------------->| - |<---- Initial bitmap -------------------| - |<---- Updates -------------------------| - |<---- End updates ----------------------| - -2. 
GetSamples - Client (FN) Server (FN/LN) - |---- Request(bitmap) ----------->| - |<---- [Samples + Proof] parts ---| -``` - ---- - ## Core Components ### List of Core Components @@ -193,6 +208,7 @@ Below is an outline of the proposed protocols. Full flow diagrams appear at the 4. Samples retrieval protocol 5. Samples store (new file format) 6. Peer identification +7. Light node sampling protocol --- @@ -402,6 +418,12 @@ Peer information can be inferred from the host (e.g., user agent or a `libp2p.Id --- +### 7. Light node sampling protocol + +A new protocol for Light Nodes to acquire block samples directly from Full Nodes, **based on the Samples Retrieval Protocol (item #4)**. By adopting this approach, we can phase out bitswap and unify the mechanisms used for both normal sampling and reconstruction. + +--- + ## Backwards Compatibility A coordinated network upgrade may be required as this protocol evolves. The implementation should support: From e0725527a389dcd791aecbae4d56ad7ace918e99 Mon Sep 17 00:00:00 2001 From: Vlad <13818348+walldiss@users.noreply.github.com> Date: Wed, 12 Feb 2025 14:18:33 +0100 Subject: [PATCH 10/10] add false positive --- docs/adr/reconstruction.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) diff --git a/docs/adr/reconstruction.md b/docs/adr/reconstruction.md index de49c6881f..685e9afe86 100644 --- a/docs/adr/reconstruction.md +++ b/docs/adr/reconstruction.md @@ -424,6 +424,28 @@ A new protocol for Light Nodes to acquire block samples directly from Full Nodes --- +## False Positive Overhead + +In certain rare scenarios, a Full Node (FN) might fail to obtain a block via shrex due to transient networking issues or software bugs. As a result, it may erroneously enter the reconstruction process (a “false positive”). This false alarm creates unnecessary overhead on the network, as the node begins sending sample requests to Light Nodes and setting up bitmap subscriptions.
Although the likelihood of such events is relatively low, it is still worth considering potential mitigation strategies: + +### Potential Mitigation Strategies + +1. **Lower the Chance of a False Positive** + - **Track Multiple `NOT_HAVE` Responses** + If an FN sees repeated `NOT_HAVE` responses over an extended period, it becomes more likely that the block truly isn’t available on the network. This reduces the chance of prematurely triggering reconstruction. + - **Extend Shrex Timeouts with Caution** + If repeated attempts to fetch the block fail, the FN could increase its shrex timeouts while gathering enough evidence that the block is not available. However, this approach must guard against attackers who might falsely claim to have the block but never serve it (i.e., withholding attacks). + +2. **Lower the Impact of a False Positive** + - **Staged Reconstruction** + Allow the node to run shrex in the background while performing reconstruction in phases. Initially, it might only subscribe to bitmaps lightly, ramping up resource usage gradually if it still can’t locate the block. This approach reduces unnecessary load on the network if the trigger was false, but it may also delay reconstruction if the block genuinely is unavailable. + +--- + +All of these methods are optional considerations. **Ensuring robust reconstruction remains a higher priority**; therefore, these false positive mitigations can be evaluated and implemented at a later time if deemed necessary. + +--- + ## Backwards Compatibility A coordinated network upgrade may be required as this protocol evolves. The implementation should support: