[RFC] Parent selection based on node state awareness #3713

Open
baowj-678 opened this issue Dec 19, 2024 · 3 comments

baowj-678 commented Dec 19, 2024

Introduction

Feature request:

Dragonfly is an efficient, stable, and secure P2P-based tool for file distribution and image acceleration. Currently, however, the parent used to download each piece is chosen first-come, first-served (FCFS): a piece is downloaded from whichever parent returns that piece's metadata first. This selection method cannot react to changes in a parent's state (network bandwidth, disk I/O), so it cannot fully utilize the available bandwidth.

Therefore, I propose a parent selection method based on parent state awareness, described in detail below.

Design

Architecture

The diagram below shows the overall architecture of the design, which consists of three main parts: ParentStateSyncer, ParentStateServer, and PieceCollector.

[Figure: Dragonfly bandwidth-awareness implementation diagram (draw.io)]

Modules

  • ParentStateServer: The background daemon thread on the upload server side. It periodically measures the local network bandwidth and disk bandwidth, calculates the local node state from them, and sends the latest state to every connection in the LRU-managed set of SyncHost connections;
  • ParentStateSyncer: The background daemon thread on the client side. It uses an LRU cache to maintain the set of parents whose states need to be synchronized, and sends SyncHost requests to synchronize the states of all parents in the cache;
  • PieceCollector: Retrieves the states of the parents it follows from ParentStateSyncer, then selects the best parent to download from and the corresponding piece.
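
As an illustration of the ParentStateServer step "measure bandwidth, then calculate the local node state", here is a minimal Python sketch. The RFC does not define how the state is computed, so the `Sample` shape, the `idle_ratio` name, and the "busier resource wins" formula are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Cumulative counters read at one instant (hypothetical shape)."""
    timestamp: float       # seconds
    tx_bytes: int          # bytes uploaded since start
    disk_read_bytes: int   # bytes read from disk since start


def idle_ratio(prev, curr, net_capacity, disk_capacity):
    """Return the fraction of upload capacity still free, in [0.0, 1.0].

    Capacities are in bytes per second. The busier of the two resources
    limits the result (an assumption, not the RFC's formula).
    """
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    net_used = (curr.tx_bytes - prev.tx_bytes) / elapsed / net_capacity
    disk_used = (curr.disk_read_bytes - prev.disk_read_bytes) / elapsed / disk_capacity
    busiest = max(net_used, disk_used)
    # Clamp so bursts above the nominal capacity do not yield negative scores.
    return min(1.0, max(0.0, 1.0 - busiest))
```

A daemon loop would take one `Sample` per interval and push the resulting score to each open SyncHost stream.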

Download Process

  1. When a download starts, PieceCollector registers the scheduled parents in the LRU cache of ParentStateSyncer;
  2. ParentStateSyncer synchronizes the parents' states from each parent's ParentStateServer in the background;
  3. PieceCollector periodically refreshes the states of the parents it follows from ParentStateSyncer;
  4. PieceCollector picks the parent to download from, and the corresponding piece, using the node selection method.
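
The LRU cache used in steps 1 to 3 could look roughly like the following Python sketch. `ParentStateCache` and its methods are hypothetical names, and the choice to refresh recency only on `register` and `get` (not on state updates pushed by a parent) is an assumption of this example.

```python
from collections import OrderedDict


class ParentStateCache:
    """Hypothetical LRU cache mapping a parent host ID to its latest state."""

    def __init__(self, capacity=50):
        self.capacity = capacity
        self._states = OrderedDict()

    def register(self, host_id):
        """Follow a parent; return the evicted host ID, or None if nothing was evicted."""
        if host_id in self._states:
            self._states.move_to_end(host_id)
            return None
        evicted = None
        if len(self._states) >= self.capacity:
            # The first entry is the least recently used one.
            evicted, _ = self._states.popitem(last=False)
        self._states[host_id] = None  # no state received yet
        return evicted

    def update(self, host_id, state):
        """Store a state pushed by a parent; ignore parents that are not followed."""
        if host_id in self._states:
            self._states[host_id] = state

    def get(self, host_id):
        """Return the latest known state and mark the parent as recently used."""
        if host_id not in self._states:
            return None
        self._states.move_to_end(host_id)
        return self._states[host_id]
```

In this sketch, the host ID returned by `register` tells the caller which SyncHost stream can be closed after an eviction.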

Node Selection Method

[Figure: Dragonfly bandwidth-awareness algorithm diagram (draw.io)]

  1. Downloaded piece metadata is stored in a separate queue for each parent;
  2. A random number is drawn and used, together with the parents' states, to select a parent;
  3. [normal case] If the selected parent's queue contains piece metadata, take the first element directly;
  4. [queue empty] If the selected parent's queue is empty, move on to the next parent's queue in order;
  5. [piece finished] If the piece at the head of a queue has already been downloaded, skip it and continue until an undownloaded piece is found or the queue is empty.
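
The five steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the RFC does not say how a parent's state maps to a selection probability, so the sketch assumes each parent has a non-negative score and is picked with probability proportional to it. `select_piece` and its parameters are hypothetical names.

```python
import random
from collections import deque


def select_piece(queues, weights, finished, rng=random):
    """Return (parent_id, piece_number), or None if every queue is exhausted.

    queues:   dict of parent_id -> deque of piece numbers (step 1)
    weights:  dict of parent_id -> non-negative score (assumed: higher is better)
    finished: set of piece numbers that were already downloaded
    """
    parents = list(queues)
    if not parents:
        return None
    parent_weights = [weights.get(p, 0.0) for p in parents]
    # Step 2: weighted random choice; fall back to uniform if all scores are 0,
    # because random.choices rejects weights that sum to zero.
    if sum(parent_weights) > 0:
        start = rng.choices(range(len(parents)), weights=parent_weights)[0]
    else:
        start = rng.randrange(len(parents))
    # Step 4: if the chosen queue yields nothing, try the next parents in order.
    for offset in range(len(parents)):
        parent = parents[(start + offset) % len(parents)]
        queue = queues[parent]
        # Step 5: drop pieces that were already downloaded from another parent.
        while queue and queue[0] in finished:
            queue.popleft()
        if queue:
            # Step 3: take the first element of the queue.
            return parent, queue.popleft()
    return None
```

The caller is expected to add each piece number to `finished` once its download completes, so that duplicates queued under other parents are skipped later.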

Configuration

upload:
  # configuration for HostSyncer
  syncer:
    # enable indicates whether to enable HostSyncer.
    enable: true
    # interval is the interval at which hosts' info is synced.
    interval: 3s
    # cacheCapacity is the capacity of the LRU cache for HostSyncer gRPC connections, default is 50.
    cacheCapacity: 50
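
For illustration, the snippet below shows one way a loader might interpret these settings once the YAML has been parsed into a dictionary. It is a Python sketch, not Dragonfly's actual config code: `parse_duration` and `load_syncer_config` are hypothetical helpers, and only the `cacheCapacity` default of 50 comes from the RFC; the fallbacks for `enable` and `interval` simply mirror the example values above.

```python
import re

# Seconds per supported unit suffix (an assumed subset of duration units).
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text):
    """Convert a string such as "3s" or "500ms" into seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def load_syncer_config(raw):
    """Read the upload.syncer section of an already parsed config dict."""
    syncer = raw.get("upload", {}).get("syncer", {})
    return {
        "enable": bool(syncer.get("enable", True)),  # assumed fallback
        "interval_seconds": parse_duration(str(syncer.get("interval", "3s"))),  # assumed fallback
        "cache_capacity": int(syncer.get("cacheCapacity", 50)),  # default stated in the RFC
    }
```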

API Definition

// SyncHostRequest represents the request of SyncHost.
message SyncHostRequest {
  // Host id.
  string host_id = 1;
  // Peer id.
  string peer_id = 2;
}

// DfdaemonUpload represents the upload service of dfdaemon.
service DfdaemonUpload {
  // SyncHost syncs the parent's state as a stream of host updates.
  rpc SyncHost(SyncHostRequest) returns (stream common.v2.Host);
}
@baowj-678
Author

Action items:

  • API definition, week 1
  • configuration, week 1
  • upload server, week 1
  • parent syncer, week 2
  • piece collector, week 2
  • testing (unit tests, e2e tests, and stress tests), week 3

@CormickKneey
Contributor

Impressive design!
Regarding the part about perceiving changes in parent node state: collecting node metrics is a complex task. Perhaps we could use OpenTelemetry metrics to interface with the data collection daemons that already exist in most production environments, rather than implementing this part ourselves. That way, we would only need to design a mechanism that adjusts the scheduling weight of the current node based on those metrics.

@baowj-678
Author

That is a good suggestion, but we see two issues. First, the node state data our method relies on (such as real-time bandwidth) must be very fresh, which may not be achievable if it is collected through OpenTelemetry. Second, this approach is a basic function of Dragonfly, and depending on OpenTelemetry could introduce excessive dependencies into the project.
