Skip to content

Commit

Permalink
doc: page on DDoS mitigation for the chat
Browse files Browse the repository at this point in the history
  • Loading branch information
darkfi committed Jan 28, 2025
1 parent 0133a33 commit 14e60de
Show file tree
Hide file tree
Showing 2 changed files with 118 additions and 0 deletions.
1 change: 1 addition & 0 deletions doc/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,7 @@
- [DAO](arch/dao.md)
- [DEX](arch/dex.md)
- [Wallet](arch/wallet.md)
- [Chat](arch/chat.md)
- [zkas](zkas/index.md)
- [Writing ZK Proofs](zkas/writing-zk-proofs.md)
- [Bincode](zkas/bincode.md)
Expand Down
117 changes: 117 additions & 0 deletions doc/src/arch/chat.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
# DDoS Mitigation

## Improve Sync Algo

Right now the algo for syncing is very bad. It looks through every available
channel sequentially and performs these steps:

1. get the current tips
2. request their parents
3. repeat

The entire process is the slowest sync algo. Lets think of a better sync algo.

### Graph Locator

```
struct GraphLocator {
tip_slots: Vec<EventHash>
}
```

Each tip slot contains a *sparse* path from an active tip to the current root.
Sparse here means the path is not complete.

The slot is: current tip, a continuous path for the next 5 and then from there,
gaps which double for every event.

When a node receives the graph locator, it can them compare it with its own
graph and immediately see where their graphs both diverge.

It then will sync from the divergence point.

### Phased Algo

Events are increasingly carrying more data. Therefore the graph structure should
be separated from the event data.

```
struct EventHeader {
parents: Vec<EventHash>
blob: EventDataHash
}
```

The sync algo then becomes:

1. **Locate phase:** Our node creates a GraphLocator and sends it to the remote.
The remote uses this to see which events we're missing.
The remote then sends a *capped* flood of Inventory objects.
1. Our node may need to repeat sending the GraphLocator requests since
the remote only sends a capped number of updates.
2. **Header sync phase:** Our node sends GetData request containing lists of the
missing event header hashes. The remote responds with event header objects.
Our node then links these up to build the missing graph structure.
3. **Data sync phase:** Lastly our node walks the graph and for every missing
blob, requests the data. It does this backwards.

### Effect on DDoS

While improved sync is desirable, it still does not mitigate DDoS since the
attacker can easily double their resources. This merely makes nodes more
performant.

## Protocol Restriction

Proposal: add a global rate limit for messages. This is the easiest and most
direct fix we should apply right now.

There is a rolling window of 1 minute with a maximum of 100 events allowed.
Violators get `channel.ban()`.

Use a `VecDeque` of timestamps. `clean_bantimes()` will then do:

```rust
while let Some(ts) = bantimes.front() {
if ts > NOW - 1 minute {
break
}
let _ = bantimes.pop_front()
}
```

Now all events within the list should be within the last minute. Push any new
events to the list, then check:

```rust
if bantimes.len() > N {
channel.ban();
}
```

This is the most immediate mitigation and overdue anyway.

Later we will make the 1 minute and N configurable with the static event graph
admin instance. For now, we can hardcode these values.

## Resource Manager

This will also be a big step towards alleviating DDoS. Check the p2p doc for a
design.

## RLN

The global limit is not ideal since it's... global. However RLN fixes this since
it provides a way for people to not be affected by the rate-limit. Problem
solved.

## Spam

Just regular spam. Solved by admin/mod features and outside the scope of this
doc. We need the static event graph though so we can lock channels down during
times of high activity or modify the posting rate of certain keys (like public
ones).

We already had the first step now with the `/ban` feature and admin keys in the
TOML.

0 comments on commit 14e60de

Please sign in to comment.