Refactor to modules #122

ryardley · 2024-10-01T09:07:06Z

Design has emerged from hacking on the prototype. We should refactor to use actor boundaries as modules. This might take an afternoon. There has been some pushback on the actor model but here is a discussion and a brain-dump around it from my perspective.

Red, Green, Refactor Concept

The concept of Red, Green, Refactor is common in Test-Driven Development (TDD) circles. It refers to the three stages of a software development process that relies on very short development cycles. This concept is also useful when applied to larger projects.

We started with code that didn't work (Red) and our design has now emerged from the prototype and tests (Green). We should now (Refactor) to ensure the code is easy to maintain moving forward.

There is not much to refactor in terms of structure as we have been refactoring as we have been going and have a highly cohesive and loosely coupled system but it is time for the codebase to be split out to form stronger module boundaries.

Good Features of the Current Design

Several positive principles have become apparent::

Event bus facilitating decoupled communication between modules
ActorModel for non-blocking intra-module calls for sync data requests
Individual module testing utilizing the event bus
Easy addition and configuration of functionality by attaching actors to the event bus. Eg. here and here
Code modules focusing on specific, small elements of functionality and business logic
Decoupled modules allow for complex testable peer-to-peer interactions
Modular nature allowing easy configuration and creation of different network roles and binary types
Potential for asynchronous keyshare processing pipeline
Flexibility around failure scenarios
Elements can remain decoupled over the event bus or by utilizing the Recipient struct
Eventbus / Actor model messages appear to form the domain and our logical event boundaries

Issues and next steps

This has been a scramble to get this design together but each actor in the design is pretty well decomposed for the current problem.

Currently most of the emergent modules are contained within the core library and as such it is time to neaten up and refactor out to modules as the boundaries of the modules are clearer.

This happened since as we were iterating we were attempting to isolate the actor system in core to avoid dependency creep just incase we needed to switch out the actor library or try other approaches. This has appeared to have been a poor decision as much of the functionality has moved in core where it should not be really.

It was suggested at one point to use the actor library more as a message passing system however if we were to extract out non-actor bound elements to modules that accept and send messages we simply add a layer of indirection and extra boiler plate as a consequence. Eg.

impl Handler<KeyshareAggregated> for FooActor {
  type Response = Result<()>; 
  fn handler(self, msg:KeyshareAggregated, ..) -> Result<()> {
    self.module.on_keyshare_aggregated(msg)?
  }
}

impl FooModule {
  fn on_keyshare_aggregated(msg:KeyshareAggregated) -> Result<()> {
    //... etc etc
  }
}

NOTE: all this would be extra boilerplate for every event handler

So it makes more sense to share the actor framework and utilize it and it's event handlers as the boundary for each module.

Secondly there is still certain components that could avoid being actors where simple functions could suffice. We have decomposed the evm actors into functions - there are some other examples that were preparing for supervision but could be simpler for now as we have not yet required supervision as we are not yet exploring adversarial

Module Structure

I propose we extract each module like so:

`aggregator`

Concerned with aggregating public keys or plaintexts

PlaintextAggregator - Aggregate decrypted plaintext from decryption shares
PublickeyAggregator - Aggregate the publickey from broadcast keyshares

`core`

Define our core types in use throughout our app

EnclaveEvent - Domain events and their dependencies
E3id
EnclaveError
EventId
EventBus
OrderedSet - Utility that enables us to have an ordered list of keyshares so that the hash of our events remain consistent

`data`

Data writer actor for writing and reading data from the database
Data reader actor for reading data from the database
Sets up a write stream to efficiently stream writes to the database
Can be used by others as receivers in order to substitute database writes

`evm`

Concerned with interfacing between ciphernodes and the evm

Read and write data from the specific enclave evm smart contracts and publish those events to the event bus

`fhe`

Concerned with our homomorphic encryption schemes

Encapsulates all Encryption Scheme Specific stuff
Can act as an inflection point for other FHE encryption schemes currently only supports BFV

`router`

Concerned with the routing of events around our system particularly in regards to E3 requests.

E3RequestRouter
- Dynamically manages initialization and context based on in coming messages around E3
- Filters messages to dependencies based on e3_id
CiphernodeSelector
- Selects the ciphernode based on the sortition module
- Listens for E3Requested events - interacts with the sortition module and broadcasts CiphernodeSelected events

`sortition`

Concerned with determining if a particular node is found within a seeded list

Accumulates list events and responds to requests for inclusion in the set

`enclave`

Entry points for the app

preflight checks
bins with cli parsing and validation for configurations for the configurations in enclave_node

`enclave_node`

Library of entrypoint functions defined for different configuration for various binaries

ciphernode (currently main_ciphernode)
aggregator (currently main_aggregator)

`keyshare`

Concerned with secret key share operations. Uses fhe to do operations.

Keyshare creation
Decryption
Secret key encryption

`logger`

Manages log output

Actor to log stuff passed to the event bus
Can write this info to a log file via tracing crate etc.
Provide unified tracing tools
Eventually setup streaming log options

`p2p`

Concerned with peer to peer networking

P2p integration
Combine the current EnclaveRouter struct

`test_helpers`

Bins and functions that help in tests
Binaries that help us do things and integrate with various parts of the system
pack_e3_params (pack given params to bytes for an e3 request call)
mock_encrypt (encrypting some data to test decryption in the integration test)

`tests`

Fast business logic level tests to test:
- internal consistency of our system
- detailed edge cases
- event malfunctions

`zkp`

Concerned with proving systems we will be using

Other considerations

Actix deprecation

It was inconvenient to learn of Actix's pending deprecation. I don't consider this to be much of an issue but more of an opportunity to remove some of the issues with Actix (mainly poor async support). Actix provides a really concise syntax for handling messages and is a well known and relatively easy to learn API. Other systems like ractor require actors to disambiguate events manually which results in lots of boilerplate which I think it may be worth avoiding if it is not too difficult - it will also be time consuming to port. I have put together experiments which should manifest in a way to swap out Actix without changing much of the good parts of the API that Actix provides us (Handler syntax) as well as change the poor parts (async).

Feel free to have a look here https://github.com/ryardley/tactix

Can we remove actors from the system?

There is a consideration on what a system without actors looks like. We would then want to resort to using naked mspc channels in replacement of actors.

Because we are building a peer 2 peer system we cannot avoid creating an event driven system. Our events will necessarily comprise of the domain model for the system whether we choose to ignore it or not. Thus testing modules with events is mandatory and communicating between modules using events is strongly preferable if not just for testing than for flexibility.

This project is also a startup and we do not have the entire design fleshed out and we need to be able to iterate quickly and spin off combinations of modules to respond to market forces. So flexibility is key to success.

There are good arguments for considering removing the actor model:

Performance considerations - actors and message passing take up a small amount of overhead
Memory: sending lots of large messages between actors takes a large amount of memory especially as the messages we are sending are large.
We can attempt to write more sequential code instead of handling events
Errors might be able to handled in a more straightforward way

However I think what we would get would be these issues:

Poorer or more awkward testability
More highly coupled code
More complex ownership and borrowing handling
A generally more difficult to maintain codebase
More verbose code

Let's look at what a system without an actor model might look like:

Testing individual modules without an event bus would mean we would need to resort to dependency injection / abstract traits - possibly the use of channels to connect modules together as the prime interction. As we get a huge amount of benefit from using an event bus in terms of flexibility test-ability and modularity - it is not worth giving up for an application that has an event driven context. If we remove the actor model from the event bus we would probably want to consider rewriting it as a broadcast channel wrapped in a struct.
Our system would have more ad-hoc patterns as handling Rust's ownership semantics can become inconsistent.
Removing the Actor Model from an event driven system will result more matching / disambiguation of events between modules which will result in more verbose code. This may also be true if we migrate to ractor as that does not handle matching/disambiguation.
In Rust communicating by sharing memory such as a module or library involves utilizing Mutexes and Arcs and managing references which bring inconsistencies and can negatively impact the APIs that modules need to use to communicate.
In order to maintain module test-ability for things that are better modeled as sync calls we would need to resort to using injection, traits and generics - which gets complex fairly quickly with Rust.
Managing state across multiple threads and modules without actors requires to constantly think about synchronization locking and race conditions. Actors abstract away this complexity.
We need to be careful that we keep our approach to concurrency standardized if we choose to remove the actor system

Personally I don't find arguments for removing actors convincing due to having to awkwardly fit event driven patterns into a non event driven architecture as well as this there are a number of benefits we get especially for a specific situation as a technical startup with a senior team working on an event based distributed system.

Actors are a natural fit for distributed event driven systems
They represent a single easy to understand abstraction for message passing between modules.
There are common fault tolerance and isolation patterns we can take advantage of
Actors aid in testing as they act as a natural dependency injection / infection point see Recipient<M> in actix docs.
Actors are particularly ergonomic in Rust as they solve lots of problems with dealing with ownership and borrowing.
We need to stay flexible as we discover what the edges of this system is and respond to the needs of the market. Eg financial usecase - fine grained - subnets coordinating and collaborating. We don't know exactly how this will end.
Actors are scalable and we should be able to handle many concurrent processes on a single node we can work with libp2p to manage multiple sub systems should we require them say for a specific pubsub topic. We could farm out long running actors to their own process or even split work between servers. Probably wont need this but the opportunity is there and is open in the design space.
Actors help reasoning about complexity by isolating complex systems that can handle event propagation or streaming data such as hydrating keystores
As actors are location transparent we can move them between nodes and create roles or subnets within the network as required

The text was updated successfully, but these errors were encountered:

ryardley added enhancement New feature or request Ciphernode Related to the ciphernode package labels Oct 1, 2024

ryardley changed the title ~~Ciphernode Design Consolidation~~ Refactor to modules Oct 1, 2024

ryardley self-assigned this Oct 2, 2024

ryardley mentioned this issue Oct 2, 2024

Modularization #133

Merged

3 tasks

ryardley closed this as completed in #133 Oct 7, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor to modules #122

Refactor to modules #122

ryardley commented Oct 1, 2024 •

edited

Loading

Refactor to modules #122

Refactor to modules #122

Comments

ryardley commented Oct 1, 2024 • edited Loading

Red, Green, Refactor Concept

Good Features of the Current Design

Issues and next steps

Module Structure

aggregator

core

data

evm

fhe

router

sortition

enclave

enclave_node

keyshare

logger

p2p

test_helpers

tests

zkp

Other considerations

Actix deprecation

Can we remove actors from the system?

ryardley commented Oct 1, 2024 •

edited

Loading

`aggregator`

`core`

`data`

`evm`

`fhe`

`router`

`sortition`

`enclave`

`enclave_node`

`keyshare`

`logger`

`p2p`

`test_helpers`

`tests`

`zkp`