RPC Proposal #177

Open
mattiekat opened this issue Jan 12, 2022 · 7 comments · Fixed by #183 or #173
Labels
enhancement New feature or request

Comments

@mattiekat
Contributor

mattiekat commented Jan 12, 2022

Goals

  1. Minimize boilerplate
    • Calling across RPC should be almost the same as calling local functions
    • If something will always be the same for every implementation, we should provide it; if it will usually be the same, we should consider including it as a separate package
  2. Be wicked fast
    • minimize round trips
    • keep packet overhead small
    • use bebop for fast serialization 🚀
  3. KISS

Basic Features

Inspirations

  • JSON-RPC - Dead simple stateless RPC
  • gRPC - Feature rich, widely adopted, and moderately performant
  • Cap'n Proto - Cool, but they even have to describe levels of features because of how complicated it is
  • GraphQL - A way to make HTTP calls slower

Paradigm Decisions

(Bold items we plan to implement, non-bold items we considered).

  • Stateful OR stateless
  • Abstract transport protocol OR specific transport protocol - If we don't assume a given transport protocol we need to decide what features it should bring to the table (e.g. lossy/guaranteed, multichannel/singlechannel, encryption, compression, heartbeats, ...)
    • Reliable socket like transport
    • Encrypted
    • Not Ordered but we don't re-order for the user
    • Not compressed (we might want to support compression of messages)
    • Heartbeats/we know when disconnects happen
  • Static API OR Object oriented - Cap'n proto has a way of returning an "object" which can then be used for future method calls and they use it as a method of scoping as well (e.g. all auth requiring functions can only be reached after authenticating which returns an object). (Object in quotes because it could just be an ID value).

Possible Features:

(Bold items we plan to implement, non-bold items we considered).

  1. Function calls - basic takes parameters and has return
  2. Streamed return - takes parameters and then allows for an async stream of response objects
    • Second priority
    • Might not be useful at all since main RPC channel can coordinate separate sockets being connected for subscriptions
  3. Streamed args - one or more arguments are streamed and when done a single response is sent
  4. Call cancellation - e.g. being able to stop waiting for a response if it is taking too long and to notify the remote party
    • Required with stream responses
    • Maybe a nice to have for function calls?
  5. Server side call pipelining - Cap'n Proto's claim to fame, which allows the caller to describe a series of RPC calls without needing each response to return to them before the next request can be made
  6. Request batching - multiple RPCs in one operation
  7. Signature checking - ensuring that the caller and callee do not disagree on the protocol specs (signatures generated by types only, not names)
  8. Composable signatures - allows checking if a signature is compatible using XORs of sub-components in messages and unions.
  9. Version negotiation - attempting to run even if versions are different based on rules regarding compatibility
  10. Deadlines - allow the caller to specify a maximum amount of time after which it will no longer care about a response
  11. P2P - allow both parties to make calls on the same transport channel

Schema

RPC introduces a new keyword service which allows defining a set of functions. Services have function definitions which express what data is expected and returned. They also define the opcode to function name mapping (names are used in the code, opcodes in the serialized form).

Restrictions:

  • The 0 opcode is reserved for string serviceName()
  • The 0xfff0-0xffff opcodes are reserved for future uses
  • Opcode is a uint16
  • A channel endpoint can only support one service, but multiple services can be defined in bebop. One channel can therefore have two services A←→B but each side can only call the one the opposing side is responding to.
  • Functions may only return one value
  • Function names must be unique
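
The restrictions above could be checked by the compiler roughly as follows. This is only a sketch: `FunctionDef`, `SchemaError`, and `validate_service` are hypothetical names that do not appear in the proposal, and rejecting duplicate opcodes is an assumption inferred from the opcode to function name mapping rather than something the list states.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory form of one function entry in a `service` block.
pub struct FunctionDef {
    pub opcode: u16,
    pub name: &'static str,
}

#[derive(Debug, PartialEq)]
pub enum SchemaError {
    ReservedOpcode(u16),
    DuplicateOpcode(u16),
    DuplicateName(&'static str),
}

/// First opcode of the range reserved for future use (0xfff0-0xffff).
const RESERVED_START: u16 = 0xfff0;

/// Check the opcode and naming restrictions for one service.
pub fn validate_service(functions: &[FunctionDef]) -> Result<(), SchemaError> {
    let mut opcodes = HashSet::new();
    let mut names = HashSet::new();
    for f in functions {
        // Opcode 0 belongs to the built-in serviceName() call.
        if f.opcode == 0 || f.opcode >= RESERVED_START {
            return Err(SchemaError::ReservedOpcode(f.opcode));
        }
        // Assumption: two functions may not share an opcode.
        if !opcodes.insert(f.opcode) {
            return Err(SchemaError::DuplicateOpcode(f.opcode));
        }
        if !names.insert(f.name) {
            return Err(SchemaError::DuplicateName(f.name));
        }
    }
    Ok(())
}
```

Gaps in the opcode sequence (such as a retired opcode 2) pass this check, which matches the idea that old functions can be dropped without renumbering.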

Definitions:

  • Functions have an opcode, name, arguments, and return values. Both arguments and return types automatically define a struct. Generated code should strive to elide the argument/return wrapper structs as appropriate for the language.
  • Argument structs are automatically defined (see protocol)
  • Return structs are automatically defined (see protocol)
  • void is a special return type which indicates no bytes and declares an empty struct.
  • Functions which accept no arguments declare an empty struct.
// User types (here for example of generating signatures)
struct User {
  guid id;
  date joined;
  string first;
  string last;
}

message DoTheThing {
  1 -> uint32[] items;
  2 -> string thing;
}

union ThingsDone {
  1 -> struct { byte thing; }
  2 -> message { 4 -> uint64 other; }
}

// RPC definitions
service HelloService {
  /*
    Authenticate with the server. Stateful connection so no need to pass a token
    around.
  */
  1 -> void authenticate(string username, string password);
  // 2 perhaps was an old function that is no longer supported.
  /* Retrieve information about the current user. Requires being authenticated. */
  3 -> User getUserDetails();
  4 -> ThingsDone doTheThing(DoTheThing myThing, uint32 limit, string msg);
}

Protocol

The language-agnostic components which can be written in bebop.

Static Structures

These do not change based on the user schema and should be included in bebopc as a text string which can be generated on demand for the needed languages.

Headers

/* Static RPC request header used for all request datagrams. */
readonly struct RpcRequestHeader {
  /*
    Identification for the caller to identify responses to this request.
    
    The caller should ensure ids are always unique at the time of calling. Two active
    calls with the same id is undefined behavior. Re-using an id that is not currently
    in-flight is acceptable.

    These are unique per connection.
  */
  uint16 id;

  /*
    How long in seconds this request is allowed to be processed for. A value of 0
    indicates no limit.
    
    This allows for potentially long queries to be cancelled if the requester will no
    longer be interested and more importantly allows unreliable transports to establish
    an agreed upon point at which the requester is going to assume the packet was lost
    even if it just had yet to be sent.
    
    By using a max-time to compute rather than an expiration time, we reduce the risk
    of different system times causing confusion. Though there will be some overlap
    where a response may be sent and ignored as the requester already considers it
    to be expired. 
  */
  uint16 timeout;

  /*
    Function signature includes information about the args to ensure the caller and
    callee are referencing precisely the same thing. There is a non-zero risk of
    accidental signature collisions, but 32-bits is probably sufficient for peace of
    mind.
    
    I did some math, about a 26% chance of collision using 16-bits assuming 200 unique
    RPC calls which is pretty high, or <0.0005% chance with 32-bits.
  */
  uint32 signature;
}
/* Static RPC response header used for all response datagrams. */
readonly struct RpcResponseHeader {
  /* The caller-assigned identifier */
  uint16 id;
}
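
The collision figures quoted in the `signature` comment can be reproduced with the standard birthday-problem product. The sketch below assumes signatures behave like uniformly random values, which is an idealization; `collision_probability` is a name made up for this example.

```rust
/// Probability that at least two of `calls` uniformly random signatures
/// collide in a space of 2^bits possible values (the birthday problem).
pub fn collision_probability(calls: u32, bits: u32) -> f64 {
    let space = 2f64.powi(bits as i32);
    // Probability that every signature drawn so far is distinct.
    let mut all_distinct = 1.0_f64;
    for i in 0..calls {
        all_distinct *= 1.0 - f64::from(i) / space;
    }
    1.0 - all_distinct
}
```

For 200 calls this comes out at roughly 0.26 with 16 bits and a little under 0.000005 (0.0005%) with 32 bits, in line with the numbers in the comment.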

Null Service

This should be hardcoded and not be generated from static bebop, see Implementation.

Datagram

/*
  All data sent over the transport MUST be represented by this union.
  
  Note that data is sent as binary arrays to avoid depending on the generated structure
  definitions that we cannot know in this context. Ultimately the service will be
  responsible for determining how to interpret the data.
*/
union RpcDatagram {
  1 -> struct RpcRequestDatagram {
    RpcRequestHeader header;
    /* The function that is to be called. */
    uint16 opcode;
    /* Callee can decode this given the opcode above. */
    byte[] request;
  }
  2 -> struct RpcResponseOk {
    RpcResponseHeader header;
    /* Caller can decode this given the id in the header. */
    byte[] data;
  }
  3 -> struct RpcResponseErr {
    RpcResponseHeader header;
    /*
      User is responsible for defining what code values mean. These codes denote
      errors that happen only once user code is being executed and are specific
      to each domain.
    */
    uint32 code;
    /* An empty string is acceptable */
    string info;
  }
  /* Default response if no handler was registered. */
  0xfc -> struct CallNotSupported {
    RpcResponseHeader header;
  }
  /* Function id was unknown. */
  0xfd -> struct RpcResponseUnknownCall {
    RpcResponseHeader header;
  }
  /* The remote function signature did not agree with the expected signature. */
  0xfe -> struct RpcResponseInvalidSignature {
    RpcResponseHeader header;
    /* The remote function signature */
    uint32 signature;
  }
  /*
    A message received by the other end was unintelligible. This indicates a
    fundamental flaw with our encoding and possible bebop version mismatch.

    This should never occur between proper implementations of the same version.
  */
  0xff -> struct DecodeError {
    /* Information provided on a best-effort basis. */
    RpcResponseHeader header;
    string info;
  }
}
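
Because responses carry only the caller-assigned id, the caller has to remember which opcode each in-flight id belongs to before it can decode `data`. One way this bookkeeping might look is sketched below. `PendingCalls` and its methods are hypothetical and not part of the proposal; the sketch also treats the header `timeout` as seconds measured from the moment of sending, with 0 meaning no limit.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Caller-side bookkeeping for in-flight requests (hypothetical helper).
pub struct PendingCalls {
    next_id: u16,
    /// call id -> (opcode, optional local deadline)
    in_flight: HashMap<u16, (u16, Option<Instant>)>,
}

impl PendingCalls {
    pub fn new() -> Self {
        Self { next_id: 0, in_flight: HashMap::new() }
    }

    /// Reserve an id that no in-flight call is using.
    /// Returns None when all 65536 ids are currently in flight.
    pub fn begin(&mut self, opcode: u16, timeout_secs: u16, now: Instant) -> Option<u16> {
        if self.in_flight.len() > usize::from(u16::MAX) {
            return None;
        }
        // Ids may be reused, but only once the earlier call is no longer in flight.
        while self.in_flight.contains_key(&self.next_id) {
            self.next_id = self.next_id.wrapping_add(1);
        }
        let id = self.next_id;
        self.next_id = self.next_id.wrapping_add(1);
        let deadline = (timeout_secs != 0)
            .then(|| now + Duration::from_secs(u64::from(timeout_secs)));
        self.in_flight.insert(id, (opcode, deadline));
        Some(id)
    }

    /// Match a response to its request; yields the opcode needed to decode the payload.
    pub fn complete(&mut self, id: u16) -> Option<u16> {
        self.in_flight.remove(&id).map(|(opcode, _)| opcode)
    }

    /// Drop calls whose deadline has passed and return their ids.
    pub fn expire(&mut self, now: Instant) -> Vec<u16> {
        let expired: Vec<u16> = self
            .in_flight
            .iter()
            .filter(|(_, (_, deadline))| deadline.map_or(false, |d| d <= now))
            .map(|(id, _)| *id)
            .collect();
        for id in &expired {
            self.in_flight.remove(id);
        }
        expired
    }
}
```

A response whose id is no longer in the table (because it expired or was never issued) makes `complete` return `None`, which covers the overlap case where a late response arrives after the requester has given up.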

Dynamic Structures

These are not actually written in bebop syntax anywhere, however, the AST structure would be generated and used to produce the language-appropriate structures. So for readability they are presented here as bebop.

It is preferable to avoid using a union here because it would create either another layer of size and opcode on top of what is already present OR it would create a dependency between the static definition and the definitions for each and every service leading to significantly more code generation.

Return structs don't strictly add a ton of value, but they make the code more symmetric and, in theory, a little easier for the generator to spit out. We may opt to remove them later. The same goes for argument structs with zero or one items.

// md5("(string,string)(void)") = b51ddc223579c70014a1e23c98329b08
const uint32 _HelloServiceAuthenticateSignature = 0xb51ddc22;
// md5("(void)((guid,date,string,string))") = cc627ee5abfcca0211acdf9716a70854
const uint32 _HelloServiceGetUserDetailsSignature = 0xcc627ee5;
// md5("(message{1:uint32[],2:string},uint32,string)(union{1:(byte),2:message{4:uint64}},uint64)") = 0836646b276d1768e0924c99dcdaca78
const uint32 _HelloServiceDoTheThingSignature = 0x0836646b;

struct _HelloServiceAuthenticateArgs {
  string username;
  string password;
}

struct _HelloServiceAuthenticateReturn {}

struct _HelloServiceGetUserDetailsArgs {}

struct _HelloServiceGetUserDetailsReturn {
  User value;
}

struct _HelloServiceDoTheThingArgs {
  DoTheThing myThing;
  uint32 limit;
  string msg;
}

struct _HelloServiceDoTheThingReturn {
  ThingsDone value;
}

Standard Components

These are the same for all services, so they can be hardcoded (or produced by special-case code that makes them semi-dynamic). The signature needs to be identical across all services to allow cross-service name checks. This is the only function which should be guaranteed to work no matter what service is on the other end.

// md5("(void)(string)") = 1bf832690ab97e3599a46d2b08739140
const uint32 _HelloServiceNameSignature = 0x1bf83269;

// optionally could leave this struct out since it is a special case anyway
struct _HelloServiceNameArgs {}

struct _HelloServiceNameReturn {
  string serviceName;
}

Signatures

Signatures exist to ensure the binary data sent between the peers is interpretable. They should therefore not include anything that can change without altering the binary representation, if at all possible.

We already ensure the sizes line up with what is expected; this is a good check for ensuring we don't end up reading invalid memory, but it is not enough to catch many possible errors. Using signatures means we can throw a signature error rather than a generic decode error. They are designed to catch human mistakes and prevent possible data corruption in a database. Signatures are not needed to ensure the protocol itself remains safe.

There are cases where we would have been able to correctly read the data even if the signature does not match such as if a new field is added to a union or a message since bebop is required to be forward compatible for both. There is a solution to this using composable hashes, but it is a feature we will need to revisit later if we decide it is needed.

The example strings above are probably not what we will end up generating, but they give an idea of what needs to be captured and one way it could be done. We may also want to include these raw signature strings as constants so they can be included in error messages.
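
To make the idea concrete, here is one way signature strings in the style of the examples could be derived from types alone, ignoring field and function names. Everything in it is illustrative: `Ty`, `render`, and `signature_string` are invented for this sketch, and because the Rust standard library has no MD5, a 32-bit FNV-1a hash stands in for "first 32 bits of the MD5" so the block stays dependency-free. The resulting numbers therefore will not match the constants listed earlier.

```rust
/// Minimal type descriptor; field names are deliberately absent.
pub enum Ty {
    Prim(&'static str),
    Array(Box<Ty>),
    Struct(Vec<Ty>),
    Message(Vec<(u8, Ty)>),
    Union(Vec<(u8, Ty)>),
}

/// Render index-tagged members as "1:type,2:type".
fn indexed(fields: &[(u8, Ty)]) -> String {
    let parts: Vec<String> = fields
        .iter()
        .map(|(index, ty)| format!("{}:{}", index, render(ty)))
        .collect();
    parts.join(",")
}

pub fn render(ty: &Ty) -> String {
    match ty {
        Ty::Prim(name) => name.to_string(),
        Ty::Array(inner) => format!("{}[]", render(inner)),
        Ty::Struct(fields) => {
            let parts: Vec<String> = fields.iter().map(render).collect();
            format!("({})", parts.join(","))
        }
        Ty::Message(fields) => format!("message{{{}}}", indexed(fields)),
        Ty::Union(branches) => format!("union{{{}}}", indexed(branches)),
    }
}

/// Build "(args)(return)", using "void" for an empty side.
pub fn signature_string(args: &[Ty], ret: Option<&Ty>) -> String {
    let args = if args.is_empty() {
        "void".to_string()
    } else {
        args.iter().map(render).collect::<Vec<_>>().join(",")
    };
    let ret = ret.map_or("void".to_string(), render);
    format!("({})({})", args, ret)
}

/// 32-bit FNV-1a, used here only as a stand-in for a truncated MD5.
pub fn fnv1a32(data: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for byte in data {
        hash ^= u32::from(*byte);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}
```

With this scheme, two functions that differ only in parameter names produce the same string and therefore the same signature, while any change to a type or to a message index changes it.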

Implementation

The language dependent components which connect with the generated code and make it function.

The following pseudocode is written in Rust as it is the most expressive of the languages we are using. The real implementation will be somewhat different as things come up. This is a template which hopes to set forth the logical structures and to pin down naming choices.

The bebop structs mentioned in the protocol section are assumed to be generated and will not be re-listed here.

Static Runtime

Building Blocks

/// Transport protocol has a few main responsibilities:
/// 1. interpreting the raw stream as datagrams
/// 2. automatically reconnecting and dealing with network issues
/// 3. deciding how it wants to handle recv futures
trait TransportProtocol {
  fn set_handler(&mut self, recv: fn(datagram: RpcDatagram) -> Future<Output=()>);
  async fn send(&self, datagram: &RpcDatagram) -> Result<(), TransportError>;

  async fn send_decode_error_response(&self, call_id: u16, info: Option<&str>) -> TransportResult {
    // ...
  }
}

/// The local end of the pipe handles messages.
/// Implementations are generated from bebop service definitions.
trait ServiceHandlers {
  /// Use opcode to determine which function to call, whether the signature matches,
  /// how to read the buffer, and then convert the returned values and send them as a
  /// response
  async fn _recv_call(&self, opcode: u16, sig: u32, call_id: u16, buf: &[u8]);
}

/// Wrappers around the process of calling remote functions.
/// Implementations are generated from bebop service definitions.
trait ServiceRequests {
  const NAME: &'static str;
}

Router

/// This is the main structure which represents information about both ends of the
/// connection and maintains the needed state to make and receive calls. This
/// is the only struct of which an instance should need to be maintained by the user.
struct Router<P: TransportProtocol, L: ServiceHandlers, R: ServiceRequests> {
  /// Underlying transport
  transport: P,
  /// Local service handles requests from the remote.
  local_service: L,
  /// Remote service converts requests from us, so this also provides the callable RPC
  /// functions.
  remote_service: R,
}

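/// Allows passthrough of function calls to the remote
impl<P: TransportProtocol, L: ServiceHandlers, R: ServiceRequests> Deref for Router<P, L, R> {
  type Target = R;
  fn deref(&self) -> &R { &self.remote_service }
}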

impl Router {
  fn new(...) -> Self { ... }
  
  /// Receive a datagram and routes it
  async fn _recv(&self, datagram: RpcDatagram) {
    self.local_service._recv_call(...).await
  }

  /// Send a request
  async fn _send_request(&self, call_id: u16, buf: &[u8]) -> TransportResult {}

  /// Send a response to a call
  async fn _send_response(&self, call_id: u16, data: &[u8]) -> TransportResult {}

  async fn _send_error_response(&self, call_id: u16, code: u32, msg: Option<&str>) -> TransportResult {}
  async fn _send_unknown_call_response(&self, call_id: u16) -> TransportResult {}
  async fn _send_invalid_sig_response(&self, call_id: u16, expected_sig: u32) -> TransportResult {}
  async fn _send_call_not_supported_response(&self, call_id: u16) -> TransportResult {}
  async fn _send_decode_error_response(&self, call_id: u16, info: Option<&str>) -> TransportResult {
    self.transport.send_decode_error_response(...).await
  }
}

Error Handling

enum TransportError {
  // Things that could go wrong with the underlying transport, need it to be
  // somewhat generic. Things like the internet connection dying would fall
  // under this.
}

type TransportResult = Result<(), TransportError>;

/// Errors that the local may return when responding to a request.
enum LocalRpcError {
  CustomError(u32, String),
  CustomErrorStatic(u32, &'static str),
  NotSupported,
}

/// Response type that is returned locally and will be sent to the remote.
type LocalRpcResponse<T> = Result<T, LocalRpcError>;

/// Errors that can be received when making a request of the remote.
enum RemoteRpcError {
  TransportError(TransportError),
  CustomError(u32, Option<String>),
  NotSupported,
  UnknownCall,
  /// When the received datagram has a union branch we don't know about.
  UnknownResponse,
  InvalidSignature(u32),
  CallNotSupported,
  DecodeError(Option<String>)
}

/// A response on the channel from the remote.
type RemoteRpcResponse<T> = Result<T, RemoteRpcError>;
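
As a rough illustration of how the response branches of `RpcDatagram` might map onto `RemoteRpcError`, consider the sketch below. `Response` is a simplified stand-in for the datagram union with headers omitted, `into_result` is a hypothetical helper, and treating an empty `info` string as `None` is an assumption rather than something the proposal specifies. The enum is redeclared locally (without the transport variant) so the block compiles on its own.

```rust
/// Simplified stand-in for the response branches of RpcDatagram (headers omitted).
pub enum Response {
    Ok(Vec<u8>),
    Err { code: u32, info: String },
    CallNotSupported,
    UnknownCall,
    InvalidSignature(u32),
    DecodeError(String),
    /// A union branch this peer does not know about.
    Unknown,
}

/// Local copy of the error enum, minus the transport variant.
#[derive(Debug, PartialEq)]
pub enum RemoteRpcError {
    CustomError(u32, Option<String>),
    CallNotSupported,
    UnknownCall,
    UnknownResponse,
    InvalidSignature(u32),
    DecodeError(Option<String>),
}

/// Assumption: an empty info string carries no information.
fn non_empty(s: String) -> Option<String> {
    if s.is_empty() { None } else { Some(s) }
}

/// Turn a received response into the still-encoded payload or a typed error.
pub fn into_result(resp: Response) -> Result<Vec<u8>, RemoteRpcError> {
    match resp {
        Response::Ok(data) => Ok(data),
        Response::Err { code, info } => Err(RemoteRpcError::CustomError(code, non_empty(info))),
        Response::CallNotSupported => Err(RemoteRpcError::CallNotSupported),
        Response::UnknownCall => Err(RemoteRpcError::UnknownCall),
        Response::InvalidSignature(sig) => Err(RemoteRpcError::InvalidSignature(sig)),
        Response::DecodeError(info) => Err(RemoteRpcError::DecodeError(non_empty(info))),
        Response::Unknown => Err(RemoteRpcError::UnknownResponse),
    }
}
```

The generated request wrappers would then decode the `Ok` bytes into the function's return struct, so user code only ever sees `RemoteRpcResponse<T>`.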

Implementations for Static Parts

These may be worth hardcoding rather than trying to implement by generating from static bebop.

/// A service used when one end of the channel does not offer any callable endpoints.
/// You can also use a NullService for a remote which is any other service and it will
/// mask it, making it impossible to call, but also not causing any errors.
struct NullService;

impl ServiceHandlers for NullService {
  async fn service_name(&self) -> LocalRpcResponse<&str> { Ok("NullService") }
}

impl ServiceRequests for NullService {
  async fn service_name(&self) -> RemoteRpcResponse<String> { Ok("NullService".to_string()) }
}

Generated Code

This generated code is separate from what is currently created by bebop and cannot be made simply by leveraging the AST. In its implementation, it will reference types that were generated more classically and are described in protocol.

Service Definitions

/// The local handlers for the service
trait HelloServiceHandlers {
  async fn service_name(&self) -> LocalRpcResponse<&str> { Ok("HelloService") }
  async fn authenticate(&self, username: &str, password: &str) -> LocalRpcResponse<()> { Err(NotSupported) }
  async fn get_user_details(&self) -> LocalRpcResponse<User> { Err(NotSupported) }
  async fn do_the_thing(&self, arg1: DoTheThing, arg2: u32, arg3: &str) -> LocalRpcResponse<ThingsDone> { Err(NotSupported) }
}

impl ServiceHandlers for HelloServiceHandlers {
  const NAME: &'static str = "HelloService";

  async fn _recv_call(&self, opcode: u16, sig: u32, call_id: u16, buf: &[u8]) {
    /* generated routing and stuff */
  }
}

/// Wrapper around the remote functions we can call.
struct HelloServiceRequests;
impl HelloServiceRequests {
  async fn service_name(&self) -> RemoteRpcResponse<String> {}
  async fn authenticate(&self, username: &str, password: &str) -> RemoteRpcResponse<()> {}
  async fn get_user_details(&self) -> RemoteRpcResponse<User> {}
  async fn do_the_thing(&self, myThing: DoTheThing, limit: u32, msg: &str) -> RemoteRpcResponse<ThingsDone> {}
}

impl ServiceRequests for HelloServiceRequests {
  const NAME: &'static str = "HelloService";
}
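
The "generated routing" inside `_recv_call` could begin with a decision step like the one below, taken before any argument bytes are decoded. This is a synchronous sketch of one possible shape, not the planned generator output: `Dispatch` and `route` are hypothetical names, and the signature constants are copied from the examples earlier in this proposal.

```rust
/// What the generated `_recv_call` could decide before touching user code.
#[derive(Debug, PartialEq)]
pub enum Dispatch {
    ServiceName,
    Authenticate,
    GetUserDetails,
    DoTheThing,
    /// No function is registered for this opcode.
    UnknownCall,
    /// Carries the signature this side expected, to be echoed to the caller.
    InvalidSignature(u32),
}

// Example signature values from the Dynamic Structures and Standard Components sections.
const NAME_SIG: u32 = 0x1bf83269;
const AUTHENTICATE_SIG: u32 = 0xb51ddc22;
const GET_USER_DETAILS_SIG: u32 = 0xcc627ee5;
const DO_THE_THING_SIG: u32 = 0x0836646b;

/// Pick a handler from the opcode, then verify the caller's signature.
pub fn route(opcode: u16, sig: u32) -> Dispatch {
    let (expected, target) = match opcode {
        0 => (NAME_SIG, Dispatch::ServiceName),
        1 => (AUTHENTICATE_SIG, Dispatch::Authenticate),
        // Opcode 2 is retired in the example schema, so it falls through.
        3 => (GET_USER_DETAILS_SIG, Dispatch::GetUserDetails),
        4 => (DO_THE_THING_SIG, Dispatch::DoTheThing),
        _ => return Dispatch::UnknownCall,
    };
    if sig == expected {
        target
    } else {
        Dispatch::InvalidSignature(expected)
    }
}
```

Checking the opcode first means an unknown function is reported as `UnknownCall` even when the signature is also wrong, which gives the caller the more useful of the two errors.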

User Code

  1. Define the local service implementation
  2. Define/import transport implementation
  3. Create tokio runtime
  4. Create a Router instance + transport + local service
  5. Begin making calls
/// This is the user's service and it may contain state. They are then able to implement
/// all of the handlers however they want.
struct HelloService;
impl HelloServiceHandlers for HelloService {
  // ..
}

struct WebsocketTransport {
  // magic for now
}

impl TransportProtocol for WebsocketTransport {
  // ...
}

#[tokio::main]
async fn main() {
  // make a router for the "client" which can call HelloService but does not accept any
  // calls from the remote endpoint.
  let router = Router::new(WebsocketTransport::new(), NullService, HelloServiceRequests);
  
  // deref is implemented, so we can simply call its functions 
  assert_eq!(router.service_name().await.unwrap(), "HelloService");
  
  router.authenticate("someperson", "somepassword").await.unwrap();
  println!("{:?}", router.get_user_details().await);
  let done = router.do_the_thing(DoTheThing {}, 1234, "blah").await.unwrap();
  // ...
}
@mattiekat mattiekat added the enhancement New feature or request label Jan 12, 2022
@mattiekat mattiekat self-assigned this Jan 12, 2022
This was referenced Jan 12, 2022
@mattiekat
Contributor Author

Posting this after internal discussion to open it for community comment. We have begun making progress on #173.

@andrewmd5
Contributor

I am not a big fan of "void is a special argument/return type which indicates no bytes and declares an empty struct. A function which accepts no arguments must specify void without an identifier."

As a developer if my method doesn't require arguments I would simply write getUserDetails()

@mattiekat
Contributor Author

> I am not a big fan of "void is a special argument/return type which indicates no bytes and declares an empty struct. A function which accepts no arguments must specify void without an identifier."
>
> As a developer if my method doesn't require arguments I would simply write getUserDetails()

Proposal updated

@mattiekat
Contributor Author

mattiekat commented Jan 26, 2022

Tracking of progress available here: https://github.com/RainwayApp/bebop/projects/2

@mattiekat
Contributor Author

From standup:

  • Support synchronous handlers via an attribute flag or by declaring sync/async with a keyword
  • Support the sender asking for a response OR saying it does not expect one (maybe ID 0 == no response)
  • Integration tests using RainwaySDK as the transport

This was linked to pull requests Mar 1, 2022
@bengreenier bengreenier removed their assignment Mar 1, 2022
@mattiekat
Contributor Author

Added a wiki page to be an introduction
https://github.com/RainwayApp/bebop/wiki/RPC

@mattiekat mattiekat removed their assignment May 6, 2022
@sgf

sgf commented Aug 29, 2022

maybe (Publisher/Subscriber and Callee/Caller) support ?
