protobuf-playground

A collection of experiments with Protobuf.

Here's a brief description of the sub-modules.

  1. Module parent is a common place for managing dependencies and plugins.

  2. Module domain-definition contains the protobuf definition of our sample domain model (bank.proto). It produces a jar that contains the proto file and its compiled binary descriptor (bank.protobin). All other modules depend on it.

  3. Module domain-with-converters defines our own domain model classes that correspond to the proto file, and also generates domain model and marshalling code using Google's protoc tool. A hand-coded conversion layer then converts between the generated classes and our classes in both directions, so that an application can use its own domain classes instead of the protobuf-generated code.

    This is an attempt to show how cumbersome it is to use your own domain model classes with Google's library. If you can live with the code generated by protobuf's tools then by all means use it directly! But do not try to impose your own domain classes on top of it: the result is heavy on boilerplate and inefficient, and it pollutes your application with an extra set of classes.
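The cost of such a conversion layer is easier to see in code. The sketch below is a minimal illustration, not this module's actual code: `AccountProto` is a hand-written stand-in for a protoc-generated class, and the `Account` message with its `id` and `description` fields is an assumption made for the example.

```java
// Stand-in for a protoc-generated class (assumption: the real class would be
// generated from bank.proto; the message name and fields here are made up).
final class AccountProto {
    private final int id;
    private final String description;

    private AccountProto(int id, String description) {
        this.id = id;
        this.description = description;
    }

    int getId() { return id; }
    String getDescription() { return description; }

    static Builder newBuilder() { return new Builder(); }

    static final class Builder {
        private int id;
        private String description = "";

        Builder setId(int id) { this.id = id; return this; }
        Builder setDescription(String description) { this.description = description; return this; }
        AccountProto build() { return new AccountProto(id, description); }
    }
}

// The application's own domain class (hypothetical).
final class Account {
    private final int id;
    private final String description;

    Account(int id, String description) {
        this.id = id;
        this.description = description;
    }

    int getId() { return id; }
    String getDescription() { return description; }
}

// Hand-written conversion layer: one pair of methods per message type.
final class AccountConverter {
    private AccountConverter() {}

    // Copies every field from the domain object into a new generated object.
    static AccountProto toProto(Account account) {
        return AccountProto.newBuilder()
                .setId(account.getId())
                .setDescription(account.getDescription())
                .build();
    }

    // Copies every field from the generated object into a new domain object.
    static Account fromProto(AccountProto proto) {
        return new Account(proto.getId(), proto.getDescription());
    }
}
```

Every message type needs its own pair of such methods in addition to a second class, and every conversion copies the whole object before or after marshalling.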

  4. Module domain-with-interfaces attempts to define a domain model made of user-provided classes that implement interfaces generated automatically by a tool from the protobuf binary descriptor. The tool also generates the marshalling code, which interacts with the model through those interfaces.

    This gives users the flexibility of providing their own domain model without having to write marshalling code (it is generated). But users are still forced to implement generated interfaces that they cannot control directly, and code generation is an extra build-time step, a drawback that we would like to avoid if possible.

    Still, speed-wise and memory-wise, this approach is the best, on par with Google's generated code.
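As a rough illustration of this split between generated and user-written code, the sketch below shows what the two sides might look like. All names (`AccountFields`, `MyAccount`, `AccountFieldsMarshaller`) and the field numbers are assumptions for the example, not the output of the actual tool. The marshaller writes the standard protobuf wire format by hand and assumes a non-negative `id`.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical interface that a generator might emit from the binary descriptor.
interface AccountFields {
    int getId();
    void setId(int id);
    String getDescription();
    void setDescription(String description);
}

// User-provided domain class; its only obligation is to implement the interface.
final class MyAccount implements AccountFields {
    private int id;
    private String description = "";

    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public String getDescription() { return description; }
    public void setDescription(String description) { this.description = description; }
}

// Hypothetical generated marshaller: it sees the model only through the interface.
final class AccountFieldsMarshaller {
    private AccountFieldsMarshaller() {}

    static byte[] write(AccountFields account) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((1 << 3) | 0);                 // field 1, wire type 0 (varint)
        writeVarint(out, account.getId());       // assumes id >= 0
        byte[] text = account.getDescription().getBytes(StandardCharsets.UTF_8);
        out.write((2 << 3) | 2);                 // field 2, wire type 2 (length-delimited)
        writeVarint(out, text.length);
        out.write(text, 0, text.length);
        return out.toByteArray();
    }

    // Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte.
    private static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }
}
```

Because the marshaller reads fields straight from the user's object, no intermediate copy is needed, which is consistent with the performance remark above.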

  5. Module protostream provides a generic marshaller that uses the protobuf definition at runtime rather than relying on statically generated marshalling code. Users are free to bring their own domain model to the party, but they have to create and register a MessageMarshaller implementation for each domain class. The MessageMarshaller interacts with a reader/writer object that looks much like Java's DataInput/DataOutput interfaces, which is why this approach is called protostream.
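To give a feel for the programming model, here is a minimal sketch of a marshaller written against a DataOutput-like writer and a DataInput-like reader. The interfaces `FieldWriter`, `FieldReader` and `SketchMarshaller` are hypothetical and do not reproduce the module's actual MessageMarshaller signatures. `MapBackedStream` is an in-memory stand-in; a real implementation would encode to the protobuf wire format. Registration is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical writer, loosely modelled on java.io.DataOutput but keyed by field name.
interface FieldWriter {
    void writeInt(String fieldName, int value);
    void writeString(String fieldName, String value);
}

// Hypothetical reader, loosely modelled on java.io.DataInput but keyed by field name.
interface FieldReader {
    int readInt(String fieldName);
    String readString(String fieldName);
}

// Hypothetical marshaller contract: one implementation per domain class.
interface SketchMarshaller<T> {
    Class<T> javaClass();
    void writeTo(FieldWriter writer, T value);
    T readFrom(FieldReader reader);
}

// Plain user-owned domain class: no generated code, no generated interfaces.
final class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The hand-written part the user has to provide for each domain class.
final class CustomerMarshaller implements SketchMarshaller<Customer> {
    public Class<Customer> javaClass() { return Customer.class; }

    public void writeTo(FieldWriter writer, Customer customer) {
        writer.writeInt("id", customer.id);
        writer.writeString("name", customer.name);
    }

    public Customer readFrom(FieldReader reader) {
        return new Customer(reader.readInt("id"), reader.readString("name"));
    }
}

// In-memory stand-in for the stream; it stores values in a map instead of encoding them.
final class MapBackedStream implements FieldWriter, FieldReader {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    public void writeInt(String fieldName, int value) { fields.put(fieldName, value); }
    public void writeString(String fieldName, String value) { fields.put(fieldName, value); }
    public int readInt(String fieldName) { return (Integer) fields.get(fieldName); }
    public String readString(String fieldName) { return (String) fields.get(fieldName); }
}
```

The domain class stays free of any protobuf dependency; only the marshaller knows how its fields map onto the message.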

    If we extend it by providing and automatically registering a generic MessageMarshaller for Google-generated classes (those that implement com.google.protobuf.Message or com.google.protobuf.MessageLite), our marshaller could accept such classes directly. Users who need to mix this marshalling approach with protoc-generated classes could then do so without any extra effort.

    The whole parsing process is pull-based. Because the ordering of fields in a protobuf stream is not predictable (by protobuf design) and cannot be enforced (doing so would break the protocol), we need all sorts of unbounded stream look-ahead tricks, and we must create intermediate objects to hold the skipped data. This is inherently less efficient than the code generated by Google's tools or by our CodeGen from module domain-with-interfaces. How much less efficient, we do not know yet. But since the messages we parse are already entirely in memory, the look-ahead might not have much impact. This needs to be measured.
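The look-ahead problem can be sketched as follows. This is a simplified model under stated assumptions, not the module's implementation: the stream is represented as an already decoded list of hypothetical `WireField` objects rather than raw bytes, and repeated fields are ignored.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// One decoded field as it appears on the wire (hypothetical helper type).
final class WireField {
    final int number;
    final Object value;

    WireField(int number, Object value) {
        this.number = number;
        this.value = value;
    }
}

// Pull-style reader: the caller asks for fields in its own order, so the reader
// may have to scan ahead and park every field it passes on the way.
final class LookAheadReader {
    private final Iterator<WireField> stream;
    private final Map<Integer, Object> skipped = new HashMap<>();

    LookAheadReader(List<WireField> fields) {
        this.stream = fields.iterator();
    }

    // Returns the value of the requested field, or null if the message lacks it.
    Object read(int fieldNumber) {
        Object buffered = skipped.remove(fieldNumber);
        if (buffered != null) {
            return buffered;                 // already seen during an earlier scan
        }
        while (stream.hasNext()) {
            WireField next = stream.next();
            if (next.number == fieldNumber) {
                return next.value;
            }
            skipped.put(next.number, next.value);   // intermediate object for skipped data
        }
        return null;
    }

    // Number of fields currently parked because they were read past.
    int bufferedCount() {
        return skipped.size();
    }
}
```

If the caller asks first for a field that happens to come last on the wire, every other field ends up in the buffer, which is the unbounded look-ahead described above.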

    So far this approach (plan A) is the most user-friendly, so we are going to stick with it. If its efficiency proves not to be good enough, we can try plan B, detailed below.

    This module also contains a sketch of an alternative approach, plan B (see MessageFieldAccessor), where parsing is push-based (event-driven) instead of pull-based. Think SAX vs DOM: this alternative is much like SAX, so it avoids the look-ahead inefficiency and the intermediate objects mentioned earlier. It can also rely on annotations instead of hand-written code, or the two could coexist. Going even further, plans A and B could coexist as well.
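A push-based parser can be sketched in a few lines. The example below is an illustration only: `FieldHandler`, `PushParser` and `AccountSink` are hypothetical names and do not reflect the shape of MessageFieldAccessor. It handles only varint and length-delimited fields, does not validate malformed input, and the field numbers are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical callback invoked once per field, in whatever order the wire delivers.
interface FieldHandler {
    void onVarint(int fieldNumber, long value);
    void onBytes(int fieldNumber, byte[] value);
}

// Single forward pass over the bytes; nothing is buffered or skipped.
final class PushParser {
    private final byte[] data;
    private int pos;

    private PushParser(byte[] data) {
        this.data = data;
    }

    static void parse(byte[] data, FieldHandler handler) {
        PushParser parser = new PushParser(data);
        while (parser.pos < data.length) {
            long tag = parser.readVarint();
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 0x7);
            if (wireType == 0) {                      // varint
                handler.onVarint(fieldNumber, parser.readVarint());
            } else if (wireType == 2) {               // length-delimited
                int length = (int) parser.readVarint();
                byte[] value = Arrays.copyOfRange(data, parser.pos, parser.pos + length);
                parser.pos += length;
                handler.onBytes(fieldNumber, value);
            } else {
                throw new IllegalArgumentException("wire type not handled in this sketch: " + wireType);
            }
        }
    }

    // Decodes a base-128 varint starting at the current position.
    private long readVarint() {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }
}

// Handler that fills a domain object as events arrive (assumed: 1 = id, 2 = description).
final class AccountSink implements FieldHandler {
    int id;
    String description = "";

    public void onVarint(int fieldNumber, long value) {
        if (fieldNumber == 1) id = (int) value;
    }

    public void onBytes(int fieldNumber, byte[] value) {
        if (fieldNumber == 2) description = new String(value, StandardCharsets.UTF_8);
    }
}
```

The handler receives each field as soon as it is decoded, so field order on the wire no longer matters and no skipped data has to be kept around.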

  6. Module roundtrip-test tests the binary compatibility of the various marshalling approaches demonstrated in the other modules. It is not complete yet.