A C++ serializer generator with serialization of cycles.
Helios-vmg/cppserialization
LAS[1] is a code generator that accepts a description of one or more data types and generates C++ code that automatically handles serialization and deserialization. The input syntax is reminiscent of C++ class definitions, so it should be intuitive to any C++ programmer.

Advantages:
* Simple maintainability.
* Reference cycles of unlimited depth.
* Space-efficiency.
* Designed to operate with non-seekable streams.

Limitations:
* No built-in version support.
* The graph traversal algorithm can only understand pointer graphs where all the pointers point to the proper start of an object. If any pointer points to the middle of an object, the behavior is undefined.
* Objects must always be deserialized in full. It's not possible to lazily deserialize an object.

Comparison with other serialization codebases

boost::serialization[2]

boost::serialization is a helper library, not a code generator. The burden of writing and maintaining the serialization code falls on the user of the library. If the data structure to be serialized changes, the programmer must change the serialization code accordingly. LAS, conversely, generates the serialization code automatically, outright eliminating a whole class of potential bugs.

Google Protobuf[3]

Protobuf generates the serialization code, like LAS. Protobuf is designed primarily for RPC and network protocols; its data model cannot represent object graphs of any kind, only arbitrary-length collections inside an object.

Figure 1:
Root = Object A
Object A -> [Object B, Object C]
Object B -> [Object D]
Object C -> [Object D]

For example, to serialize an object graph such as the one in figure 1 using Protobuf while preserving the object relationships, the programmer would first have to manually transform the in-memory graph into something like figure 2.
Figure 2: (JSON)
Root = {
    1: { "name": "Object A", "children": [2, 3] },
    2: { "name": "Object B", "children": [4] },
    3: { "name": "Object C", "children": [4] },
    4: { "name": "Object D", "children": [] }
}

That is, the root would become an associative array. The programmer would have to write the code that maps a memory address to a position in the array. LAS includes this code as part of the run-time library.

Another characteristic derived from its design as a protocol definition language is that Protobuf often requires in-memory objects to be temporarily converted into the types it generates prior to serialization, and then converted back after the message is deserialized. LAS, on the other hand, is designed to generate classes that may be used both during serialization and throughout the program.

Protobuf has built-in support for message versioning, allowing the design of protocols with forwards and backwards compatibility. LAS allows designing class hierarchies that support versioning with backwards compatibility, but the programmer must write the versioning support themselves. Forwards compatibility is not possible.

Protobuf is much more mature than LAS, and possibly faster, but also much larger and more complex.

Apache Avro[4]

As far as I can tell, Avro differs from LAS in the same ways Protobuf does.

Cap'n Proto[5]

Cap'n Proto was designed by a former developer of Protobuf. Like Protobuf, it is designed around RPC protocols, and therefore includes some things not directly related to serialization. Most of the things I've said about Protobuf can be said about Cap'n Proto.

Cap'n Proto defines not just the serialized format, but also the in-memory representation of objects. Serialization basically involves dumping the memory of the object to a stream. LAS must traverse objects and serialize each member individually.
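
To make the traversal and the address-to-position mapping from figure 2 more concrete, here is a minimal sketch. It is not the LAS run-time library's code: Node, Row and flatten are hypothetical names invented for this illustration. It walks the pointer graph with an explicit worklist instead of recursion and assigns each distinct address an id the first time it is seen.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical user type standing in for a serializable class (not part of LAS).
struct Node {
    std::string name;
    std::vector<Node *> children;
};

// One entry of the flattened table: the object's name and its children's ids.
struct Row {
    std::string name;
    std::vector<std::size_t> children;
};

// Assigns 1-based ids to every object reachable from root, in breadth-first
// discovery order, and returns the table; rows[i] describes the object with id i + 1.
std::vector<Row> flatten(const Node *root) {
    std::unordered_map<const Node *, std::size_t> ids; // address -> id
    std::vector<const Node *> order;                   // id - 1 -> address; doubles as the worklist
    if (root) {
        ids.emplace(root, 1);
        order.push_back(root);
    }
    // No recursion: 'order' grows while it is scanned by index, so the call
    // stack stays flat regardless of how deep or cyclic the graph is.
    for (std::size_t i = 0; i < order.size(); ++i) {
        for (const Node *child : order[i]->children) {
            // emplace() only succeeds the first time an address is seen.
            if (child && ids.emplace(child, order.size() + 1).second)
                order.push_back(child);
        }
    }
    assert(ids.size() == order.size());

    std::vector<Row> rows;
    rows.reserve(order.size());
    for (const Node *node : order) {
        Row row{node->name, {}};
        for (const Node *child : node->children)
            if (child)
                row.children.push_back(ids.at(child));
        rows.push_back(std::move(row));
    }
    return rows;
}
```

Applied to the graph in figure 1, this sketch yields the same ids as figure 2: Object D is reached through both Object B and Object C but receives the single id 4. A cycle (say, Object D pointing back to Object A) terminates for the same reason, because an address that already has an id is never queued again.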
Because the in-memory representation and the serialized representation are the same, Cap'n Proto is subject to some artificial restrictions to prevent vulnerabilities to certain kinds of attacks delivered through maliciously-crafted messages. For example, object graphs are limited to some arbitrary depth to prevent stack overflows. LAS uses a bounded number of stack frames both when serializing and when deserializing, so it's able to process any possible object graph.

[1] Less-Ambitious Serializer. I originally wanted to use libclang to parse C++ and generate the serialization code directly from that, but I scaled back to just processing an input text file.
[2] http://www.boost.org/doc/libs/1_60_0/libs/serialization/
[3] https://developers.google.com/protocol-buffers/
[4] https://avro.apache.org/
[5] https://capnproto.org/