Hpp-proto is a lightweight, high-performance Protocol Buffers implementation in C++20. It maps Protocol Buffers messages directly to simple C++ aggregates, using only C++ built-in or standard library types. Apart from UTF-8 validation, the serialization code for these mapped aggregates is entirely header-only, ensuring minimal dependencies and efficient performance.
Compared to Google’s implementation, hpp-proto adopts a minimalistic design that greatly reduces code size while offering superior performance in benchmarks where runtime reflection is unnecessary. Additionally, hpp-proto supports a non-owning mode, mapping all variable-length fields to lightweight types such as std::string_view or equality_comparable_span. This mode allows users to customize memory management during deserialization, further enhancing efficiency. These features make hpp-proto an excellent choice for performance-critical, real-time, or resource-constrained environments.
- Supports Protocol Buffers syntax 2 and 3 and editions.
- Supports the serialization of ProtoJSON format, utilizing a slightly modified version of the glaze library.
- Significantly smaller code size compared to Google's C++ implementation.
- Faster performance than Google's C++ implementation.
- Maps all Protocol Buffers message definitions to simple C++ aggregates using standard C++ library types.
- Aside from UTF-8 validation, all generated code and the core library are header-only.
- Each generated C++ aggregate is associated with static C++ reflection data for efficient Protocol Buffers encoding and decoding.
- All generated message types are equality-comparable, making them useful in unit testing.
- Completely exception-free.
- Supports non-owning mode code generation, mapping string and repeated fields to
std::string_view
andhpp::proto::equality_comparable_span
which derives fromstd::span
and adds the equality comparator. - Enables compile-time serialization.
- Lacks runtime reflection support.
- Lacks support for extra json print options like
always_print_fields_with_no_presence
,always_print_enums_as_ints
,preserve_proto_field_names
orunquote_int64_if_possible
. - Unknown fields are always discarded during deserialization.
Platform | Mac | Linux |
---|---|---|
OS | MacOS 14.7 | Ubuntu 22.04 |
CPU | M1 Pro/MK193LL/A | Intel Core i9-11950H @ 2.60GHz |
Compiler | Apple clang 16.0.0 | gcc 12.3.0 |
Google protobuf version 28.3
We measured the runtime performance using the dataset and the benchmarks.proto definition from Google Protocol Buffers version 3.6.0. The benchmarks focus on three core operations: deserialization, setting a message (set_message), and setting a message combined with serialization (set_message and serialize). The performance was evaluated on two implementations: Google Protocol Buffers and hpp-proto, with regular and arena/non-owning modes being tested for each operation.
Operations CPU time on Mac | ||||||
---|---|---|---|---|---|---|
deserialize | set_message | set_message and serialize | ||||
regular | arena/non_owning | regular | arena/non_owning | regular | arena/non_owning | |
google CPU time | 475.0 ns |
366.0 ns |
382.0 ns |
268.0 ns |
509.0 ns |
426.0 ns |
hpp_proto CPU time | 283.0 ns |
170.0 ns |
81.0 ns |
8.38 ns |
285.0 ns |
182.0 ns |
hpp_proto speedup factor | 1.67 |
2.15 |
4.71 |
31.98 |
1.78 |
2.34 |
Operations CPU time on Linux | ||||||
---|---|---|---|---|---|---|
deserialize | set_message | set_message and serialize | ||||
regular | arena/non_owning | regular | arena/non_owning | regular | arena/non_owning | |
google CPU time | 250.0 ns |
257.0 ns |
114.0 ns |
108.0 ns |
225.0 ns |
229.0 ns |
hpp_proto CPU time | 198.0 ns |
142.0 ns |
34.6 ns |
10.9 ns |
146.0 ns |
116.0 ns |
hpp_proto speedup factor | 1.26 |
1.81 |
3.29 |
9.91 |
1.54 |
1.97 |
The performance benchmarks clearly demonstrate the overall efficiency of the hpp_proto library compared to Google’s implementation across deserialization, setting a message, and setting a message with serialization operations. However, for the serialization operation alone, Google’s implementation may be faster than hpp-proto. This comes at the expense of the set_message operation, where hpp_proto significantly outperforms Google’s implementation.
It’s important to note that in real-world applications, setting a message is a prerequisite before serialization can take place. While Google’s implementation may offer faster serialization times, the combined time required for both setting a message and serializing shows that hpp_proto delivers better performance overall.
Benchmark code is available here
We compared the code sizes of three equivalent programs: hpp_proto_decode_encode, google_decode_encode and google_decode_encode_lite. These programs are responsible for decoding and encoding messages defined in benchmarks.proto, using the hpp-proto and Google Protocol Buffers implementations. The google_decode_encode program is statically linked with libprotobuf, while google_decode_encode_lite is linked with libprotobuf-lite.
Code size in bytes | ||
---|---|---|
Mac | Linux | |
google_decode_encode | 2624344 |
3410088 |
google_decode_encode_lite | 1106408 |
1474208 |
hpp_proto_decode_encoded | 121208 |
92520 |
The comparison highlights a significant reduction in code size when using hpp-proto compared to Google’s Protocol Buffers implementations. On macOS, hpp-proto offers a 21.65x reduction in size compared to google_decode_encode and a 9.13x reduction compared to google_decode_encode_lite. The reduction is even more pronounced on Linux, where hpp-proto reduces the code size by 36.86x compared to google_decode_encode and by 15.93x compared to google_decode_encode_lite.
This section provides a quick introduction to the basic usage of hpp-proto to help you get started with minimal setup. It covers the essential steps required to integrate hpp-proto into your project and begin working with Protocol Buffers. For more advanced usage scenarios, optimizations, and additional features, please refer to the detailed examples in the tutorial directory and code generation guide of the repository.
If you haven’t installed the protoc
compiler, download the package and follow the instructions in the README.
The hpp-proto library can be directly installed locally then use cmake find_package
to solve the dependency, or it can be used via cmake FetchContent
mechanism.
git clone https://github.com/huangminghuang/hpp-proto.git
cd hpp-proto
# use installed protoc by default or specify '-DHPP_PROTO_PROTOC=compile' to download google protobuf and compile protoc from source
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=$HOME/local -Bbuild -S .
cmake --build build --target install
// addressbook.proto
syntax = "proto3";
package tutorial;
message Person {
string name = 1;
int32 id = 2;
string email = 3;
enum PhoneType {
MOBILE = 0;
HOME = 1;
WORK = 2;
}
message PhoneNumber {
string number = 1;
PhoneType type = 2;
}
repeated PhoneNumber phones = 4;
}
message AddressBook {
repeated Person people = 1;
}
export PATH=$PATH:/path/to/protoc-gen-hpp
protoc -I=$SRC_DIR --hpp_out=$DST_DIR $SRC_DIR/addressbook.proto
This generates the following files in your specified destination directory:
addressbook.msg.hpp
, the header which declares your generated messages.addressbook.pb.hpp
, which contains the overloaded functions required for protobuf decoding/encoding.addressbook.glz.hpp
, which contains the template specializations for JSON decoding/encoding.addressbook.desc.hpp
, which contains the protobuf descriptors of the generated messages.
find_package
cmake_minimum_required(VERSION 3.24)
project(hpp_proto_tutorial
VERSION 1.0.0
LANGUAGES CXX)
find_package(hpp_proto CONFIG REQUIRED)
add_library(addressbook_lib INTERFACE addressbook.proto)
target_include_directories(addressbook_lib INTERFACE ${CMAKE_CURRENT_BINARY_DIR})
protobuf_generate_hpp(TARGET addressbook_lib
# uncomment the next line for non-owning mode
# PLUGIN_OPTIONS non_owning
)
add_executable(tutorial_proto addressbook.cpp)
target_link_libraries(tutorial_proto PRIVATE addressbook_lib)
FetchContent
cmake_minimum_required(VERSION 3.24)
project(hpp_proto_tutorial
VERSION 1.0.0
LANGUAGES CXX)
include(FetchContent)
FetchContent_Declare(
hpp_proto
GIT_REPOSITORY https://github.com/huangminghuang/hpp-proto.git
GIT_TAG main
GIT_SHALLOW TRUE
)
FetchContent_MakeAvailable(hpp_proto)
add_library(addressbook_lib INTERFACE addressbook.proto)
target_include_directories(addressbook_lib INTERFACE ${CMAKE_CURRENT_BINARY_DIR})
protobuf_generate_hpp(TARGET addressbook_lib
# uncomment the next line for non-owning mode
# PLUGIN_OPTIONS non_owning
)
add_executable(tutorial_proto addressbook.cpp)
target_link_libraries(tutorial_proto PRIVATE addressbook_lib)
The mapping from proto messages to their C++ message types is straight forward, as shown in the following generated code.
Notice that the *.msg.hpp
only contains the minimum message definitions and avoid the inclusion of headers related to the protobuf/JSON
encoding/decoding facilities. This makes those generated structures easier to be used as basic vocabulary types among modules without incurring unnecessary dependencies.
Below are the examples of the generated code for the addressbook.proto
file in regular and non-owning modes.
Regular Mode
// addressbook.msg.hpp
namespace tutorial {
using namespace hpp::proto::literals;
struct Person {
enum class PhoneType {
MOBILE = 0,
HOME = 1,
WORK = 2
};
struct PhoneNumber {
std::string number = {};
PhoneType type = PhoneType::MOBILE;
bool operator == (const PhoneNumber&) const = default;
};
std::string name = {};
int32_t id = {};
std::string email = {};
std::vector<PhoneNumber> phones;
bool operator == (const Person&) const = default;
};
struct AddressBook {
std::vector<Person> people;
bool operator == (const AddressBook&) const = default;
};
}
// addressbook.pb.hpp
#include "addressbook.msg.hpp"
namespace tutorial {
auto pb_meta(const Person &) -> std::tuple<...> ;
auto pb_meta(const Person::PhoneNumber &) -> std::tuple<...> ;
auto pb_meta(const AddressBook &) -> std::tuple<...>;
}
Non-owning Mode
// addressbook.msg.hpp
namespace tutorial {
using namespace hpp::proto::literals;
struct Person {
enum class PhoneType {
MOBILE = 0,
HOME = 1,
WORK = 2
};
struct PhoneNumber {
std::string_view number = {};
PhoneType type = PhoneType::MOBILE;
bool operator == (const PhoneNumber&) const = default;
};
std::string_view name = {};
int32_t id = {};
std::string_view email = {};
hpp::proto::equality_comparable_span<const PhoneNumber> phones;
bool operator == (const Person&) const = default;
};
struct AddressBook {
hpp::proto::equality_comparable_span<const Person> people;
bool operator == (const AddressBook&) const = default;
};
}
// addressbook.pb.hpp
#include "addressbook.msg.hpp"
namespace tutorial {
auto pb_meta(const Person &) -> std::tuple<...> ;
auto pb_meta(const Person::PhoneNumber &) -> std::tuple<...> ;
auto pb_meta(const AddressBook &) -> std::tuple<...>;
}
The hpp-proto library provides an efficient and convenient interface for encoding and decoding Protobuf messages in C++. The two core functions are:
write_proto()
: Serializes a generated C++ message object into the binary Protobuf format.read_proto()
: Deserializes a binary Protobuf-encoded buffer back into the corresponding C++ message object.
These APIs offer flexible usage with overloads for returning either a status or an std::expected object (containing the result or an error). Below are demonstrations of their usage in both regular and non-owning modes.
Regular mode APIs
#include <addressbook.pb.hpp> // Include "*.pb.hpp" for Protobuf APIs
// ....
tutorial::Person in_msg, out_msg;
msg1.name = "john";
std::string out_buffer;
using namespace hpp::proto;
// Serialize using the status return API
if (!write_proto(in_msg, out_buffer).ok()) {
// Handle error
}
assert(out_buffer == "\x0a\x04john");
// Serialize using the expected return API
expected<std::string, std::errc> write_result = write_proto<std::string>(in_msg);
assert(write_result.value() == "\x0a\x04john");
std::string_view in_buffer = "\x0a\x04john";
// Deserialize using the status return API
if (!read_proto(out_msg, in_buffer).ok()) {
// Handle error
}
assert(out_msg.name == "john");
// Deserialize using the expected return API
expected<Person, std::errc> read_result = read_proto<Person>(in_msg);
assert(read_result.value().name == "john");
Non-owning mode APIs
In non-owning mode, variable-length fields in messages are mapped to lightweight types such as std::string_view
or equality_comparable_span
. Instead of copying values, non-owning messages provide views to the original data, requiring careful lifetime management of referenced memory to avoid invalid access.
write_proto()
: No difference between regular and non-owning modes.read_proto()
: Requires an option object containing a memory resource for managing allocations.
The memory resource must have an allocate()
member function equivalent to std::pmr::memory_resource::allocate, ensuring it returns at least the requested amount of memory and never indicates an error by returning nullptr
. Errors should only be communicated through exceptions.
Additionally, all allocated memory must be properly released when the memory resource is destroyed.
For most use cases, std::pmr::monotonic_buffer_resource is recommended.
Two option objects allow you to specify memory resources for read_proto()
:
alloc_from
: Permits the deserialized value to reference memory within the input buffer.strictly_alloc_from
: Ensures that all memory used by the deserialized value is allocated from the provided memory resource.
Below is an example demonstrating the differences between these two options:
#include <addressbook.pb.hpp> // Include "*.pb.hpp" for Protobuf APIs
// ....
std::string_view in_buffer = "\x0a\x04john";
std::array<char, 16> arena;
std::pmr::monotonic_buffer_resource pool{arena.data(), arena.size()};
tutorial::Person out_msg;
using namespace hpp::proto;
// Deserialization using alloc_from
if (!read_proto(out_msg, in_buffer, alloc_from{pool}).ok()) {
// Handle error
}
assert(out_msg.name == "john");
// out_msg.name references in_buffer
assert(out_msg.name.data() == in_buffer.data() + 2);
// Deserialize using strictly_alloc_from
if(!read_proto(out_msg, in_buffer, strictly_alloc_from{pool}).ok()) {
// Handle error
}
assert(out_msg.name == "john");
// out_msg.name now references arena
assert(out_msg.name.data() == arena.data());
The hpp-proto library also supports encoding and decoding the C++ message objects to and from canonical JSON encoding using the modified (glaze)[https://github.com/stephenberry/glaze] library. This ensures compatibility with the canonical JSON encoding of Protobuf messages. The key functions are:
write_json()
: Serialize a C++ message object into a JSON string.read_json()
: Deserialize a JSON string back into the corresponding C++ message object.
Similar to Protobuf APIs, the JSON APIs provide overloads that return either a status or an expected object.
In addition, write_json()
can take an additional indent
object for pretty printing.
Below is a demonstration of how to use these functions for encoding and decoding in regular and non-owning modes.
Regular Mode
#include "addressbook.glz.hpp" // Include the "*.glz.hpp" for JSON APIs
// ....
std::string out_json;
tutorial::Person in_msg;
in_msg.name = "john";
using namespace hpp::proto;
// Serialize using the status return API
if (!write_json(in_msg, out_json).ok()) {
// Handle error
}
// Serialize using the expected return API
auto write_result = write_json(in_msg);
assert(write_result.value() == out_json);
// Pretty printing with 3 spaces indent
if (!write_json(in_msg, out_json, indent<3>{}).ok()) {
// Handle error
}
tutorial::Person out_msg;
// Deserialize using the status return API
if (!read_json(out_msg, json).ok()) {
// Handle error
}
// Deserialize using the expected return API
auto read_result = read_json<tutorial::Person>(json);
assert(read_result.value() == out_msg);
Non-owning Mode
#include "addressbook.glz.hpp" // Include the "*.glz.hpp" for JSON APIs
// ....
std::pmr::monotonic_buffer_resource pool;
std::string in_json = R"({"name":"john"})";
tutorial::Person out_person;
if (!read_json(out_person, in_json, alloc_from{pool}).ok()) {
// Handle error
}
// alternatively, use the overload returning an expected object
auto read_result = read_json<tutorial::Person>(in_json, alloc_from{pool});
assert(read_result.value() == out_person);