Easy serialization for POD types #104

Open
VelocityRa opened this issue Apr 14, 2023 · 12 comments

Comments

@VelocityRa
Contributor

VelocityRa commented Apr 14, 2023

If a struct is simply a POD type, i.e. in my case a large struct with tons of trivially serializable fields, there shouldn't be a need to write a serialize function that tediously lists all of them (which is also error-prone). I know of brief_syntax, but it doesn't really solve the problem.

Is this currently possible?


RPCS3's serialization system supports this very nicely (impl in this file), if you want an example.

Thanks for reading!

@fraillt
Owner

fraillt commented May 8, 2023

Currently it's not possible, and I'm not planning to add this as core functionality.
I'm not a fan of macros either, but it is possible to achieve this using structured bindings with a little bit of template magic (SFINAE, or more modern techniques like if constexpr or concepts). This requires at least C++17.
Maybe I'll play with this idea in the future and create an extension that is able to serialize/deserialize any POD type.

@VelocityRa
Contributor Author

VelocityRa commented May 17, 2023

@fraillt
Can you be a bit more specific?
I don't get it, why would you need structured bindings / template stuff? Why would it not just be a memcpy or whatever using the sizeof the serialized type (like RPCS3)? I mean, the benefit of POD types is that you can just copy the raw bytes.
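
For reference, the raw-copy idea described here can be sketched without bitsery at all. This is only a minimal illustration, assuming a trivially copyable type and a reader on the same platform; `Sample`, `to_bytes` and `from_bytes` are hypothetical names, not part of bitsery or RPCS3.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical example type; any trivially copyable struct would do.
struct Sample {
  std::uint16_t a;
  std::uint16_t b;
  std::int32_t c;
  std::int64_t d;
};

// Copy the object representation of `value` into a byte vector.
template <typename T>
std::vector<std::uint8_t> to_bytes(const T& value) {
  static_assert(std::is_trivially_copyable_v<T>,
                "raw copying is only valid for trivially copyable types");
  std::vector<std::uint8_t> out(sizeof(T));
  std::memcpy(out.data(), &value, sizeof(T));
  return out;
}

// Rebuild a T from bytes produced by to_bytes<T> on the same platform.
template <typename T>
T from_bytes(const std::vector<std::uint8_t>& bytes) {
  static_assert(std::is_trivially_copyable_v<T>,
                "raw copying is only valid for trivially copyable types");
  assert(bytes.size() == sizeof(T));
  T value{};
  std::memcpy(&value, bytes.data(), sizeof(T));
  return value;
}
```

Calling `from_bytes<Sample>(to_bytes(sample))` gives back an equal `Sample`. The bytes include any padding and use the host byte order, so they are only meaningful to a reader with the same layout.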

@fraillt
Owner

fraillt commented May 19, 2023

Initially I thought that you wanted to deconstruct your object into its fields and serialize each field separately.
But if I understand correctly, you don't care about memory layout and endianness here; you just want to memcpy the entire struct.
In that case it's quite trivial. I played with it a bit and came up with this code:

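#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <cassert>     // assert
#include <cstdint>     // uint8_t, uint16_t, int32_t, int64_t
#include <memory>      // std::addressof
#include <type_traits> // std::is_trivially_copyable
#include <vector>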

// for implementing extension
#include <bitsery/traits/core/traits.h>

namespace bitsery {

namespace ext {

class TrivialCopy {
public:

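  // the callable is unused in both methods; it is only part of the signature
  // because the lambda overload of `.ext` is used at the call site
  template<typename Ser, typename T, typename Fnc>
  void serialize(Ser &ser, const T &v, Fnc &&) const {
    // write the object representation of `v` as sizeof(T) raw bytes
    const void* addr = std::addressof(v);
    ser.adapter().template writeBuffer<1, uint8_t>(static_cast<const uint8_t*>(addr), sizeof(T));
  }

  template<typename Des, typename T, typename Fnc>
  void deserialize(Des &des, T &v, Fnc &&) const {
    // read sizeof(T) raw bytes back into the storage of `v`
    void* addr = std::addressof(v);
    des.adapter().template readBuffer<1, uint8_t>(static_cast<uint8_t*>(addr), sizeof(T));
  }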

};
}

namespace traits {
template<typename T>
struct ExtensionTraits<ext::TrivialCopy, T> {
  using TValue = T;

  static_assert(std::is_trivially_copyable<T>::value, "Your type must be trivially_copyable");

  static constexpr bool SupportValueOverload = false;
  static constexpr bool SupportObjectOverload = false;
  // use lambda overload with empty lambda, because object overload expects
  // to have `serialize` method for `T`
  static constexpr bool SupportLambdaOverload = true;
};
}

}

struct MyData {
  uint16_t a{};
  uint16_t b{};
  int32_t c{};
  int64_t d{};
};

using Buffer = std::vector<uint8_t>;
using OutputAdapter = bitsery::OutputBufferAdapter<Buffer>;
using InputAdapter = bitsery::InputBufferAdapter<Buffer>;

int
main()
{
  auto data = MyData {
    32,
    8795,
    -786435,
    5849614964464
  };
  MyData res {};
  Buffer buffer {};

  // serialize
  bitsery::Serializer<OutputAdapter> ser{ buffer };
  // we need to pass any lambda here, to use lambda overload for `.ext` method
  // it looks a bit hackish, but solves the problem :)
  ser.ext(data, bitsery::ext::TrivialCopy{}, []() {});
  ser.adapter().flush();

  // deserialize
  bitsery::Deserializer<InputAdapter> des{ buffer.begin(),
                                     ser.adapter().writtenBytesCount() };
  des.ext(res, bitsery::ext::TrivialCopy{}, []() {});

  // verify
  assert(data.a == res.a && data.b == res.b && data.c == res.c && data.d == res.d);
  return 0;
}
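
A raw copy like the one above writes the in-memory representation, so host byte order and padding end up in the buffer. For contrast, here is a small sketch of what a byte-order-independent encoding of a single 32-bit field looks like. `encode_le32` and `decode_le32` are hypothetical helpers for illustration only, not bitsery API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Encode a 32-bit value as four bytes, least significant byte first,
// regardless of the byte order of the machine running the code.
std::array<std::uint8_t, 4> encode_le32(std::uint32_t v) {
  return {static_cast<std::uint8_t>(v & 0xFFu),
          static_cast<std::uint8_t>((v >> 8) & 0xFFu),
          static_cast<std::uint8_t>((v >> 16) & 0xFFu),
          static_cast<std::uint8_t>((v >> 24) & 0xFFu)};
}

// Inverse of encode_le32.
std::uint32_t decode_le32(const std::array<std::uint8_t, 4>& b) {
  return std::uint32_t{b[0]} | (std::uint32_t{b[1]} << 8) |
         (std::uint32_t{b[2]} << 16) | (std::uint32_t{b[3]} << 24);
}
```

The trade-off is the one discussed in this thread: encoding per field is portable but has to know every field, while the raw copy needs no field list but ties the data to one platform's layout.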

@fraillt
Owner

fraillt commented May 19, 2023

This is what I initially thought of.
To run it you need C++20 (it's possible to do it with C++17, but that would require more boilerplate).
Modify CMakeLists.txt:

set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

And here's the actual code:

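#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <bitsery/brief_syntax.h>
#include <cassert>     // assert
#include <cstdint>     // uint8_t, uint16_t, int32_t
#include <type_traits> // std::is_aggregate_v
#include <vector>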

// this will be converted into any type, that we need
struct any
{
  template <typename T>
  operator T();
};

template <typename Ser, typename T>
void serialize_aggregate(Ser &ser, T& v) {
  static_assert(std::is_aggregate_v<T>, "only aggregate types can be supported");
  if constexpr ( requires { T { any{}, any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3, a4] = v;
    ser(a1, a2, a3, a4);
  } else if constexpr ( requires { T { any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3] = v;
    ser(a1, a2, a3);
  } else if constexpr ( requires { T { any{}, any{} }; } ) {
    auto && [a1, a2] = v;
    ser(a1, a2);
  } else if constexpr ( requires { T { any {}}; }) {
    auto && [a1] = v;
    ser(a1);
  } else {
    // T is asserted to be an aggregate at the top, so this assert always fails
    // whenever this branch is instantiated, i.e. when none of the cases above matched
    static_assert(!std::is_aggregate_v<T>, "only supports struct up to 4 fields");
  }
}

enum MyEnum {
  A,
  B,
  C
};

struct MyData {
  uint16_t a{};
  uint16_t b{};
  int32_t c{};
  MyEnum d{};
};

using Buffer = std::vector<uint8_t>;
using OutputAdapter = bitsery::OutputBufferAdapter<Buffer>;
using InputAdapter = bitsery::InputBufferAdapter<Buffer>;

int
main()
{
  auto data = MyData {
    32,
    8795,
    -786435,
    MyEnum::B,
  };
  MyData res {};
  Buffer buffer {};

  // serialize
  bitsery::Serializer<OutputAdapter> ser{ buffer };
  serialize_aggregate(ser, data);
  ser.adapter().flush();

  // deserialize
  bitsery::Deserializer<InputAdapter> des{ buffer.begin(),
                                     ser.adapter().writtenBytesCount() };
  serialize_aggregate(des, res);

  // verify
  assert(data.a == res.a && data.b == res.b && data.c == res.c && data.d == res.d);
  return 0;
}

@fraillt
Owner

fraillt commented May 19, 2023

... and a bit more complete solution :)

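#include <bitsery/bitsery.h>
#include <bitsery/adapter/buffer.h>
#include <bitsery/traits/vector.h>
#include <bitsery/brief_syntax.h>
#include <cassert>     // assert
#include <cstdint>     // uint8_t, uint16_t, int32_t
#include <type_traits> // std::is_aggregate_v
#include <utility>     // std::forward
#include <vector>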

// this will be used to stand in for any field of an aggregate in the `requires` clause
struct any
{
  template <typename T>
  operator T();
};

// forward declarations
template <typename Ser, typename T>
void serialize_aggregate(Ser &ser, T& v);

// to stop variadic expansion
template <typename Ser>
void serialize_impl(Ser& ser) {
}

template <typename Ser, typename T, typename ...Ts>
void serialize_impl(Ser& ser, T& v, Ts&& ...rest) {
  if constexpr ( std::is_aggregate_v<T>) {
    serialize_aggregate(ser, v);
  } else {
    ser(v);
  }
  serialize_impl(ser, std::forward<Ts>(rest)...);
}

template <typename Ser, typename T>
void serialize_aggregate(Ser &ser, T& v) {
  static_assert(std::is_aggregate_v<T>, "only aggregate types can be supported");
  if constexpr ( requires { T { any{}, any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3, a4] = v;
    serialize_impl(ser, a1, a2, a3, a4);
  } else if constexpr ( requires { T { any{}, any{}, any{} }; } ) {
    auto && [a1, a2, a3] = v;
    serialize_impl(ser, a1, a2, a3);
  } else if constexpr ( requires { T { any{}, any{} }; } ) {
    auto && [a1, a2] = v;
    serialize_impl(ser, a1, a2);
  } else if constexpr ( requires { T { any {}}; }) {
    auto && [a1] = v;
    serialize_impl(ser, a1);
  } else {
    // T is asserted to be an aggregate at the top, so this assert always fails
    // whenever this branch is instantiated, i.e. when none of the cases above matched
    static_assert(!std::is_aggregate_v<T>, "only supports struct up to 4 fields");
  }
}

struct Xxx {
  int x;
  int32_t z;
};

struct MyData {
  uint16_t a{};
  uint16_t b{};
  int32_t c[2];
  Xxx d{};
};

using Buffer = std::vector<uint8_t>;
using OutputAdapter = bitsery::OutputBufferAdapter<Buffer>;
using InputAdapter = bitsery::InputBufferAdapter<Buffer>;

int
main()
{
  auto data = MyData {
    32,
    8795,
      {-786435, 23423},
    Xxx { 3, 454 },
  };
  MyData res {};
  Buffer buffer {};

  // serialize
  bitsery::Serializer<OutputAdapter> ser{ buffer };
  serialize_aggregate(ser, data);
  ser.adapter().flush();

  // deserialize
  bitsery::Deserializer<InputAdapter> des{ buffer.begin(),
                                     ser.adapter().writtenBytesCount() };
  serialize_aggregate(des, res);

  // verify
  assert(data.a == res.a && data.b == res.b && data.c[1] == res.c[1] && data.d.x == res.d.x);
  return 0;
}

I hope that would help.

@eyalz800

eyalz800 commented May 19, 2023

I didn’t check, but I’m pretty sure it will not work if you remove one of the non-array members in MyData, due to brace elision support in aggregate initialization. Right now it works because 4 happens to be the max number of members, so the structured binding works.

@fraillt
Owner

fraillt commented May 19, 2023

Hi, nice to see you here :)
I just tested, and it seems to work if I remove any of the fields.
I think brace elision has nothing to do with it; the real magic is the conversion operator.

  template <typename T>
  operator T();

As I understand it, T { any{}, ...} means that we first create any{}, and it then gets converted to whatever type is needed for a specific field of T.
However, I agree that this is not a complete solution and is rather limited, but I hope it might still be useful for @VelocityRa :)
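
The conversion described above can be checked in isolation. The sketch below repeats the `any` helper and uses `std::is_convertible_v` to show which target types it converts to. It is an illustration only and does not depend on bitsery.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Converts to whatever type the context asks for. The operator is declared
// but never defined, which is fine as long as it is only used in
// unevaluated contexts such as type traits, decltype or requires.
struct any {
  template <typename T>
  operator T();
};

enum MyEnum { A, B, C };

// Scalars and enums: the conversion operator template deduces T directly.
static_assert(std::is_convertible_v<any, std::uint16_t>);
static_assert(std::is_convertible_v<any, MyEnum>);

// Arrays: a conversion function cannot return an array type, so `any`
// cannot initialize an array member as a whole.
static_assert(!std::is_convertible_v<any, std::int32_t[2]>);
```

The last assertion matters for structs with array members: since a single `any{}` cannot become the whole array, aggregate initialization treats the array's elements as separate initializers.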

@VelocityRa
Contributor Author

Whoa, thanks so much for taking the time to write this!
I'll see if I can use it. MIT licensed, right?

@fraillt
Owner

fraillt commented May 19, 2023

Have fun ;)
I was having fun too, and I should mention that @eyalz800 was the one who showed me that this is possible with modern C++, so thanks to him as well :)

@eyalz800

eyalz800 commented May 19, 2023

@fraillt how does it work for you with the following:

struct MyData {
  uint16_t a{};
  int32_t c[2];
  Xxx d{};
};

In the case above I expect T{any{},any{},any{},any{}} to work and structured binding to fail with 4 members (because there are only 3). Because of the brace elision the middle two “any”s go to the int array.
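
The brace elision effect can be reproduced in a few lines. The sketch below uses a C++17 `std::void_t` detector instead of a `requires` expression so that it builds without C++20. `accepts_n`, `accepts_impl`, `any_at`, `Plain` and `WithArray` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>

// Same helper as in the thread: converts to any requested type.
struct any {
  template <typename T>
  operator T();
};

// Maps every index to `any`, so a pack of N indices expands to N initializers.
template <std::size_t>
using any_at = any;

// Primary template: T{any{}, ...} with the given number of initializers is ill-formed.
template <typename T, typename Seq, typename = void>
struct accepts_impl : std::false_type {};

// Selected when T{any{}, ...} with sizeof...(I) initializers is well-formed.
template <typename T, std::size_t... I>
struct accepts_impl<T, std::index_sequence<I...>,
                    std::void_t<decltype(T{any_at<I>{}...})>> : std::true_type {};

template <typename T, std::size_t N>
constexpr bool accepts_n = accepts_impl<T, std::make_index_sequence<N>>::value;

struct Plain { std::uint16_t a; std::int32_t b; };          // 2 members
struct WithArray { std::uint16_t a; std::int32_t c[2]; };   // 2 members

// Without arrays, the largest accepted count equals the member count.
static_assert(accepts_n<Plain, 2> && !accepts_n<Plain, 3>);

// With an array member, each array element takes its own initializer, so
// 3 initializers are accepted even though there are only 2 members.
static_assert(accepts_n<WithArray, 3> && !accepts_n<WithArray, 4>);
```

A structured binding with 3 names on `WithArray` would then fail to compile, because the binding count must match the 2 members. That is the mismatch described above.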

@fraillt
Owner

fraillt commented May 20, 2023

Yep, you're right.
I tried various ways and came up with this idea:

if constexpr ( requires { [](T& v){ auto & [a1, a2, a3, a4] = v;}; } ) {
...
}

It works on Clang, but not on GCC, although I guess it should work...
Why does C++ need to be so complex... :)

@eyalz800

eyalz800 commented May 21, 2023

What you did with the lambda is exactly what is in my library's README (github.com/eyalz800/zpp_bits). It's an area where I think the standard is not fully clear, but the intent leans towards making this a hard error: the error does not happen in the immediate context, so it doesn't count for SFINAE. GCC is probably correct.
