No Heap Array support? #28

Open
Swoboderz opened this issue Oct 31, 2019 · 22 comments

@Swoboderz

Swoboderz commented Oct 31, 2019

You have text1b(), which would be the closest thing to char array support, but generic heap arrays aren't supported at all. There isn't even an option like:

s.ext(ObjectPtr, PointerOwner{PointerType::NotNull}, ObjectCount);
@Swoboderz Swoboderz reopened this Oct 31, 2019
@Swoboderz
Author

Do correct me if I'm wrong, but this seems like something that really should be supported.

@fraillt
Owner

fraillt commented Nov 6, 2019

You're right.
Currently, pointers as arrays are not supported :)
I'm planning to implement them in the future but can't give an exact date yet. Pull requests are welcome :)

@VelocityRa
Contributor

VelocityRa commented Apr 20, 2020

Any progress on this?

I have a void* of a specific size (known at runtime).
It's already allocated before (de)serializing; I only need the data memcpy'd in and out of there.
How can this be done?

I tried this as a quick and dirty way:

        if (_ptr) {
            for (auto i = 0 ; i < _size; ++i)
                s.value1b(*((U8*)_ptr + i));
        }

but my data is many GBs so this is (understandably) way too slow.

Edit: The cereal equivalent would be cereal::binary_data.

@fraillt
Owner

fraillt commented Apr 21, 2020

Hi,
You can do it in two ways:

  1. Write an extension.
  2. Write a wrapper type for ptr and size and implement ContainerTraits for it.

I would prefer to write an extension: it allows more customization and takes less code.
Here is an example:

namespace bitsery {

    namespace ext {

        template <typename TSize>
        class BufferData {
        public:

            explicit BufferData(TSize& size):_size{size} {}

            template<typename Ser, typename T, typename Fnc>
            void serialize(Ser &ser, const T &obj, Fnc &&fnc) const {
                auto& writer = ser.adapter();
                // Store the byte count as a fixed 4-byte prefix, then the raw bytes.
                writer.template writeBytes<4>(static_cast<uint32_t>(_size));
                writer.template writeBuffer<1>(static_cast<const uint8_t*>(obj), _size);
            }

            template<typename Des, typename T, typename Fnc>
            void deserialize(Des &des, T &obj, Fnc &&fnc) const {
                auto& reader = des.adapter();
                uint32_t s=0u;
                // Read the byte count back and update the caller's size through the reference.
                reader.template readBytes<4>(s);
                _size = static_cast<TSize>(s);
                // obj must already point to at least _size bytes of writable memory.
                reader.template readBuffer<1>(static_cast<uint8_t*>(obj), _size);
            }

        private:
            TSize& _size;
        };
    }

    namespace traits {
        template<typename TSize, typename T>
        struct ExtensionTraits<ext::BufferData<TSize>, T> {
            using TValue = void;
            static constexpr bool SupportValueOverload = false;
            static constexpr bool SupportObjectOverload = true;
            static constexpr bool SupportLambdaOverload = false;
        };
    }

}

struct MyStruct {
    void* ptr;
    int ptr_size;
};

template <typename S>
void serialize(S& s, MyStruct& o) {
    s.ext(o.ptr, bitsery::ext::BufferData<int>(o.ptr_size));
}
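
What the extension above puts on the wire can be illustrated without bitsery: a 4-byte length followed by the raw bytes, written and read in bulk. The sketch below is a minimal stand-alone version, not library code. `writeBlob` and `readBlob` are hypothetical names, the length is stored in host byte order for brevity, and the capacity check on the read side is an addition that the example above does not perform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not bitsery API): appends a 4-byte length prefix
// followed by the raw bytes, similar in spirit to writeBytes<4> + writeBuffer<1>.
inline void writeBlob(std::vector<std::uint8_t>& out, const void* data, std::uint32_t size) {
    assert(data != nullptr || size == 0);
    const std::size_t start = out.size();
    out.resize(start + sizeof(size) + size);
    std::memcpy(out.data() + start, &size, sizeof(size));  // length prefix, host byte order
    if (size != 0)
        std::memcpy(out.data() + start + sizeof(size), data, size);  // payload in one bulk copy
}

// Reads a blob written by writeBlob into caller-owned memory of `capacity` bytes.
// Returns the number of payload bytes copied and advances `pos` past the blob.
inline std::uint32_t readBlob(const std::vector<std::uint8_t>& in, std::size_t& pos,
                              void* dst, std::uint32_t capacity) {
    assert(pos <= in.size() && in.size() - pos >= sizeof(std::uint32_t));
    std::uint32_t size = 0;
    std::memcpy(&size, in.data() + pos, sizeof(size));
    pos += sizeof(size);
    // The stored size comes from the stream, so check it before copying.
    assert(size <= capacity && in.size() - pos >= size);
    if (size != 0)
        std::memcpy(dst, in.data() + pos, size);
    pos += size;
    return size;
}
```

Calling `writeBlob(buf, ptr, n)` and later `readBlob(buf, pos, ptr, capacity)` with `pos` starting at 0 round-trips the bytes into memory that was allocated beforehand, which is the situation described earlier in the thread.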

@VelocityRa
Contributor

@fraillt That's very helpful, thanks!

@VelocityRa
Contributor

VelocityRa commented Apr 21, 2020

For our use case we don't even need to store the size. It doesn't change per-object.
The code looks pretty simple to modify so I'll do that myself :)

Do you plan to merge this, btw?

@fraillt
Owner

fraillt commented Apr 22, 2020

No, there are no plans to merge it, simply because it's easy to implement, and there might be too many edge cases that don't fit everyone anyway. Like in your case, you don't need the size ;) But I hope that this example will be useful for others as well :)

@VelocityRa
Contributor

Ok, though it's easy for you, you made the library :) For anyone else it'd take longer, for something that should be a basic feature (imo).

You can simply base it off of cereal::binary_data and it'll be fine.
Also, I don't really see any other use cases besides size + no size.
The "no size" alteration is optional and a lot of other people likely won't need it, so you can even omit it (or integrate it with a bool template parameter or something).
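
The bool template parameter idea could look roughly like the following stand-alone sketch. It is only an illustration of the suggestion, not a proposed bitsery API: `writeRaw`, `readRaw` and `StoreSize` are hypothetical names, and the byte vector stands in for an adapter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// StoreSize selects whether a 4-byte length prefix precedes the payload.
// A plain `if` on the compile-time constant keeps this valid before C++17.
template <bool StoreSize>
void writeRaw(std::vector<std::uint8_t>& out, const void* data, std::uint32_t size) {
    assert(data != nullptr || size == 0);
    if (StoreSize) {
        const std::size_t at = out.size();
        out.resize(at + sizeof(size));
        std::memcpy(out.data() + at, &size, sizeof(size));
    }
    const std::size_t at = out.size();
    out.resize(at + size);
    if (size != 0)
        std::memcpy(out.data() + at, data, size);
}

// The reader always knows the expected size; with StoreSize it also
// verifies that the stream agrees.
template <bool StoreSize>
void readRaw(const std::vector<std::uint8_t>& in, std::size_t& pos,
             void* dst, std::uint32_t size) {
    if (StoreSize) {
        std::uint32_t stored = 0;
        assert(pos <= in.size() && in.size() - pos >= sizeof(stored));
        std::memcpy(&stored, in.data() + pos, sizeof(stored));
        pos += sizeof(stored);
        assert(stored == size);
    }
    assert(pos <= in.size() && in.size() - pos >= size);
    if (size != 0)
        std::memcpy(dst, in.data() + pos, size);
    pos += size;
}
```

With `writeRaw<false>` the output is exactly the payload, which matches the case where the size never changes per object.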

@fraillt
Owner

fraillt commented Apr 23, 2020

I understand your view, and your points are valid.
I try to cover most standard types, so that at least std works seamlessly. For other things, I included what seemed necessary to me ;) along with things that are not trivial to write (e.g. pointer support) or that required modifying bitsery itself (growable). Basically, the idea is to provide the necessary tools so you can extend further.
There are multiple things that might be useful: binary_data is one of them, versioning is another, and there are probably many more. What I want to do instead is maybe provide these as examples. I started some work here; maybe I should write something similar for binary_data as well, but for the moment I have no plans to include them in the library.

@VelocityRa
Contributor

Ok :) Thanks for your work.

@Getshi

Getshi commented Oct 2, 2020

Hi, I have a similar issue. I would like to serialize Eigen matrices. My use case will include std::vectors of hundreds of matrices with thousands of rows.

Given the examples above I only managed to get something working like

template <typename Ser, typename T, typename Fnc>
void serialize(Ser &ser, const T &matrix, Fnc &&fnc) const
{
  uint32_t rows = matrix.rows();
  uint32_t cols = matrix.cols();
  uint32_t elems = rows * cols;

  auto &writer = ser.adapter();
  writer.template writeBytes<4>(static_cast<uint32_t>(rows));
  writer.template writeBytes<4>(static_cast<uint32_t>(cols));
  for (uint32_t i = 0; i < elems; ++i)
    ser.value4b(matrix.data()[i]);
}

I had to resort to the (according to comments above) slow version of writing separate values, because writeBuffer requires integer input. I tried throwing away platform independence by using a reinterpret_cast, but that didn't work (I assume because it might require a different provided buffer size).
Anyway, do you have suggestions for how to solve this nicely? FYI, the matrix.data() method exposes an underlying contiguous float* array. Maybe there is something different that I could do with container?

@Getshi

Getshi commented Oct 3, 2020

Actually I think I got it to work with writeBuffer:

template <typename Ser, 
typename Scalar, int _Rows, int _Cols, int _Options, int _MaxRows, int _MaxCols, 
typename Fnc>
void serialize(Ser &ser, const ::Eigen::Matrix<Scalar, _Rows, _Cols, _Options, _MaxRows, _MaxCols> &matrix, Fnc &&fnc) const
{
  uint32_t rows = matrix.rows();
  uint32_t cols = matrix.cols();
  uint32_t elems = rows * cols;

  auto &writer = ser.adapter();
  writer.template writeBytes<4>(static_cast<uint32_t>(rows));
  writer.template writeBytes<4>(static_cast<uint32_t>(cols));

  static_assert(details::IsFundamentalType<Scalar>::value, "Value must be integral, float or enum type.");
  using TValue = typename details::IntegralFromFundamental<Scalar>::TValue;
  writer.template writeBuffer<sizeof(TValue)>(reinterpret_cast<const TValue *>(matrix.data()), elems);
}

This might be a stupid question and the wrong place to ask, but I see that writeBuffer then basically does a for loop to write each value - is this efficient / is there a way to code it such that it attempts to copy the entire buffer at once?

@fraillt
Owner

fraillt commented Oct 3, 2020

Hello, I'm glad you made it work. I assume you don't need to deserialize this.
Regarding writeBuffer, it will only iterate through every value if your platform's endianness doesn't match the bitsery config (default is little endian). Otherwise it will call std::copy_n, which in turn calls std::copy, which in most known compilers calls memmove.
So unless you are using exotic hardware or compiler, it should be very efficient.
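
The two paths described above can be pictured with a small stand-alone sketch. This is not bitsery's implementation, only an illustration of the idea: `hostIsLittleEndian` and `writeU32Buffer` are hypothetical names, and the target format is assumed to be little endian, like the default config mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Detects host byte order at run time (std::endian would need C++20).
inline bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    std::uint8_t first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Appends `count` 32-bit values to `out` in little-endian order.
inline void writeU32Buffer(std::vector<std::uint8_t>& out,
                           const std::uint32_t* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    const std::size_t at = out.size();
    out.resize(at + count * sizeof(std::uint32_t));
    if (count == 0)
        return;
    if (hostIsLittleEndian()) {
        // Fast path: byte order already matches, so one bulk copy suffices.
        std::memcpy(out.data() + at, values, count * sizeof(std::uint32_t));
    } else {
        // Slow path: emit each value byte by byte, least significant byte first.
        for (std::size_t i = 0; i < count; ++i)
            for (std::size_t b = 0; b < 4; ++b)
                out[at + i * 4 + b] = static_cast<std::uint8_t>(values[i] >> (8 * b));
    }
}
```

Either path produces the same bytes; only the cost differs, which is why the per-value loop is rarely taken on common hardware.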

@Getshi

Getshi commented Oct 4, 2020

Ok thanks for the clarification! I will be deserializing it too, I just didn't post the code since it's trivially the same, basically just replacing writeBytes and writeBuffer for readBytes and readBuffer. Out of curiosity, is the version I posted above platform independent? I just tried replicating what I saw in the bitsery code.

@fraillt
Owner

fraillt commented Oct 5, 2020

If the matrix element type is float, double, or one of the fixed-width types (int32_t, int64_t, etc.), then it's OK.
BTW, the preferred way to do serialization and deserialization is to use extensions. That lets you keep a single serialize function, which is less error-prone.
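
The condition on element types can be made explicit at compile time. The sketch below is a hypothetical stand-in for the idea behind IntegralFromFundamental, not the library's code: `UnsignedOfSize` and `toBits` are made-up names, and the IEEE 754 check is an assumption about what "OK" requires for float and double.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <type_traits>

// Maps a byte count to the unsigned integer type of that size.
template <std::size_t N> struct UnsignedOfSize;
template <> struct UnsignedOfSize<1> { using type = std::uint8_t; };
template <> struct UnsignedOfSize<2> { using type = std::uint16_t; };
template <> struct UnsignedOfSize<4> { using type = std::uint32_t; };
template <> struct UnsignedOfSize<8> { using type = std::uint64_t; };

// Returns the object representation of a scalar as a same-sized unsigned integer.
template <typename T>
typename UnsignedOfSize<sizeof(T)>::type toBits(T value) {
    static_assert(std::is_arithmetic<T>::value, "scalar types only");
    static_assert(!std::is_floating_point<T>::value || std::numeric_limits<T>::is_iec559,
                  "floating-point layout must be IEEE 754 to be portable");
    typename UnsignedOfSize<sizeof(T)>::type bits;
    std::memcpy(&bits, &value, sizeof(T));  // copies the bytes without pointer casts
    return bits;
}
```

Types such as `long`, whose size differs between platforms, would map to different integer widths on different machines, which is one way to see why fixed-width types are the safer choice.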

@Getshi

Getshi commented Oct 5, 2020

Thanks, that's what I did. I'm posting the full code here in case it might help someone.

Full code (serializing dense Eigen matrices):

#ifndef BITSERY_EXT_EIGEN_H
#define BITSERY_EXT_EIGEN_H

#include "../traits/core/traits.h"
#include "../details/adapter_common.h"
#include "../details/serialization_common.h"
#include <Eigen/Dense>
#include <Eigen/Core>

namespace bitsery
{
  namespace ext
  {
    namespace Eigen
    {
      class Matrix
      {
      public:
        template <typename Ser,
                  typename Scalar, int _Rows, int _Cols, int _Options, int _MaxRows, int _MaxCols,
                  typename Fnc>
        void serialize(Ser &ser, const ::Eigen::Matrix<Scalar, _Rows, _Cols, _Options, _MaxRows, _MaxCols> &matrix, Fnc &&fnc) const
        {
          uint32_t rows = matrix.rows();
          uint32_t cols = matrix.cols();
          uint32_t elems = rows * cols;

          auto &writer = ser.adapter();
          writer.template writeBytes<4>(static_cast<uint32_t>(rows));
          writer.template writeBytes<4>(static_cast<uint32_t>(cols));

          static_assert(details::IsFundamentalType<Scalar>::value, "Value must be integral, float or enum type.");
          using TValue = typename details::IntegralFromFundamental<Scalar>::TValue;
          writer.template writeBuffer<sizeof(TValue)>(reinterpret_cast<const TValue *>(matrix.data()), elems);
        }

        template <typename Des,
                  typename Scalar, int _Rows, int _Cols, int _Options, int _MaxRows, int _MaxCols,
                  typename Fnc>
        void deserialize(Des &des, ::Eigen::Matrix<Scalar, _Rows, _Cols, _Options, _MaxRows, _MaxCols> &matrix, Fnc &&fnc) const
        {
          auto &reader = des.adapter();
          uint32_t rows = 0u, cols = 0u;
          reader.template readBytes<4>(rows);
          reader.template readBytes<4>(cols);
          uint32_t elems = rows * cols;

          matrix.resize(rows, cols);
          static_assert(details::IsFundamentalType<Scalar>::value, "Value must be integral, float or enum type.");
          using TValue = typename details::IntegralFromFundamental<Scalar>::TValue;
          reader.template readBuffer<sizeof(TValue)>(reinterpret_cast<TValue *>(matrix.data()), elems);
        }

      private:
      };

    } // namespace Eigen
  }   // namespace ext

  namespace traits
  {
    template <typename T>
    struct ExtensionTraits<ext::Eigen::Matrix, T>
    {
      using TValue = void;
      static constexpr bool SupportValueOverload = false;
      static constexpr bool SupportObjectOverload = true;
      static constexpr bool SupportLambdaOverload = false;
    };
  } // namespace traits

} // namespace bitsery

#endif //BITSERY_EXT_EIGEN_H

Usage: e.g. ser.ext(matrix, ext::Eigen::Matrix{});

@fraillt
Owner

fraillt commented Oct 6, 2020

Thanks for sharing!

@ChillstepCoder

ChillstepCoder commented Feb 1, 2024

In case this is useful to anyone else, I was running into performance issues with vectors of structs where the structs were just PODs. These could really just be readBuffer/writeBuffer when they don't need to swap byte order, and then fall back to their serialize methods if they do. I made an extension for it. I haven't tested the validity of the data yet, but I can confirm it's the same size!

using BBuffer = std::vector<uint8_t>;
using BOutputAdapter = bitsery::OutputBufferAdapter<BBuffer>;
using BInputAdapter = bitsery::InputBufferAdapter<BBuffer>;

namespace bitsery
{
    namespace ext
    {
        // Extension for writing a vector of POD structs that have trivial serialize methods.
        // This greatly improves performance when we do not need to swap bytes, but will fall back
        // to the serialize method of the POD struct if we do need to swap.
        class PodStructVector
        {
        public:
            template <typename Ser,
                typename T,
                typename Fnc>
            void serialize(Ser& s, const std::vector<T>& vec, Fnc&&) const
            {
                auto& writer = s.adapter();
                writer.template writeBytes<4>(static_cast<uint32_t>(vec.size()));
                if constexpr (bitsery::details::ShouldSwap<typename BOutputAdapter::TConfig>{}) {
                    s.container(vec, vec.size());
                }
                else {
                    // Treat as an array of bytes
                    writer.template writeBuffer<1>(reinterpret_cast<const uint8_t*>(vec.data()), vec.size() * sizeof(T));
                }
            }

            template <typename Des,
                typename T,
                typename Fnc>
            void deserialize(Des& s, std::vector<T>& vec, Fnc&&) const
            {
                auto& reader = s.adapter();
                uint32_t count = 0u;
                reader.template readBytes<4>(count);
                vec.resize(count);

                if constexpr (bitsery::details::ShouldSwap<typename BInputAdapter::TConfig>{}) {
                    s.container(vec, count);
                }
                else {
                    // Treat as an array of bytes
                    reader.template readBuffer<1>(reinterpret_cast<uint8_t*>(vec.data()), count * sizeof(T));
                }
            }
        };
    } // namespace ext

    namespace traits
    {
        template <typename T>
        struct ExtensionTraits<ext::PodStructVector, T>
        {
            using TValue = void;
            static constexpr bool SupportValueOverload = false;
            static constexpr bool SupportObjectOverload = true;
            static constexpr bool SupportLambdaOverload = false;
        };
    } // namespace traits
} // namespace bitsery
Usage:

s.ext(myVectorOfStructs, bitsery::ext::PodStructVector{});

This concept should be able to be extended to heap arrays as well.
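
As a rough illustration of that extension to heap arrays, here is a stand-alone sketch that does not use bitsery. `Particle`, `writePodArray` and `readPodArray` are hypothetical names, the count is stored in host byte order, and no byte swapping is attempted, so it only covers the fast path discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <type_traits>
#include <vector>

struct Particle { float x, y; std::uint32_t id; };  // hypothetical POD example

// Writes a 4-byte element count followed by the elements as raw bytes.
template <typename T>
void writePodArray(std::vector<std::uint8_t>& out, const T* items, std::uint32_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise copy needs a trivially copyable T");
    assert(items != nullptr || count == 0);
    const std::size_t bytes = static_cast<std::size_t>(count) * sizeof(T);
    const std::size_t at = out.size();
    out.resize(at + sizeof(count) + bytes);
    std::memcpy(out.data() + at, &count, sizeof(count));
    if (bytes != 0)
        std::memcpy(out.data() + at + sizeof(count), items, bytes);
}

// Reads the count, allocates a heap array of that size, and fills it in one copy.
template <typename T>
std::unique_ptr<T[]> readPodArray(const std::vector<std::uint8_t>& in,
                                  std::size_t& pos, std::uint32_t& count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise copy needs a trivially copyable T");
    assert(pos <= in.size() && in.size() - pos >= sizeof(count));
    std::memcpy(&count, in.data() + pos, sizeof(count));
    pos += sizeof(count);
    const std::size_t bytes = static_cast<std::size_t>(count) * sizeof(T);
    assert(in.size() - pos >= bytes);  // reject counts larger than the remaining data
    std::unique_ptr<T[]> items(new T[count]);
    if (bytes != 0)
        std::memcpy(items.get(), in.data() + pos, bytes);
    pos += bytes;
    return items;
}
```

The `static_assert` is the important part: copying structs as bytes is only well-defined for trivially copyable types, and any padding bytes inside the struct end up in the output as well.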

@gidigal

gidigal commented Mar 14, 2024

Hello,

I tried to use class BufferData (BitseryBufferData.h) in my code:

#include <fstream>
#include <map>
#include <string>
#include <bitsery/bitsery.h>
#include <bitsery/adapter/stream.h>
#include <bitsery/ext/std_map.h>
#include <bitsery/ext/pointer.h>
#include "BitseryBufferData.h"

struct Data {
  int x;
  Data() { x = 0; }
  Data(int val) : x(val) {}

  friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {   
    s.value4b(x);
  }
};

class Test {
  Data _data[2][10];
public:
  
  friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {   
    int bufferSize = (int)(20*sizeof(Data));
    s.ext((uint8_t*)_data, bitsery::ext::BufferData<int>(bufferSize));
  }

};

class TestCache {
private:
	std::map<std::string, Test*> data;
public:
  TestCache() {  
    Test* t = new Test;
    data["test"] = t;			
  }
  
   friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {
    s.ext(data, bitsery::ext::StdMap{100}, [](S &s, std::string &key, Test *(&d)) {
	s.text1b(key, UINT16_MAX);
	s.ext(d, bitsery::ext::PointerObserver{bitsery::ext::PointerType::NotNull});
      });
  }
};



int main() {
  using Context = std::tuple<int, std::pair<uint32_t, uint32_t>>;
  using OutputBufferSerializer = bitsery::Serializer<bitsery::OutputBufferedStreamAdapter, Context>;

  Context ctx{};
  
  TestCache testCache;
  // open file stream for writing and reading
  auto fileName = "test_file.bin";
  std::fstream s{ fileName, s.binary | s.trunc | s.out };

  OutputBufferSerializer ser{ ctx, s };
  ser.object(testCache);

  ser.adapter().flush();
  s.close();
  return 0;
}

I received error:
bitsery-master/include/bitsery/details/serialization_common.h:445:5: error: static assertion failed: Invalid context cast. Context type doesn't exists.
Some functionality requires (de)seserializer to have specific context.
445 | !AssertExists,
| ^~~~~~~~~~~~~

@fraillt
Owner

fraillt commented Mar 14, 2024

Hello,
I see that you use the PointerObserver extension. All pointer extensions need a PointerLinkingContext (see the raw_pointer.cpp example).
A context is a way for some complicated extensions to have extra state. In this case, the pointer linking context keeps track of all pointers during serialization/deserialization and makes sure that everything works :)

@gidigal

gidigal commented Mar 14, 2024

Thanks @fraillt, I changed my context to PointerLinkingContext. I also tried to change my call to text because of an error I got, following tests/serialization_text.cpp.

My current code:

#include <fstream>
#include <map>
#include <string>
#include <bitsery/bitsery.h>
#include <bitsery/adapter/stream.h>
#include <bitsery/ext/std_map.h>
#include <bitsery/ext/pointer.h>
#include "BitseryBufferData.h"

struct Data {
  int x;
  Data() { x = 0; }
  Data(int val) : x(val) {}

  friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {   
    s.value4b(x);
  }

};

class Test {
  Data _data[2][10];
public:
  
  friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {   
    int bufferSize = (int)(20*sizeof(Data));
    s.ext((uint8_t*)_data, bitsery::ext::BufferData<int>(bufferSize));
  }

};

class TestCache {
private:
	std::map<std::string, Test*> data;
public:
  TestCache() {  
    Test* t = new Test;
    data["test"] = t;			
  }
  
  friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {   
    s.ext(data, bitsery::ext::StdMap{100}, [](S &s, std::string &key, Test *(&d)) {
	s.text<sizeof(std::string::value_type)>(key, 1000);
	s.ext(d, bitsery::ext::PointerObserver{bitsery::ext::PointerType::NotNull});
      });
  }
};



int main() {
  using OutputBufferSerializer = bitsery::Serializer<bitsery::OutputBufferedStreamAdapter, bitsery::ext::PointerLinkingContext>;

  bitsery::ext::PointerLinkingContext ctx{};
  
  TestCache testCache;
  // open file stream for writing and reading
  auto fileName = "test_file.bin";
  std::fstream s{ fileName, s.binary | s.trunc | s.out };

  OutputBufferSerializer ser{ ctx, s };
  ser.object(testCache);

  ser.adapter().flush();
  s.close();
  return 0;
}

Errors I receive:
context_usage.cpp:55:8: error: invalid operands of types '' and 'long unsigned int' to binary 'operator<'
55 | s.text<sizeof(std::string::value_type)>(key, 1000)

bitsery-master/include/bitsery/serializer.h:105:8: error: 'void bitsery::Serializer<TOutputAdapter, TContext>::ext(const T&, const Ext&, Fnc&&) [with T = std::map<std::__cxx11::basic_string, Test*>; Ext = bitsery::ext::StdMap; Fnc = TestCache::serialize(S&) [with S = bitsery::Serializer<bitsery::BasicBufferedOutputStreamAdapter<char, bitsery::DefaultConfig, std::char_traits >, bitsery::ext::PointerLinkingContext>]::<lambda(bitsery::Serializer<bitsery::BasicBufferedOutputStreamAdapter<char, bitsery::DefaultConfig, std::char_traits >, bitsery::ext::PointerLinkingContext>&, std::string&, Test*&)>; TOutputAdapter = bitsery::BasicBufferedOutputStreamAdapter<char, bitsery::DefaultConfig, std::char_traits >; TContext = bitsery::ext::PointerLinkingContext]', declared using local type 'TestCache::serialize(S&) [with S = bitsery::Serializer<bitsery::BasicBufferedOutputStreamAdapter<char, bitsery::DefaultConfig, std::char_traits >, bitsery::ext::PointerLinkingContext>]::<lambda(bitsery::Serializer<bitsery::BasicBufferedOutputStreamAdapter<char, bitsery::DefaultConfig, std::char_traits >, bitsery::ext::PointerLinkingContext>&, std::string&, Test*&)>', is used but never defined [-fpermissive]
105 | void ext(const T& obj, const Ext& extension, Fnc&& fnc)
| ^~~

@gidigal

gidigal commented Mar 18, 2024

Hello @fraillt , I managed to solve the text issue. Now I am trying to work through deserialization with BufferData.
Below is my current code.

Errors:

  1. context_usage.cpp:39:5: error: no matching function for call to 'bitsery::Deserializer<bitsery::BasicInputStreamAdapter<char, bitsery::DefaultConfig, std::char_traits >, bitsery::ext::PointerLinkingContext>::ext(uint8_t*, bitsery::ext::BufferData)'
    39 | s.ext((uint8_t*)_data, bitsery::ext::BufferData(bufferSize));
  2. context_usage.cpp:39:21: error: cannot bind non-const lvalue reference of type 'unsigned char*&' to an rvalue of type 'uint8_t*' {aka 'unsigned char*'}
    39 | s.ext((uint8_t*)_data, bitsery::ext::BufferData(bufferSize));

Does the separation between the serialize and deserialize functions in BufferData require the same separation in the class that uses this extension (class Test in my example)? I'd be grateful for sample code.

Thanks,
Gidi

#include <fstream>
#include <map>
#include <string>
#include <cstdint>
#include <bitsery/bitsery.h>
#include <bitsery/adapter/stream.h>
#include <bitsery/ext/std_map.h>
#include <bitsery/ext/pointer.h>
#include "BitseryBufferData.h"
#include <bitsery/traits/string.h>


struct Data {
  int x;
  Data() { x = 0; }
  Data(int val) : x(val) {}

  friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {   
    s.value4b(x);
  }

};

class Test {
  Data _data[2][10];
public:
  
  friend class bitsery::Access;


  template<typename S>
  void serialize(S& s)
  {   
    int bufferSize = (int)(20*sizeof(Data));
    s.ext((uint8_t*)_data, bitsery::ext::BufferData<int>(bufferSize));
  }


};

class TestCache {
private:
	std::map<std::string, Test*> data;
public:
  TestCache() {  
    Test* t = new Test;
    data["test"] = t;			
  }
  
  friend class bitsery::Access;

  template<typename S>
  void serialize(S& s)
  {   
    s.ext(data, bitsery::ext::StdMap{100}, [](S &s, std::string &key, Test *(&d)) {
	s.text1b(key, 256);
	s.ext(d, bitsery::ext::PointerObserver{bitsery::ext::PointerType::NotNull});
      });
  }
};



int main() {
  using OutputBufferSerializer = bitsery::Serializer<bitsery::OutputBufferedStreamAdapter, bitsery::ext::PointerLinkingContext>;
  using Deserializer = bitsery::Deserializer<bitsery::InputStreamAdapter, bitsery::ext::PointerLinkingContext>;
  

  bitsery::ext::PointerLinkingContext ctx{};
  
  TestCache testCache;
  TestCache res;
  // open file stream for writing and reading
  auto fileName = "test_file.bin";
  std::fstream s{ fileName, s.binary | s.trunc | s.out };

  OutputBufferSerializer ser{ ctx, s };
  ser.object(testCache);

  ser.adapter().flush();
  s.close();

  std::fstream s1{ fileName, s1.binary | s1.in };

  bitsery::ext::PointerLinkingContext ctx1{};


  Deserializer deser { ctx1, s1 };
  deser.object(res);

  s1.close();

  return 0;
}
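
One possible reading of the two errors above, offered as a guess rather than a confirmed answer: the second message says a non-const lvalue reference cannot bind to an rvalue, and `(uint8_t*)_data` is an rvalue. The stand-alone sketch below reproduces that situation without bitsery. `writeSide` and `readSide` are hypothetical stand-ins for a call that takes `const T&` and one that takes `T&`.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the two call shapes (not bitsery code):
// the writing side accepts const T&, the reading side needs T&.
template <typename T> bool writeSide(const T&) { return true; }
template <typename T> bool readSide(T&) { return true; }

struct Data { int x; };

inline bool demo() {
    Data data[2][10] = {};
    // readSide((std::uint8_t*)data);  // would not compile: the cast yields an rvalue
    std::uint8_t* bytes = reinterpret_cast<std::uint8_t*>(data);  // named lvalue
    // An rvalue binds to const T&, while the named pointer binds to T&.
    return writeSide(reinterpret_cast<std::uint8_t*>(data)) && readSide(bytes);
}
```

If that reading is right, storing the cast result in a local pointer inside Test::serialize and passing that variable to s.ext might let the same function work for both directions. This is an untested assumption, not something confirmed in the thread.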
