Description
I am converting a series of capnp messages into a columnar format (Arrow specifically). One of the challenges with unions is handling non-active union fields. I recursively create a vector of dynamic value readers for each field and then convert that into Arrow arrays. When the field is a member of a union and is active, this works fine. When the field is not active, the reader produces fake data instead of null.
For example the schema:
struct TestUnion {
  union {
    foo @0 :UInt16;
    bar @1 :UInt32;
  }
}
With data: [{"foo": 1}, {"bar": 1}]
generates the output: [{"foo": 1, "bar": 0}, {"foo": 0, "bar": 1}]. What I would expect is [{"foo": 1, "bar": null}, {"foo": null, "bar": 1}].
I have tried creating dynamic_value::Reader::Void when the field is non-active, but this is a challenge with nested struct and list types.
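To make the expected result concrete, here is a minimal self-contained sketch of the column shape I am after. It does not use the capnp dynamic API: the TestUnion enum, the Columns struct, and to_columns are hypothetical stand-ins that only model "exactly one union field is active per message".

```rust
// Hypothetical stand-in for the TestUnion schema above; this is NOT the
// capnp dynamic API, only a model of "exactly one field is active".
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TestUnion {
    Foo(u16),
    Bar(u32),
}

/// One nullable column per union field, mirroring the expected output.
#[derive(Debug, Default, PartialEq)]
pub struct Columns {
    pub foo: Vec<Option<u16>>,
    pub bar: Vec<Option<u32>>,
}

/// Append one row per message: the active field gets a value and the
/// inactive field gets None instead of a zero default.
pub fn to_columns(rows: &[TestUnion]) -> Columns {
    let mut cols = Columns::default();
    for row in rows {
        match *row {
            TestUnion::Foo(v) => {
                cols.foo.push(Some(v));
                cols.bar.push(None);
            }
            TestUnion::Bar(v) => {
                cols.foo.push(None);
                cols.bar.push(Some(v));
            }
        }
    }
    cols
}
```

Calling to_columns with [Foo(1), Bar(1)] fills each column with one value and one None, which is the null pattern I would like the dynamic readers to let me produce.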
For structs I have tried creating a new empty dynamic_struct::Reader using the private layout:
match capnp_field.get_type().which() {
    introspect::TypeVariant::Struct(st) => {
        // Build a reader over the default (empty) struct layout.
        dynamic_value::Reader::Struct(dynamic_struct::Reader::new(
            layout::StructReader::new_default(),
            schema::StructSchema::new(st),
        ))
    }
    // Other type variants omitted.
}
This still leads to primitive ints with a value of 0.
Is it possible to create readers with null values?
Would it make sense for non-active union fields to have null values? (I assume the expectation is that users check has to find active values and ignore non-active values.)
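One workaround I can imagine, sketched here under assumptions, is to keep the default-valued readers and track nullness separately. Arrow stores nulls in a validity bitmap next to the values buffer, so a placeholder 0 in a null slot is harmless as long as the mask is cleared. NullableColumn and its methods below are hypothetical names, and the active flag is passed in as a plain bool where real code would derive it from the union discriminant.

```rust
/// Hypothetical column builder: a values buffer plus a validity mask, which
/// is conceptually how Arrow represents nulls (Arrow packs the mask into bits).
#[derive(Debug, Default)]
pub struct NullableColumn {
    pub values: Vec<u32>,
    pub validity: Vec<bool>,
}

impl NullableColumn {
    /// `active` is assumed to come from a discriminant check done elsewhere.
    /// `value` may be a default such as 0 when the field is inactive; the
    /// mask, not the value, is what marks the slot as null.
    pub fn push(&mut self, value: u32, active: bool) {
        self.values.push(value);
        self.validity.push(active);
    }

    /// View the column as Option values, hiding the placeholders.
    pub fn to_options(&self) -> Vec<Option<u32>> {
        self.values
            .iter()
            .zip(&self.validity)
            .map(|(v, ok)| if *ok { Some(*v) } else { None })
            .collect()
    }
}
```

The same idea would extend to nested structs and lists: the child readers can stay default-valued while the parent slot is marked invalid, which avoids needing a reader that is itself null.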