Skip to content

Commit

Permalink
allow for iceberg element-ids take 2
Browse files Browse the repository at this point in the history
Iceberg manifests need to be able to write their JSONified Avro schemas
in the Avro headers for certain Iceberg clients (e.g. PyIceberg) to be
able to read them.

Commit 966d457 was a stab at allowing
this by hacking this custom field into the AST, but it did so as a
string, instead of a long. This isn't exactly what other libraries
output, but is sufficient for PyIceberg. To avoid surprises down the
line, this quick hack switches us to serialize the long.

(cherry picked from commit b2b82a4)
  • Loading branch information
andrwng authored and pgellert committed Aug 12, 2024
1 parent cb32bf8 commit fbc8385
Show file tree
Hide file tree
Showing 4 changed files with 7 additions and 6 deletions.
2 changes: 1 addition & 1 deletion lang/c++/impl/Compiler.cc
Original file line number Diff line number Diff line change
Expand Up @@ -439,7 +439,7 @@ static NodePtr makeArrayNode(const Entity &e, const Object &m,
asSingleAttribute(makeNode(it->second, st, ns)));
NodePtr node = NodePtr(nodePointer);
if (containsField(m, "element-id")) {
nodePointer->elementId_ = getStringField(e, m, "element-id");
nodePointer->elementId_ = getLongField(e, m, "element-id");
}
if (containsField(m, "doc")) {
node->setDoc(getDocField(e, m));
Expand Down
6 changes: 3 additions & 3 deletions lang/c++/impl/NodeImpl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -582,9 +582,9 @@ void NodeArray::printJson(std::ostream &os, size_t depth) const {
logicalType().printJson(os);
os << ",\n";
}
if (!elementId_.empty()) {
os << indent(depth + 1) << R"("element-id": ")"
<< escape(elementId_) << "\",\n";
if (elementId_.has_value()) {
os << indent(depth + 1) << "\"element-id\": "
<< *elementId_ << ",\n";
}
os << indent(depth + 1) << "\"items\": ";
leafAttributes_.get()->printJson(os, depth + 1);
Expand Down
3 changes: 2 additions & 1 deletion lang/c++/include/avro/NodeImpl.hh
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
#include <iostream>
#include <limits>
#include <memory>
#include <optional>
#include <set>
#include <sstream>
#include <utility>
Expand Down Expand Up @@ -390,7 +391,7 @@ public:

void printDefaultToJson(const GenericDatum &g, std::ostream &os, size_t depth) const override;

std::string elementId_;
std::optional<int64_t> elementId_;
};

class AVRO_DECL NodeMap : public NodeImplMap {
Expand Down
2 changes: 1 addition & 1 deletion lang/c++/test/SchemaTests.cc
Original file line number Diff line number Diff line change
Expand Up @@ -198,7 +198,7 @@ const char *roundTripSchemas[] = {
R"({"type":"array","items":"long"})",
"{\"type\":\"array\",\"items\":{\"type\":\"enum\","
"\"name\":\"Test\",\"symbols\":[\"A\",\"B\"]}}",
R"({"type":"array","element-id":"testElemId","items":"long"})",
R"({"type":"array","element-id":42,"items":"long"})",

// Map
R"({"type":"map","values":"long"})",
Expand Down

0 comments on commit fbc8385

Please sign in to comment.