Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cycles are problematic in TreeBuilder V2 #293

Open
JakeHillion opened this issue Aug 18, 2023 · 1 comment
Open

Cycles are problematic in TreeBuilder V2 #293

JakeHillion opened this issue Aug 18, 2023 · 1 comment
Assignees
Labels
bug Something isn't working codegen Code Generation Framework library An issue specific to the OI Library (OIL).

Comments

@JakeHillion
Copy link
Contributor

The method for statically typing traversal functions in the generated code cannot handle cycles. This means that we must operate on the graph before generating code to hide these cycles from the static type. Without this any type cycles such as:

template <typename T>
struct LinkedListNode {
  T el;
  struct LinkedListNode* next;
};

Will fail at compilation of the JIT code.

@JakeHillion JakeHillion added bug Something isn't working codegen Code Generation Framework library An issue specific to the OI Library (OIL). labels Aug 18, 2023
@JakeHillion JakeHillion self-assigned this Aug 18, 2023
@JakeHillion
Copy link
Contributor Author

I've closed my PR for this issue as it was incomplete and a mess. @tyroguru and I discussed a multi-stage approach for making progress on this.

  1. Detect cycles that won't succeed at codegen and emit a human readable error. This would be a relatively simple TypeGraph pass that detects if it is not a Directed Acyclic Graph (DAG). Basically any cycle in the Type Graph will cause compilation of the generated code to fail with a hard to read template evaluation error. Instead, we could emit errors before CodeGen describing the exact type cycle(s). This would be a huge improvement and would allow for manual type stubbing.

  2. Stub out a type in the now detected cycles. We can take the cycles detected in the previous pass and stub out a type in them. There is probably a sensible way to do this in the presence of multiple cycles to stub the type that will break the most cycles in one go. The types that are stubbed can then be emitted as warnings and shown as such in the output to give visibility of the cycle without breaking the build.

  3. The ultimate solution: alter the templated static types so they can compile and build this into the processing of the output. This is pretty involved - for now, completing up to step 2 and then making failing types either containers or knowingly stubbed will work pretty well. As I'm stepping away from the project I'm doing a full knowledge dump on how I see the theoretical solution to this problem here:

The Problem

The previous steps hid the symptoms of the problem (failed compilation) but solving the problem properly requires a better understanding. Here we will look at exactly the generated code for an example data structure and why this code can't compile, then develop the solution. The type in question will be this:

template <typename T>
struct LinkedListNode {
  ~LinkedListNode();
  
  T head;
  LinkedListNode<T>* tail = nullptr;
};

If we understand this type, which contains a cycle, as a container, we generate code such as the following:

template <typename Ctx, typename T>
struct TypeHandler<LinkedListNode<T0>> {
  using type = types::st::List<DB,
      TypeHandler<Ctx, T0>::type
  >;
...
};

This works fine. We have converted the cycle into a List by using our understanding of the intent of the type - to hold a list of objects. If instead we must reconstruct the type, we generate something like this:

struct LinkedListNode_0 {
  int head_0;
  LinkedListNode_0* tail_1;
};

template <typename Ctx>
struct TypeHandler<LinkedListNode_0> {
  using type = types::st::Pair<DB,
      TypeHandler<Ctx, int>::type,
      TypeHandler<Ctx, LinkedListNode_0*>::type
  >;
...
};

The using type = statement in the reconstructed type is invalid. After evaluating the pointer into a Sum<Unit, LinkedListNode_0> this declaration has a type cycle - to declare TypeHandler<LinkedListNode_0>::type we require TypeHandler<LinkedListNode_0>::type.

Static Types: Hiding the cycle

To maintain the templated static type system, which otherwise works very well, we must make this cycle disappear. Fortunately, because we are a code generator, this can be achieved by evaluating this cycle at generation time rather than compile time. Given that we have already identified, stubbed, and tested the code for identifying cycles in Steps 1 & 2 this shouldn't be too bad...

We must create a Type that has the same representation in C++ as the real type but is a different type such that we can break the cycle. This is very similar to the CaptureKeys type - it represents the underlying type but knows to handle it differently.

The steps here are pretty simple:

  1. Create a generated C++ type that has the same memory representations as the type but with a different name (as CaptureKeys), here called CycleBreaker.
  2. Create a static type within this that can be created without the cyclic dependency (something that holds a DB).
  3. In the getSizeType function give the raw DB which will be casted to the cyclic type appropriately and static cast to the underlying type.

This may seem like it's doing nothing different: the generated C++ code still has full knowledge of the cycle. But simply allowing a type to be created solves the cycles. The final codegen for our cyclic type might look something like this:

struct LinkedListNode_0 {
  int head_0;
  CycleBreaker_1* tail_1;
};

struct CycleBreaker_1 : public LinkedListNode_0 {};

template <typename Ctx>
struct TypeHandler<Ctx, CycleBreaker_1> {
  using DB = typename Ctx::DataBuffer;
  struct type {
    type(DB buf) : buf_{buf} {}
    DB buf_;
  };

  void getSizeType(const CycleBreaker_1& cb, typename TypeHandler<Ctx, CycleBreaker_1>::type ty) {
    return getSizeType<Ctx>(ty.buf_, static_cast<const LinkedListNode_0&>(cb));
  }
};

This should basically work (with the caveat that I haven't even close to tested it). The bits left to the reader's imagination are picking the right point in the cycle to break it, and whether this usage can cover all cases. The CycleBreaker described here has a dependency on the LinkedListNode_0 it represents and thus must be topo sorted after it. Are there any cases where this isn't possible? These are issues which are likely to come up during implementation.

Dynamic Types: Allow cycles

Dynamic Types, on the other hand, have no problem representing cycles. However the default method of constructing them directly from the static representation won't work because the static representation would have cycles (not allowed). We can follow on from the previous generated code to make this work quite easily by modifying the type in the TypeHandler:

struct type {
  static constexpr types::dy::Dynamic describe = TypeHandler<Ctx, LinkedListNode_0>::type::describe;
  type(DB buf) : buf_{buf} {}
  DB buf_;
};

The reason this should work is types::dy::Dynamic is both copy constructible and holds a reference wrapper to another dynamic type. Cyclic references in types are fine and work (in my experience) correctly in a constexpr context. This should be it for Dynamic Types.

TreeBuilder Instructions: Allow cycles

TreeBuilder instructions consist of fields and processors which are also all defined at compile time. It seems that we can cheat slightly here by using a static constexpr reference to the underlying type which will already have been declared.

template <typename Ctx>
struct TypeHandler<Ctx, CycleBreaker_1> {
...
  static constexpr auto& fields = TypeHandler<Ctx, LinkedListNode_0>::fields;
  static constexpr auto& processors = TypeHandler<Ctx, LinkedListNode_0>::processors;
};

This should result in fully fixed cycles in TreeBuilder-v2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working codegen Code Generation Framework library An issue specific to the OI Library (OIL).
Projects
None yet
Development

No branches or pull requests

1 participant