Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support recursive CTEs in the query interface #1441

Open
myyra opened this issue Oct 13, 2023 · 4 comments
Open

Support recursive CTEs in the query interface #1441

myyra opened this issue Oct 13, 2023 · 4 comments

Comments

@myyra
Copy link

myyra commented Oct 13, 2023

I tried to make this into a draft PR at least, but ran into so many dead ends that I didn't feel like opening something that would've required more fixing than reviewing... So I thought it would be better to first explain what I'm trying to achieve.

What did you do?

Tried to build a recursive CTE using the query interface / without writing SQL manually.

What did you expect to happen?

I could get a hold of the CTE "table", so that I can use it in the query interface and generate the recursive part of the CTE.

Given that it's already possible to get quite close, I hope we can find a way to build the whole thing using the query interface.

What happened instead?

Somewhat obviously, I can't access the CTE before it has been initialized. And since it needs the request in the initializer, it can't be constructed with a query that uses itself. This can be avoided by defining a CTE with the same name and empty request, and using it to build a request before giving it to the actual CTE.

Environment

GRDB flavor(s): GRDB
GRDB version: 6.20.0
Installation method: SPM
Xcode version: 15.0
Swift version: 5.9
Platform(s) running GRDB: All
macOS version running Xcode: 14.0

Demo Project

A minimal test case that can be added to CommonTableExpressionTests.swift, demonstrating the closest possible solution that can be achieved currently:

func testRecursiveCounterQuery() throws {
    try makeDatabaseQueue().read { db in
        let cteName = CommonTableExpression(named: "counter", literal: "")
        let recursive = cteName.all().select(Column("x") + 1)
        let cte = CommonTableExpression(
            recursive: true,
            named: "counter",
            literal: "SELECT 0 AS x UNION ALL \(recursive)"
        )
        XCTAssertEqual(
            try cte.all().with(cte).limit(5).fetchAll(db),
            [
                ["x": 0],
                ["x": 1],
                ["x": 2],
                ["x": 3],
                ["x": 4],
            ]
        )
    }
}

A slightly longer test case that is closer to the real-world problem I'm trying to solve:

Querying a DAG like structure
func testRecursiveDAG() throws {
    struct Node: Codable, FetchableRecord, PersistableRecord, Equatable {
        var id: Int64?
        
        static let parentEdges = hasMany(Edge.self, using: Edge.childForeignKey)
        static let childEdges = hasMany(Edge.self, using: Edge.parentForeignKey)

        static let parents = hasMany(
            Node.self,
            through: parentEdges,
            using: Edge.parent,
            key: "parent"
        )

        static let children = hasMany(
            Node.self,
            through: childEdges,
            using: Edge.child,
            key: "child"
        )
    }

    struct Edge: Codable, FetchableRecord, PersistableRecord, Equatable {
        let parentId: Int64
        let childId: Int64

        static let parentForeignKey = ForeignKey(["parentId"])
        static let childForeignKey = ForeignKey(["childId"])

        static let parent = belongsTo(Node.self, using: parentForeignKey)
        static let child = belongsTo(Node.self, using: childForeignKey)
    }

    try makeDatabaseQueue().write { db in

        try db.create(table: "node") { t in
            t.autoIncrementedPrimaryKey("id")
        }
        try db.create(table: "edge") { t in
            t.primaryKey {
                t.column("parentId", .integer)
                t.column("childId", .integer)
            }
        }

        let rootNode = try Node(id: 0).inserted(db)
        let node1 = try Node(id: 1).inserted(db)
        let node2 = try Node(id: 2).inserted(db)

        try Edge(parentId: rootNode.id!, childId: node1.id!).insert(db)
        try Edge(parentId: node1.id!, childId: node2.id!).insert(db)

        let cteNameOnly = CommonTableExpression(named: "ancestor", literal: "")

        let association = Node.association(to: cteNameOnly) { node, ancestor in
            node["id"] == ancestor["id"]
        }
        let cteAlias = TableAlias()

        let initialRequest = Node.filter(key: node2.id)
        let recursiveRequest = Node
            .joining(required: Node.children.joining(required: association.aliased(cteAlias)))

        let cte = CommonTableExpression<Node>(
            recursive: true,
            named: "ancestor",
            literal: SQL("\(initialRequest) UNION ALL \(recursiveRequest)")
        )

        let results = try cte.all().with(cte).fetchAll(db)
        XCTAssertEqual(results, [node2, node1, rootNode])
    }
}
@groue
Copy link
Owner

groue commented Oct 20, 2023

Hi @myyra,

I did not have the time to look at this issue. Please hold on a few more days.

@groue
Copy link
Owner

groue commented Dec 3, 2023

Hello again @myyra,

I tried to make this into a draft PR at least, but ran into so many dead ends that I didn't feel like opening something that would've required more fixing than reviewing...

If you would share some ideas and failed attempts, maybe I could help. Quite possibly the library needs some inner refactoring in order to make recursive CTEs possible, and it would help to identify the current blockers.

@myyra
Copy link
Author

myyra commented Dec 11, 2023

@groue absolutely, I should have done that in the first place... I'll need some time to reacquaint myself with all the details (and I always need a moment to wrap my head around recursive CTEs to begin with 😅) before providing a more in-depth look at the problems I ran into.

But from what I recall, the primary challenge was that I couldn't create an association to a CTE before fully initializing it, even though the association only ever needs the name (through CommonTableExpression.getter:relationForAll). And I think it was the dual structure of CommonTableExpression and SQLCTE that interfered with accessing the association methods/relationForAll in the CommonTableExpression initializers (my solution was a new initializer for recursive queries) because relationForAll gets the name from the SQLCTE.

It seems simple enough to solve when put like this, but I'm sure there is a good reason for the current structure and API that I can't recall now, and that was the root issue I ran into, no matter the direction I approached this from.

Quite possibly the library needs some inner refactoring in order to make recursive CTEs possible

Most likely, yeah. Given that altering the CTE structure might be a significant change, I'm curious about your thoughts on supporting this functionality. Not every use case needs to be accommodated, even though I'd prefer it 😁 Especially if it comes at the cost of complexity or API design.

From my perspective, we'd need a new initializer for CommonTableExpression, likely with a closure for building the recursive part, and a method to access the compound operators in code. These seem straightforward enough to me API-wise, especially the latter, but I don't have the complete picture from the library's point of view.

@groue
Copy link
Owner

groue commented Dec 11, 2023

A slight note about the pairs of dual types, so that the intent is more clear.

A few such pairs exist:

  • QueryInterfaceRequest / SQLRelation
  • CommonTableExpression / SQLCTE
  • Association / _SQLAssociation

In general, the public type is just a thin wrapper around the internal one. Its role is only to add some type-safety with Swift generics, so that we can help the compiler prevent some misuses. For example, a QueryInterfaceRequest<Player> accepts to be joined with BelongsToAssociation<Player, Team>, but not with BelongsToAssociation<Apple, Tree>. At the level of internal types, there's no safety guards: SQLRelation accepts any kind of _SQLAssociation.

So CommonTableExpression, paired with SQLCTE, is generic in order to bring some type safety as well, as described at the end of Common Table Expressions. It's probably overkill, but that's another debate 😅

To sum up: the internal type does all the "dirty" stuff with raw SQL, strings, other runtime values, and eventual runtime errors, while the public type uses as much static information as possible in order to prevent runtime errors from happening.

Quite surely, I have performed quite a few layering mistakes 😬. For example, CommonTableExpression.relationForAll should probably have been defined on SQLCTE instead. Or maybe it was defined in CommonTableExpression even before SQLCTE was invented, and I didn't move it. But that's a moot archeological question: there is no problem moving methods around until they reach the proper level, which is the level where they finally allow us to program ;-)

From my perspective, we'd need a new initializer for CommonTableExpression, likely with a closure for building the recursive part, and a method to access the compound operators in code. These seem straightforward enough to me API-wise, especially the latter, but I don't have the complete picture from the library's point of view.

All right, I'll come back with a more detailed response to this question. Thanks for asking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants