-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CT-313] Package new adapters tests #4812
Comments
@gshank and I just had a chance to talk through this. Notes from the conversation: Requirements / user stories
My instincts
What's included in the package?
|
Thinking through two of the open questions mentioned above: Interface boundary: test utilities vs. test casesWe feel good about this interface boundary:
For many many tests of standard A subset of the test cases described in (2) do need to be packaged up, because we want to test the same functionality against many (if not all) of the database adapters that live external to this repository. I've called this the "adapter zone" :) It's conceivable that we'd want to move tests, originally written for one adapter plugin, into the "adapter zone" if we realize that the underlying behavior is similar on another adapter, and we want to guarantee that it works the same, without needing to copy-paste-edit the code for it. So, we want to bundle up those reusable "adapter zone" test cases, and make them available as a standalone Python package. Where should the code for those reused test cases live?There are two options:
There are pros and cons to both approaches. We should do our best to list them all out and assess. I think we can also tackle this question from two angles: philosophical and practical. Philosophically: Do these test cases represent a totally separate concern from
Practically, here are the main pain points to consider:
For the time being, while we're assessing our options:
|
I largely agree with the characterization you've put forth @jtcohen6 but I'd like to present a slightly different way to think of it. The current way to describe the sql support of dbt is something like "it works if there's an adapter for it". Which is a fine abstraction for end users and compatibility questions, but when it comes to testing it leaves a lot un DRYed because each database has a very large amount of overlap in functionality. The way to DRY these tests up is to determine the largest subset of SQL that is covered by all official adapters. This subset can then be used to create a test suite that any database adapter can use to ensure basic functionality. Functionality beyond this SQL subset would be the responsibility of the adapter itself to test. Most sql subsets/dialects have names and years to provide versioning like If this approach is taken, I think the structure falls into place pretty easily. Over time the |
@iknox-fa Appreciate that perspective! I'm aligned with many parts of your vision. At the risk of waxing even more philosophical... From my point of view, a lot of the "very large amount of overlap in functionality" is already captured and implemented within
(We want to capture even more of that overlap, by moving some of the common reusable building blocks from We've run into issues before by assuming these are a larger subset than they are, and over the past few years we've needed to significantly expand the scope of where an adapter plugin can reasonably intervene / reimeplement / override / disable that presumed subset of behavior. There is actually a tremendous amount of difference between the support offered by ANSI SQL, Hive SQL, T-SQL, and other "dialects" (are they really even dialects?). Some analytical data platforms have lacked full SQL support for accessing metadata, managing objects (DDL/DML), operating permissions (DCL); each has required hitting other APIs. What do we say if the mountains will not come to the Prophet? In all cases, the right answer has been: Let's find a way to get dbt working on this data platform, by defining a clear-yet-flexible interface for all the things that might differ across databases. It's the role of the adapter creator / data platform expert to match up the square-ish pegs and the round-ish holes as best as they can. What we've said historically, and continue to say today: The more your database functions like the accepted standard bits of Postgres/PostgreSQL, the easier it will be to write your adapter, and the easier it will be for users of your adapter to get dbt ecosystem packages "for free" (no shimming/reimplementing of low-level macros required). The more your data platform differs from a Postgres-like database, the more your work is cut out for you—but it's still possible. Counting lines of code is a highly imperfect proxy for complexity. Still, I think these differences can be instructive:
(A lot of that Jinja-SQL is still repeated unnecessarily, and we should always be on the lookout for ways to move more of that logic into the defaults, and enable more modular overrides when small things differ. As a tiny example: #4488) Here's where the rubber hits the road, for this issue: We make changes to the "default" interface / SQL support within
To avoid adding stumbling blocks into our development/contribution process, I'm inclined to think that we should be defining the test cases in the same codebase where we define the default implementations/interface that they are testing. Today, that "default adapter" lives in the |
We want to be able to package the new adapter tests and have 1 internal adapter be able to use them
The text was updated successfully, but these errors were encountered: