-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CT-2809] Support ref
in foreign key constraint expressions
#8062
Comments
ref
in foreign key constraint expressionsref
in foreign key constraint expressions
An argument in favor of prioritizing this is that BigQuery now supports the use of foreign keys for optimizing joins. |
I would also submit that, database enforcement implementation aside, forcing the usage of explicit |
Snowflake can also use foreign keys for optimizing joins: |
I'd really be interested in referencing a FK constraint to a model that lives in a custom schema. The referred model lives in a custom schema that is dependent on an Environment Variable that is passed in at runtime, so I cannot hardcode a Until dbt is enhanced to support |
Another reason to add this is to ensure that dbt builds DAG dependencies that support the foreign keys. Because there is no For example, let’s say I have 3 models:
So if I say So far so good. Now, suppose I specify a foreign key constraint on a column in But with that constraint specified, In any non-trivial sized DAG, this will cause constant errors in builds, because there is no guarantee of a thread getting to The workaround is to force the dependency by placing a SQL-commented
But this is just more extra work, and it becomes difficult to maintain as it scales. So this is one more reason to support |
Like Snowflake and BigQuery, Redshift also uses foreign keys for optimizing joins: |
During development we build into developer dependent datasets (e.g. |
The dependency issue raised by @noahjgreen295 will still be an issue and was a major issue for us in using this feature. Our pipelines were less reliable and there was essentially a race condition when running multiple models in parallel. I use a similar naming convention to you and I used something like this in the model YAML - type: foreign_key
expression: "{{ 'warehouse' if target.name!='dev' else target.dataset }}.tableA(tableB_ForeignKey)" you can define simple if-else logic in the brackets. This allows for the FKs to be created in a |
Thanks @Stochastic-Squirrel, I didn't realise you could do that; ends up something like this for ours and does indeed work, leaving the logic duplication (this is already handled in the naming macros that - type: foreign_key
expression: "{{ 'prod_dataset.' if target.name!='dev' else target.dataset ~ '.prod_dataset__' }}foreign_table(foreign_key)" Another option might be a post hooks alterations with alter table statements, but also not ideal. |
@jtcohen6 Given that Snowflake, Redshift and BigQuery use foreign keys to optimize joins, will this issue get re-prioritized? Also, I'll add that downstream tools can use PK/FK to infer table relationships, perhaps bumping the priority further. |
Any updates on the priority for this? I feel like dbt focus a lot in adding new features but pushes aside the improvement of great features already present... |
Any updates on this? It defeats the purpose of foreign key constraints as we cannot use them because it seems that dbt is unable to build a correct DAG. I have to run the project a couple of times so that parent tables get built. |
+1 for this functionality I am currently using this workaround successfully. Specifying this in the in-sql config block, in the post_hook argument:
This applies the constraint only when the model is materialized for the first time, avoiding unnecessary runs. |
I would definitely appreciate a feature which fixes this open issue. I tried to implement the workaround below but found that my foreign table is built after the table I am defining the foreign key for. So, I had to go create the primary key for the foreign table manually in BigQuery in order for the production DAG to build properly.
|
The 'services' table has an implicit dependency to the 'structures' mart as its contracy enforces a check on its structure_id key towards that table. Make sure DBT knows about it so it generates the structures first. Referenced by dbt-labs/dbt-core#8062, might be fixed in dbt-labs/dbt-common#163 last week but: - not released - not documented - not sure the commit will actually help when I read it, needs more changes I suppose
The 'services' table has an implicit dependency to the 'structures' mart as its constraint enforces a check on its structure_id key towards that table. Make sure DBT knows about it so it generates the structures first. Referenced by dbt-labs/dbt-core#8062, might be fixed in dbt-labs/dbt-common#163 last week but: - not released - not documented - not sure the commit will actually help when I read it, needs more changes I suppose
The 'services' table has an implicit dependency to the 'structures' mart as its constraint enforces a check on its structure_id key towards that table. Make sure DBT knows about it so it generates the structures first. Referenced by dbt-labs/dbt-core#8062, might be fixed in dbt-labs/dbt-common#163 last week but: - not released - not documented - not sure the commit will actually help when I read it, needs more changes I suppose
The 'services' table has an implicit dependency to the 'structures' mart as its constraint enforces a check on its structure_id key towards that table. Make sure DBT knows about it so it generates the structures first. Referenced by dbt-labs/dbt-core#8062, might be fixed in dbt-labs/dbt-common#163 last week but: - not released - not documented - not sure the commit will actually help when I read it, needs more changes I suppose
The 'services' table has an implicit dependency to the 'structures' mart as its constraint enforces a check on its structure_id key towards that table. Make sure DBT knows about it so it generates the structures first. Referenced by dbt-labs/dbt-core#8062, might be fixed in dbt-labs/dbt-common#163 last week but: - not released - not documented - not sure the commit will actually help when I read it, needs more changes I suppose
The 'services' table has an implicit dependency to the 'structures' mart as its constraint enforces a check on its structure_id key towards that table. Make sure DBT knows about it so it generates the structures first. Referenced by dbt-labs/dbt-core#8062, might be fixed in dbt-labs/dbt-common#163 last week but: - not released - not documented - not sure the commit will actually help when I read it, needs more changes I suppose
Hello Team, I checked this pull request, it does not state that "ref" are now supported in the "expression" for constraints (foregin_key). It does not work in the following version, so is it planned to be fixed ? dbt --version
Plugins:
|
There has not been a release since before that PR was merged; did you check the code from that PR (which does show examples of |
@elyobo thanks i tried out using
I will try it later today with the beta 1.9 version today. |
Is this enabled for DBT Cloud as well? |
This will be available in the 1.9 release of dbt-core! If you're using dbt Cloud, you can access this early if you're running on versionless. Documentation will be added soon -> dbt-labs/docs.getdbt.com#5983 |
Problem
Because you must hard-code your database.schema.table name when setting a foreign key constraint:
This feature has become more important now that warehouses use foreign key constraints for better performance.
Instead, we should support
ref
in foreign key constraint expression - both at the model and column level.This is similar to how the relationships data test works.
Current workaround
Having to use jinja to specify the expression based on the target:
Acceptance criteria
ref
at the column levelNotes from technical refinement
originally left as comment in #7417
I'm opening this issue to track upvotes/comments that could inform eventual prioritization. Is this something people want/need in their production workflows? Are happy to solve by other means in the meantime (e.g.
dbt_constraints
)?If we were to take FK constraints more seriously, we're missing a pretty important ingredient, which is the ability to include & template ref inside the expression field — or providing more structure, i.e.
Per #6754 (comment), we kicked that out of scope for v1.5, and we're unlikely to prioritize it while this remains a metadata-only (nonfunctional & unenforceable) feature on the majority of data platforms.
The text was updated successfully, but these errors were encountered: