You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
This is probably halfway between a bug and a feature, I can't find documented anywhere that this isn't supported, and the docs suggest it should be possible, but it's also not something that I guess has ever worked.
The delete+insert strategy when a list of unique keys is provided uses the delete from X using Y syntax and then attaches the predicates further down in the condition. However, because the predicate may reference a column that exists in both tables (which in most cases it would), this fails due to an ambiguous column reference (at least on redshift+postgres, i haven't yet tested other warehouses).
The docs suggest that you can use DBT_INTERNAL_DEST and DBT_INTERNAL_SOURCE values for the merge strategy, but there is no equivalent for the delete+insert. This means it is not possible to use incremental predicates with the delete+insert strategy with a list-based unique key.
I do acknowledge that this is largely due to the fact that aliasing tables in a delete statement is in many cases not actually supported, however I believe there are potential solutions to this (none of which are perfect) discussed below.
Expected Behavior
I would expect the delete+insert strategy to support some way to use incremental predicates while using a list-based unique key field.
One way this could be achieved would be to still support the use of DBT_INTERNAL_DEST and DBT_INTERNAL_SOURCE in the incremental predicates, but to replace these with the absolute target and source table names at compile time in the macro before returning the SQL. This isn't ideal as you're actively editing the predicates that were provided, but would ensure a consistent behaviour between strategies.
Steps To Reproduce
New project, add a seed file called my_seed.csv with the following data
Create a model called my_modelsql` with the following content
{{
config(
materialized='incremental',
incremental_strategy='delete+insert',
unique_key=['id', 'id2'],
upsert_date_key='start_tstamp',
incremental_predicates= ["start_tstamp > date '2021-01-01'"]
)
}}
with data as (
select * from {{ ref('my_seed') }}
)
{%ifis_incremental() %}
select
id,
id2,
start_tstamp
from data
where run = 2
{%else%}
select
id,
id2,
start_tstamp
from data
where run = 1
{%endif%}
dbt seed, dbt run, dbt run again. On the second run you should get the following error
column reference "start_tstamp" is ambiguous
LINE 17: and start_tstamp > date '2021-01-01'
^
compiled Code at target/run/dbt_demo/models/my_models.sql
Relevant log output
No response
Environment
- OS: MacOSx
- Python: 3.9.14
- dbt-adapter: postgres 1.7.11 (although this issue likely applied in other warehouses)
Additional Context
No response
The text was updated successfully, but these errors were encountered:
rlh1994
changed the title
[Bug] Delete+insert incremental strategy doesn't support incremental predicated when a list of unique keys is used
[Bug] Delete+insert incremental strategy doesn't support incremental predicates when a list of unique keys is used
May 28, 2024
Is this a new bug?
Current Behavior
This is probably halfway between a bug and a feature, I can't find documented anywhere that this isn't supported, and the docs suggest it should be possible, but it's also not something that I guess has ever worked.
The delete+insert strategy when a list of unique keys is provided uses the
delete from X using Y syntax
and then attaches the predicates further down in the condition. However, because the predicate may reference a column that exists in both tables (which in most cases it would), this fails due to an ambiguous column reference (at least on redshift+postgres, i haven't yet tested other warehouses).https://github.com/dbt-labs/dbt-adapters/blob/a2292c8ec76e4c9f03869ad95817a2ad82dfb34b/dbt/include/global_project/macros/materializations/models/incremental/merge.sql#L65:L77
The docs suggest that you can use
DBT_INTERNAL_DEST
andDBT_INTERNAL_SOURCE
values for the merge strategy, but there is no equivalent for the delete+insert. This means it is not possible to use incremental predicates with the delete+insert strategy with a list-based unique key.I do acknowledge that this is largely due to the fact that aliasing tables in a delete statement is in many cases not actually supported, however I believe there are potential solutions to this (none of which are perfect) discussed below.
Expected Behavior
I would expect the delete+insert strategy to support some way to use incremental predicates while using a list-based unique key field.
One way this could be achieved would be to still support the use of
DBT_INTERNAL_DEST
andDBT_INTERNAL_SOURCE
in the incremental predicates, but to replace these with the absolute target and source table names at compile time in the macro before returning the SQL. This isn't ideal as you're actively editing the predicates that were provided, but would ensure a consistent behaviour between strategies.Steps To Reproduce
my_seed.csv
with the following datamy_model
sql` with the following contentdbt seed
,dbt run
,dbt run
again. On the second run you should get the following errorRelevant log output
No response
Environment
Additional Context
No response
The text was updated successfully, but these errors were encountered: