Describe the bug
We noticed that for one table we snapshot, when the source data changes a column from int to bigint but the downstream snapshot table is not amended, the snapshot just keeps inserting the old record on every snapshot run instead of erroring that the new value cannot be inserted.
Steps To Reproduce
Snapshot a source model with a regular int column.
Change the source model column to bigint and set the value in a row to a bigint value larger than the int maximum (2,147,483,647).
Run dbt snapshot against the source table. The snapshot should use the check strategy and include the amended column in its check_cols list (see the sketch after these steps).
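For reference, a minimal sketch of what such a snapshot could look like, assuming hypothetical names (my_source_snapshot, my_source, my_table, id, amount); the real project layout will differ:

```sql
-- snapshots/my_source_snapshot.sql (hypothetical file, source, and column names)
{% snapshot my_source_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='id',
      strategy='check',
      check_cols=['amount']
    )
}}

select id, amount
from {{ source('my_source', 'my_table') }}

{% endsnapshot %}
```

To trigger the issue, amount would be widened to bigint in the source and a row set to a value above the int range (for example 3000000000), while the already-built snapshot table keeps its int column.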
Expected behavior
If the target column's dtype is int and the incoming value is a bigint, there should be an error rather than a silent failure.
Screenshots and log output
System information
The output of dbt --version:
<output goes here>
The operating system you're using:
The output of python --version:
Additional context
It looks like the merge statement that is currently executed will always try to cast the data types. Even if the cast fails (for example, string to int), it will insert NULL, and there will be no error from the Databricks side.
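As an illustration only (not the statement dbt generates): with ANSI SQL mode disabled, a Spark/Databricks cast that cannot be represented returns NULL rather than raising an error, which matches the silent behavior described above.

```sql
-- Illustration only; not the macro-generated merge statement. The behavior
-- depends on spark.sql.ansi.enabled: with ANSI mode off, an invalid cast
-- returns NULL instead of raising an error.
select cast('not a number' as int) as silently_null;  -- NULL, no error
```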
I would propose checking for schema changes in dbt by using the check_for_schema_changes macro from dbt-core. It is already used to check for schema changes in incremental models.
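A rough sketch of how that could look inside the snapshot materialization; the integration point and the relation variable names (staging_relation, target_relation) are assumptions, not existing dbt-databricks code:

```sql
{#- Hedged sketch only: where exactly this check would live in the snapshot
    materialization is an assumption. check_for_schema_changes comes from
    dbt-core's incremental on_schema_change helpers and returns a dict with
    a 'schema_changed' flag. -#}
{% set schema_changes = check_for_schema_changes(staging_relation, target_relation) %}
{% if schema_changes['schema_changed'] %}
    {% do exceptions.raise_compiler_error(
        "Source schema no longer matches the snapshot table: " ~ schema_changes
    ) %}
{% endif %}
```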
@benc-db should this check just always happen, or would it make sense to make it configurable? Similar to incremental models, there could be an on_schema_change config option (something like the sketch below). It could default to "ignore" to keep the current behavior, and check_for_schema_changes would only run when it is set to "fail".
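For illustration, the proposed config (hypothetical, it does not exist today) could look like this on a snapshot:

```sql
{#- Hypothetical config, mirroring incremental models' on_schema_change:
    'ignore' (default) would keep today's behavior; 'fail' would run
    check_for_schema_changes and error when the schemas no longer match. -#}
{{
    config(
      strategy='check',
      check_cols=['amount'],
      on_schema_change='fail'
    )
}}
```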
Without thinking about it too much, adding on_schema_change to snapshots sounds like a good idea. We're in the process of updating snapshots to add more features in general. I would post some comments on this issue. It collects all of the snapshot work being done. I don't know if this particular feature is included, but there are certainly schema-related items on there.