Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: unrecognized _partitiontime for time ingestion partitioned models #74

Conversation

malikfm
Copy link
Contributor

@malikfm malikfm commented Jun 25, 2024

Description

Update incremental runner to support models with time ingestion partitioning.

Fixes #73

Checklist:

  • I have run make verify and fixed any linting or test errors
  • I have added appropriate unit tests or if applicable an integration test
  • OPTIONAL: I have run make integration against a Big Query instance

mode=BigQueryFieldMode.NULLABLE,
)

final_fields = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we are removing the field from the predicted schema? If time ingestion partitioning is enabled then I can see how _PARTITIONTIME gets added to the predicted schema but I don't see why you would remove the partition_by field from the predicted schema?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming time_ingestion_partitioning is set to true, I thought that there wouldn't be any queries that select the partition_by.field since it essentially won't be a field in the real table. I put it back to the predicted schema.

result = self._replace_partition_with_time_ingestion_column(result)

if not full_refresh:
target_table = self._sql_runner.get_node_schema(node)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the incremental already exists does dbt behave differently? I don't think dbt can modify the partition keys for tables once they have been created without a full refresh

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted this line back to the original.

@malikfm malikfm force-pushed the bugfix/unrecognized-partitiontime-for-time-ingestion-partitioned-models branch from a5f6b44 to f78fcb7 Compare July 1, 2024 05:33
@malikfm malikfm force-pushed the bugfix/unrecognized-partitiontime-for-time-ingestion-partitioned-models branch from f78fcb7 to f8867cc Compare July 1, 2024 06:53
@ccharlesgb
Copy link
Collaborator

Seems like the latest python 3.12 version has broken something 🤔 Will look into this

@ccharlesgb ccharlesgb self-requested a review July 16, 2024 13:14
Copy link
Collaborator

@ccharlesgb ccharlesgb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be good to merge once you rebase from master 👍

@ccharlesgb ccharlesgb merged commit 20189bc into autotraderuk:main Jul 16, 2024
4 of 5 checks passed
@malikfm
Copy link
Contributor Author

malikfm commented Jul 16, 2024

Thanks 👍

@ccharlesgb
Copy link
Collaborator

This is release in v0.8.0. I think we still might be missing some edge cases here like if the table already exists but isn't partitioned and then someone adds time_ingestion_partitioning later on then dbt won't update the partitioning of the table but the dry run will think it will. But happy for you to raise more PRs if you see this come up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unrecognized name _PARTITIONTIME for models with time ingestion partitioning
2 participants