Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-3433] [Feature] Prioritize view creation before table materialization #9175

Closed
3 tasks done
rchui opened this issue Nov 29, 2023 · 1 comment
Closed
3 tasks done
Labels
enhancement New feature or request postgres wontfix Not a bug or out of scope for dbt-core

Comments

@rchui
Copy link

rchui commented Nov 29, 2023

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

When a table is reset it executes a DROP TABLE CASCADE that also drops any downstream views that depend on the table. When the DBT graph is large and concurrency is limited, this can cause periods of time where the table has been reset but the view on top of it does not exist.

Take an example of concurrency of 1 with TableA, ViewA, and TableB, where ViewA references TableA. If the order is

  1. TableA
  2. ViewA
  3. TableB

Then you will not have any issues. If instead the order is

  1. TableA
  2. TableB
  3. ViewA

ViewA is not populated until TableB is completed. If TableB takes a long time to materialize, then ViewA will not be present for that period. This causes problems for downstream consumers that depend on those views existing.

I'd like (either by default or configuration) to favor the creation of views before tables all dependencies being equal.

Describe alternatives you've considered

None that I can think of. I would love if this already existed and just isn't documented clearly somewhere in the DBT docs.

Who will this benefit?

This will benefit users that operate in non data warehouse environments that have constrained concurrency and large numbers of models to evaluate. In general it will reduce the downstream impact on consumers that depend on views in order to query data sources.

Are you interested in contributing this feature?

I can, but wouldn't know where to start.

Anything else?

No response

@rchui rchui added enhancement New feature or request triage labels Nov 29, 2023
@github-actions github-actions bot changed the title [Feature] Prioritize view creation before table materialization [CT-3433] [Feature] Prioritize view creation before table materialization Nov 29, 2023
@dbeatty10 dbeatty10 self-assigned this Dec 6, 2023
@dbeatty10
Copy link
Contributor

Thanks for reporting this @rchui !

Big picture

I think the root cause is that both Postgres and Redshift have early-binding views that are coupled with the underlying datasets they select from.

Redshift provided a work-around for this issue when they introduced late-binding views in late 2017. These were supported in dbt shortly thereafter.

But it sounds like there's still no way to directly support late-binding views in postres.

Blue/green deployments to the rescue?

Using a blue/green deployment using dbt clone wouldn't help in this case because cloning in dbt-postgres relies on views, so we'd just run into the same problem as above.

Our recommendation

  • In Postgres, use tables (rather than views) for any data sets that are exposed to consumers

Summary

Since this is primarily driven by the design choices of Postgres and we don't plan on implementing work-arounds, I'm going to close this as "not planned".

I've opened dbt-labs/docs.getdbt.com#4602 to give clearer guidance about how to avoid this in Postgres.

@dbeatty10 dbeatty10 closed this as not planned Won't fix, can't repro, duplicate, stale Dec 6, 2023
@dbeatty10 dbeatty10 added wontfix Not a bug or out of scope for dbt-core and removed triage labels Dec 6, 2023
@dbeatty10 dbeatty10 removed their assignment Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request postgres wontfix Not a bug or out of scope for dbt-core
Projects
None yet
Development

No branches or pull requests

3 participants