New snapshot config to validate uniqueness before merge #10236

Open
Tracked by #10151
graciegoheen opened this issue May 28, 2024 · 1 comment
Labels
snapshots Issues related to dbt's snapshot functionality user docs [docs.getdbt.com] Needs better documentation

Comments

graciegoheen commented May 28, 2024

@kjstultz thanks again for creating this issue! Appreciate that you invested the time at Coalesce in NOLA to figure out how to reproduce this tricky situation.

Reading the discussion between you and @jtcohen6, it sounds like the feature discussed thus far would look like this:

  • Add a new snapshot config parameter validate_uniqueness (defaults to false)
  • When validate_uniqueness is true, execute a uniqueness test on the unique key after creating the staging table and before running the merge

Although this proposed feature wouldn't fix a snapshot table that already has duplicates, it would prevent duplicates from being added in the first place! Feels like a win to me, especially with it being configurable for those that (somehow) know that their "unique" key is truly unique.

Originally posted by @dbeatty10 in #6089 (comment)


Notes:

  • The uniqueness check would be slow in some cases, so we want to make sure it is configurable
  • But it is valuable for making sure the snapshot is accurate
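
To make the proposed flow concrete, here is a minimal sketch of "check the staging table for duplicate keys, then merge", written in Python against an in-memory SQLite database. It is not dbt's implementation: `validate_uniqueness` is only the parameter name proposed above, the helper names `find_duplicate_keys` and `merge_staging` are hypothetical, and a plain `INSERT ... SELECT` stands in for the real snapshot merge.

```python
import sqlite3


def find_duplicate_keys(conn, table, unique_key):
    """Return (key, row_count) pairs for keys that appear more than once."""
    # Identifiers are interpolated directly; this sketch assumes trusted names.
    query = (
        f'SELECT "{unique_key}", COUNT(*) '
        f'FROM "{table}" '
        f'GROUP BY "{unique_key}" '
        f'HAVING COUNT(*) > 1 '
        f'ORDER BY "{unique_key}"'
    )
    return conn.execute(query).fetchall()


def merge_staging(conn, staging, target, unique_key, validate_uniqueness=False):
    """Copy staging rows into the target, optionally validating uniqueness first.

    `validate_uniqueness` mirrors the proposed config and defaults to False,
    so the extra query only runs when the user opts in.
    """
    if validate_uniqueness:
        duplicates = find_duplicate_keys(conn, staging, unique_key)
        if duplicates:
            # Fail before the merge so no duplicates reach the snapshot table.
            raise ValueError(
                f"{len(duplicates)} duplicate value(s) of {unique_key!r} "
                f"in {staging!r}; merge aborted"
            )
    # Stand-in for the real merge statement (assumption of this sketch).
    conn.execute(f'INSERT INTO "{target}" SELECT * FROM "{staging}"')
```

Because the check is a full `GROUP BY` over the staging table, it adds a scan on every run, which is why the sketch keeps it behind an opt-in flag as the notes suggest.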
@graciegoheen graciegoheen added snapshots Issues related to dbt's snapshot functionality user docs [docs.getdbt.com] Needs better documentation labels May 28, 2024
@graciegoheen
Contributor Author

This is similar to running a uniqueness test before the insert for microbatch; see #10624
