From dda5c3a8924da8e1477e40d6bc6d6af869ef21da Mon Sep 17 00:00:00 2001 From: Jeremy Cohen Date: Tue, 14 May 2024 11:15:02 +0200 Subject: [PATCH] Clarifications for Staging envs, 1:1 projects for Mesh (#5453) resolves https://github.com/dbt-labs/docs.getdbt.com/issues/5447 See linked issue for context --------- Co-authored-by: Matt Shaver <60105315+matthewshaver@users.noreply.github.com> --- .../best-practices/how-we-mesh/mesh-4-faqs.md | 8 ++++++ .../govern/project-dependencies.md | 10 +++---- .../docs/docs/deploy/deploy-environments.md | 28 +++++++++++++++++-- 3 files changed, 37 insertions(+), 9 deletions(-) diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md index 9889ebb9f69..476eff1ef90 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -251,6 +251,14 @@ If you’re interested in beta access to “Staging” environments, let your db + + +The short answer is "no." Cross-project references require that each project `name` be unique in your dbt Cloud account. + +Historical limitations required customers to "duplicate" projects so that one actual dbt project (codebase) would map to more than one dbt Cloud project. To that end, we are working to remove the historical limitations that required customers to "duplicate" projects in dbt Cloud — Staging environments for data isolation (beta), environment-level permissions, and environment-level data warehouse connections (coming soon). Once those pieces are in place, it should no longer be necessary to define separate dbt Cloud projects to isolate data environments or permissions. + + + ## Compatibility with other features diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index 8265e953839..1ca9a0e5312 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -32,12 +32,10 @@ Refer to the [FAQs](#faqs) for more info. ## Prerequisites In order to add project dependencies and resolve cross-project `ref`, you must: -- Use dbt v1.6 or higher for **both** the upstream ("producer") project and the downstream ("consumer") project. -- Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). To apply the change, rerun a production job. -- Have a deployment environment in the upstream ("producer") project [that is set to be your production environment](/docs/deploy/deploy-environments#set-as-production-environment) -- Have a successful run of the upstream ("producer") project. -- Define and trigger a job before marking the environment as Staging. Read more about [Staging environments with downstream dependencies](/docs/collaborate/govern/project-dependencies#staging-with-downstream-dependencies). -- Have a multi-tenant or single-tenant [dbt Cloud Enterprise](https://www.getdbt.com/pricing) account (Azure ST is not supported but coming soon.) +- Use a supported version of dbt (v1.6, v1.7, or "Keep on latest version") for both the upstream ("producer") project and the downstream ("consumer") project. +- Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). You need at least one successful job run after defining their `access`. +- Define a deployment environment in the upstream ("producer") project [that is set to be your Production environment](/docs/deploy/deploy-environments#set-as-production-environment), and ensure it has at least one successful job run in that environment. +- Each project `name` must be unique in your dbt Cloud account. For example, if you have a dbt project (codebase) for the `jaffle_marketing` team, you should not create separate projects for `Jaffle Marketing - Dev` and `Jaffle Marketing - Prod`. That isolation should instead be handled at the environment level. To that end, we are working on adding support for environment-level permissions and data warehouse connections; reach out to your dbt Labs account team for beta access in May/June 2024. ## Example diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index 42f34740164..7d571dd4b9c 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -47,10 +47,16 @@ For Semantic Layer-eligible customers, the next section of environment settings Currently in limited availability beta. Contact support or your account team if you're interested in beta access. ::: -Use a Staging environment to grant developers access to deployment workflows and tools while controlling access to production data. You can do this in a couple of ways, but the most straightforward is to configure Staging with a long-living branch (for example, `staging`) similar to but separate from the primary branch (for example, `main`). +Use a Staging environment to grant developers access to deployment workflows and tools while controlling access to production data. Staging environments enable you to achieve more granular control over permissions, data warehouse connections, and data isolation — within the purview of a single project in dbt Cloud. + +### Git workflow + +You can approach this in a couple of ways, but the most straightforward is configuring Staging with a long-living branch (for example, `staging`) similar to but separate from the primary branch (for example, `main`). In this scenario, the workflows would ideally move upstream from the Development environment -> Staging environment -> Production environment with developer branches feeding into the `staging` branch, then ultimately merging into `main`. In many cases, the `main` and `staging` branches will be identical after a merge and remain until the next batch of changes from the `development` branches are ready to be elevated. We recommend setting branch protection rules on `staging` similar to `main`. +Some customers prefer to connect Development and Staging to their `main` branch and then cut release branches on a regular cadence (daily or weekly), which feeds into Production. + ### Why use a staging environment There are two primary motivations for using a Staging environment: @@ -61,9 +67,25 @@ There are two primary motivations for using a Staging environment: Provide developers with the ability to create, edit, and trigger ad hoc jobs in the Staging environment, while keeping the Production environment locked down. ::: -Let's say you have `Project B` downstream of `Project A` with cross-project refs configured in the models. When developers work in the IDE for `Project B`, cross-project refs will resolve to the Staging environment of `Project A`, rather than production. You'll get the same results with those refs when jobs are run in the Staging environment. Only the Production environment will reference the Production data, keeping the data and access isolated without needing separate projects. +**Conditional configuration of sources** enables you to point to "prod" or "non-prod" source data, depending on the environment you're running in. For example, this source will point to `.sensitive_source.table_with_pii`, where `` is dynamically resolved based on an environment variable. + + + +```yaml +sources: + - name: sensitive_source + database: "{{ env_var('SENSITIVE_SOURCE_DATABASE') }}" + tables: + - name: table_with_pii +``` + + + +There is exactly one source (`sensitive_source`), and all downstream dbt models select from it as `{{ source('sensitive_source', 'table_with_pii') }}`. The code in your project and the shape of the DAG remain consistent across environments. By setting it up in this way, rather than duplicating sources, you get some important benefits. + +**Cross-project references in dbt Mesh:** Let's say you have `Project B` downstream of `Project A` with cross-project refs configured in the models. When developers work in the IDE for `Project B`, cross-project refs will resolve to the Staging environment of `Project A`, rather than production. You'll get the same results with those refs when jobs are run in the Staging environment. Only the Production environment will reference the Production data, keeping the data and access isolated without needing separate projects. -If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. +**Faster development enabled by deferral:** If `Project B` also has a Staging deployment, then references to unbuilt upstream models within `Project B` will resolve to that environment, using [deferral](/docs/cloud/about-cloud-develop-defer), rather than resolving to the models in Production. This saves developers time and warehouse spend, while preserving clear separation of environments. Finally, the Staging environment has its own view in [dbt Explorer](/docs/collaborate/explore-projects), giving you a full view of your prod and pre-prod data.