diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
index 309872dd818..0534dd916cb 100644
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -1,6 +1,6 @@ ## What are you changing in this pull request and why?
@@ -16,11 +16,8 @@ Uncomment if you're publishing docs for a prerelease version of dbt (delete if n
- [ ] For [docs versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#about-versioning), review how to [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content).
- [ ] Add a checklist item for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
-Adding new pages (delete if not applicable):
-- [ ] Add page to `website/sidebars.js`
-- [ ] Provide a unique filename for the new page
-
-Removing or renaming existing pages (delete if not applicable):
-- [ ] Remove page from `website/sidebars.js`
-- [ ] Add an entry `website/static/_redirects`
-- [ ] Run link testing locally with `npm run build` to update the links that point to the deleted page
+Adding or removing pages (delete if not applicable):
+- [ ] Add/remove page in `website/sidebars.js`
+- [ ] Provide a unique filename for new pages
+- [ ] Add an entry for deleted pages in `website/static/_redirects`
+- [ ] Run link testing locally with `npm run build` to update the links that point to deleted pages
diff --git a/contributing/content-style-guide.md b/contributing/content-style-guide.md
index 4ebbf83bf5f..58f5ba2b21c 100644
--- a/contributing/content-style-guide.md
+++ b/contributing/content-style-guide.md
@@ -519,6 +519,7 @@ enter (in the command line) | type (in the command line)
email | e-mail
on dbt | on a remote server
person, human | client, customer
+plan(s), account | organization, customer
press (a key) | hit, tap
recommended limit | soft limit
sign in | log in, login
@@ -529,6 +530,15 @@ dbt Cloud CLI | CLI, dbt CLI
dbt Core | CLI, dbt CLI
+Write in the second person so that we're talking directly to our readers and keeping them close to the content and documentation.
+
+For example, to explain that a feature is available on a particular dbt Cloud plan, you can use:
+- “XYZ is available on Enterprise plans”
+- “If you're on an Enterprise plan, you can access XYZ”
+- “Enterprise plans can access XYZ”
+
+Phrasing like this signals to readers that they should check their plan or account status to confirm whether a feature applies to them.
+
## Links
Links embedded in the documentation are about trust. Users trust that we will lead them to sites or pages related to their reading content. In order to maintain that trust, it's important that links are transparent, up-to-date, and lead to legitimate resources.
diff --git a/contributing/developer-blog.md b/contributing/developer-blog.md
index aa9d5b33131..0d9b3becba2 100644
--- a/contributing/developer-blog.md
+++ b/contributing/developer-blog.md
@@ -6,7 +6,7 @@
The dbt Developer Blog is a place where analytics practitioners can go to share their knowledge with the community. Analytics Engineering is a discipline we’re all building together. The developer blog exists to cultivate the collective knowledge that exists on how to build and scale effective data teams.
-We currently have editorial capacity for 10 Community contributed developer blogs per quarter - if we are oversubscribed we suggest you post on another platform or hold off until the editorial team is ready to take on more posts. +We currently have editorial capacity for a few Community contributed developer blogs per quarter - if we are oversubscribed we suggest you post on another platform or hold off until the editorial team is ready to take on more posts. ### What makes a good developer blog post? diff --git a/website/blog/2023-04-18-building-a-kimball-dimensional-model-with-dbt.md b/website/blog/2023-04-18-building-a-kimball-dimensional-model-with-dbt.md index ab364749eff..0aac3d77d53 100644 --- a/website/blog/2023-04-18-building-a-kimball-dimensional-model-with-dbt.md +++ b/website/blog/2023-04-18-building-a-kimball-dimensional-model-with-dbt.md @@ -185,7 +185,7 @@ Now that you’ve set up the dbt project, database, and have taken a peek at the Identifying the business process is done in collaboration with the business user. The business user has context around the business objectives and business processes, and can provide you with that information. - + Upon speaking with the CEO of AdventureWorks, you learn the following information: diff --git a/website/blog/2023-08-01-announcing-materialized-views.md b/website/blog/2023-08-01-announcing-materialized-views.md index eb9716e73a5..3b801c7c719 100644 --- a/website/blog/2023-08-01-announcing-materialized-views.md +++ b/website/blog/2023-08-01-announcing-materialized-views.md @@ -21,7 +21,7 @@ and updates on how to test MVs. The year was 2020. I was a kitten-only household, and dbt Labs was still Fishtown Analytics. A enterprise customer I was working with, Jetblue, asked me for help running their dbt models every 2 minutes to meet a 5 minute SLA. -After getting over the initial terror, we talked through the use case and soon realized there was a better option. Together with my team, I created [lambda views](https://discourse.getdbt.com/t/how-to-create-near-real-time-models-with-just-dbt-sql/1457%20?) to meet the need. +After getting over the initial terror, we talked through the use case and soon realized there was a better option. Together with my team, I created [lambda views](https://discourse.getdbt.com/t/how-to-create-near-real-time-models-with-just-dbt-sql/1457) to meet the need. Flash forward to 2023. I’m writing this as my giant dog snores next to me (don’t worry the cats have multiplied as well). Jetblue has outgrown lambda views due to performance constraints (a view can only be so performant) and we are at another milestone in dbt’s journey to support streaming. What. a. time. @@ -32,8 +32,8 @@ Today we are announcing that we now support Materialized Views in dbt. 
So, what Materialized views are now an out of the box materialization in your dbt project once you upgrade to the latest version of dbt v1.6 on these following adapters: - [dbt-postgres](/reference/resource-configs/postgres-configs#materialized-views) -- [dbt-redshift](reference/resource-configs/redshift-configs#materialized-views) -- [dbt-snowflake](reference/resource-configs/snowflake-configs#dynamic-tables) +- [dbt-redshift](/reference/resource-configs/redshift-configs#materialized-views) +- [dbt-snowflake](/reference/resource-configs/snowflake-configs#dynamic-tables) - [dbt-databricks](/reference/resource-configs/databricks-configs#materialized-views-and-streaming-tables) - [dbt-materialize*](/reference/resource-configs/materialize-configs#incremental-models-materialized-views) - [dbt-trino*](/reference/resource-configs/trino-configs#materialized-view) @@ -227,4 +227,4 @@ Depending on how you orchestrate your materialized views, you can either run the ## Conclusion -Well, I’m excited for everyone to remove the lines in your packages.yml that installed your experimental package (at least if you’re using it for MVs) and start to get your hands dirty. We are still new in our journey and I look forward to hearing all the things you are creating and how we can better our best practices in this. \ No newline at end of file +Well, I’m excited for everyone to remove the lines in your packages.yml that installed your experimental package (at least if you’re using it for MVs) and start to get your hands dirty. We are still new in our journey and I look forward to hearing all the things you are creating and how we can better our best practices in this. diff --git a/website/blog/2023-12-15-serverless-free-tier-data-stack-with-dlt-and-dbt-core.md b/website/blog/2023-12-15-serverless-free-tier-data-stack-with-dlt-and-dbt-core.md new file mode 100644 index 00000000000..d2c6652d883 --- /dev/null +++ b/website/blog/2023-12-15-serverless-free-tier-data-stack-with-dlt-and-dbt-core.md @@ -0,0 +1,160 @@ +--- +title: Serverless, free-tier data stack with dlt + dbt core. +description: "In this article, Euan shares his personal project to fetch property price data during his and his partner's house-hunting process, and how he created a serverless free-tier data stack by using Google Cloud Functions to run data ingestion tool dlt alongside dbt for transformation." +slug: serverless-dlt-dbt-stack + +authors: [euan_johnston] + +hide_table_of_contents: false + +date: 2024-01-15 +is_featured: false +--- + + + +## The problem, the builder and tooling + +**The problem**: My partner and I are considering buying a property in Portugal. There is no reference data for the real estate market here - how many houses are being sold, for what price? Nobody knows except the property office and maybe the banks, and they don’t readily divulge this information. The only data source we have is Idealista, which is a portal where real estate agencies post ads. + +Unfortunately, there are significantly fewer properties than ads - it seems many real estate companies re-post the same ad that others do, with intentionally different data and often misleading bits of info. The real estate agencies do this so the interested parties reach out to them for clarification, and from there they can start a sales process. At the same time, the website with the ads is incentivised to allow this to continue as they get paid per ad, not per property. 
+ +**The builder:** I’m a data freelancer who deploys end to end solutions, so when I have a data problem, I cannot just let it go. + +**The tools:** I want to be able to run my project on [Google Cloud Functions](https://cloud.google.com/functions) due to the generous free tier. [dlt](https://dlthub.com/) is a new Python library for declarative data ingestion which I have wanted to test for some time. Finally, I will use dbt Core for transformation. + +## The starting point + +If I want to have reliable information on the state of the market I will need to: + +- Grab the messy data from Idealista and historize it. +- Deduplicate existing listings. +- Try to infer what listings sold for how much. + +Once I have deduplicated listings with some online history, I can get an idea: + +- How expensive which properties are. +- How fast they get sold, hopefully a signal of whether they are “worth it” or not. + +## Towards a solution + +The solution has pretty standard components: + +- An EtL pipeline. The little t stands for normalisation, such as transforming strings to dates or unpacking nested structures. This is handled by dlt functions written in Python. +- A transformation layer taking the source data loaded by my dlt functions and creating the tables necessary, handled by dbt. +- Due to the complexity of deduplication, I needed to add a human element to confirm the deduplication in Google Sheets. + +These elements are reflected in the diagram below and further clarified in greater detail later in the article: + + + +### Ingesting the data + +For ingestion, I use a couple of sources: + +First, I ingest home listings from the Idealista API, accessed through [API Dojo's freemium wrapper](https://rapidapi.com/apidojo/api/idealista2). The dlt pipeline I created for ingestion is in [this repo](https://github.com/euanjohnston-dev/Idealista_pipeline). + +After an initial round of transformation (described in the next section), the deduplicated data is loaded into BigQuery where I can query it from the Google Sheets client and manually review the deduplication. + +When I'm happy with the results, I use the [ready-made dlt Sheets source connector](https://dlthub.com/docs/dlt-ecosystem/verified-sources/google_sheets) to pull the data back into BigQuery, [as defined here](https://github.com/euanjohnston-dev/gsheets_check_pipeline). + +### Transforming the data + +For transforming I use my favorite solution, dbt Core. For running and orchestrating dbt on Cloud Functions, I am using dlt’s dbt Core runner. The benefit of the runner in this context is that I can re-use the same credential setup, instead of creating a separate profiles.yml file. + +This is the package I created: + +### Production-readying the pipeline + +To make the pipeline more “production ready”, I made some improvements: + +- Using a credential store instead of hard-coding passwords, in this case Google Secret Manager. +- Be notified when the pipeline runs and what the outcome is. For this I sent data to Slack via a dlt decorator that posts the error on failure and the metadata on success. + +```python +from dlt.common.runtime.slack import send_slack_message + +def notify_on_completion(hook): + def decorator(func): + def wrapper(*args, **kwargs): + try: + load_info = func(*args, **kwargs) + message = f"Function {func.__name__} completed successfully. Load info: {load_info}" + send_slack_message(hook, message) + return load_info + except Exception as e: + message = f"Function {func.__name__} failed. 
Error: {str(e)}"
+                send_slack_message(hook, message)
+                raise
+        return wrapper
+    return decorator
+```
+
+## The outcome
+
+The outcome was first and foremost a visualisation highlighting the unique properties available in my specific area of search. The map shown on the left of the page gives a live overview of location, number of duplicates (bubble size), and price (bubble colour), which can, among other features, be filtered using the sliders on the right. This represents a much better, decluttered view from which to observe the actual inventory available.
+
+
+Further charts highlight additional metrics which – now that deduplication is complete – can be accurately measured, most importantly the development over time of “average price/square metre” and the properties inferred to have been sold.
+
+### Next steps
+
+This version was very much about getting a base from which to analyze the properties for my own personal use case.
+
+In terms of further development, I have had interest from people in running the solution on their own specific target areas.
+
+For this to work at scale I would need a more robust method to deal with duplicate attribution, which is a difficult problem as real estate agencies intentionally change details like the number of rooms or surface area.
+
+Perhaps this is a problem ML or GPT could solve as well as a human, given the limited options available.
+
+## Learnings and conclusion
+
+The data problem itself was an eye opener into the real-estate market. It’s a messy market full of unknowns and noise, which adds a significant purchase risk to first-time buyers.
+
+Tooling-wise, it was surprising how quick it was to set everything up. dlt integrates well with dbt and enables fast and simple data ingestion, making this project simpler than I thought it would be.
+
+### dlt
+
+Good:
+
+- As a big fan of dbt I love how seamlessly the two solutions complement one another. dlt handles the data cleaning and normalisation automatically so I can focus on curating and modelling it in dbt. While the automatic unpacking leaves some small adjustments for the analytics engineer, it’s much better than cleaning and typing JSON in the database or in custom Python code.
+- When creating my first dummy pipeline I used DuckDB. It felt like a great introduction to how simple it is to get started and provided a solid starting block before developing something for the cloud.
+
+Bad:
+
+- I did have a small hiccup with the Google Sheets connector, which assumed OAuth authentication rather than the SDK credentials I wanted, but this was relatively easy to rectify by explicitly stating `GcpServiceAccountCredentials` in the init.py file for the source.
+- Using both a verified source in the gsheets connector and building my own from Rapid API endpoints seemed equally intuitive. However, I would have wanted more documentation on how to run these two pipelines in the same script alongside the dbt pipeline.
+
+### dbt
+
+No surprises there. I developed the project locally, and to deploy to cloud functions I injected credentials to dbt via the dlt runner. This meant I could re-use the setup I did for the other dlt pipelines.
+
+```python
+import dlt
+
+def dbt_run():
+    # make an authenticated connection with dlt to the dwh
+    pipeline = dlt.pipeline(
+        pipeline_name='dbt_pipeline',
+        destination='bigquery', # credentials read from env
+        dataset_name='dbt'
+    )
+    # make a venv in case we have lib conflicts between dlt and current env
+    venv = dlt.dbt.get_venv(pipeline)
+    # package the pipeline, dbt package and env
+    dbt = dlt.dbt.package(pipeline, "dbt/property_analytics", venv=venv)
+    # and run it
+    models = dbt.run_all()
+    # show outcome
+    for m in models:
+        print(f"Model {m.model_name} materialized in {m.time} with status {m.status} and message {m.message}")
+```
+
+### Cloud functions
+
+While I had used Cloud Functions before, I had never previously set them up for dbt, and I was able to easily follow dlt’s docs to run the pipelines there. Cloud Functions are a great solution for cheaply running small-scale pipelines, and my running cost for the project is a few cents a month. If the insights drawn from the project help us save even 1% of a house price, the project will have been a success.
+
+### To sum up
+
+dlt feels like the perfect solution for anyone who has scratched the surface of Python development. To be able to have schemas ready for transformation in such a short space of time is truly… transformational. As a freelancer, being able to accelerate the development of pipelines is a huge benefit within companies who are often frustrated with the amount of time it takes to start ‘showing value’.
+
+I’d welcome the chance to discuss what’s been built to date or collaborate on any potential further development in the comments below.
diff --git a/website/blog/2023-12-20-partner-integration-guide.md b/website/blog/2023-12-20-partner-integration-guide.md
index b546f258f6c..432ed97635b 100644
--- a/website/blog/2023-12-20-partner-integration-guide.md
+++ b/website/blog/2023-12-20-partner-integration-guide.md
@@ -20,7 +20,7 @@ This guide doesn't include how to integrate with dbt Core. If you’re intereste
Instead, we're going to focus on integrating with dbt Cloud. Integrating with dbt Cloud is a key requirement to become a dbt Labs technology partner, opening the door to a variety of collaborative commercial opportunities. Here I'll cover how to get started, potential use cases you want to solve for, and points of integrations to do so.
-
+
## New to dbt Cloud?
If you're new to dbt and dbt Cloud, we recommend you and your software developers try our [Getting Started Quickstarts](https://docs.getdbt.com/guides) after reading [What is dbt](https://docs.getdbt.com/docs/introduction). The documentation will help you familiarize yourself with how our users interact with dbt. By going through this, you will also create a sample dbt project to test your integration.
diff --git a/website/blog/2024-01-09-defer-in-development.md b/website/blog/2024-01-09-defer-in-development.md
new file mode 100644
index 00000000000..96e2ed53f85
--- /dev/null
+++ b/website/blog/2024-01-09-defer-in-development.md
@@ -0,0 +1,160 @@
+---
+title: "More time coding, less time waiting: Mastering defer in dbt"
+description: "Learn how to take advantage of the defer to prod feature in dbt Cloud"
+slug: defer-to-prod
+
+authors: [dave_connors]
+
+tags: [analytics craft]
+hide_table_of_contents: false
+
+date: 2024-01-09
+is_featured: true
+---
+
+Picture this — you’ve got a massive dbt project, thousands of models chugging along, creating actionable insights for your stakeholders. A ticket comes your way — a model needs to be refactored! "No problem," you think to yourself, "I will simply make that change and test it locally!" You look at your lineage, and realize this model is many layers deep, buried underneath a long chain of tables and views.
"No problem," you think to yourself, "I will simply make that change and test it locally!" You look at your lineage, and realize this model is many layers deep, buried underneath a long chain of tables and views. + +“OK,” you think further, “I’ll just run a `dbt build -s +my_changed_model` to make sure I have everything I need built into my dev schema and I can test my changes”. You run the command. You wait. You wait some more. You get some coffee, and completely take yourself out of your dbt development flow state. A lot of time and money down the drain to get to a point where you can *start* your work. That’s no good! + +Luckily, dbt’s defer functionality allow you to *only* build what you care about when you need it, and nothing more. This feature helps developers spend less time and money in development, helping ship trusted data products faster. dbt Cloud offers native support for this workflow in development, so you can start deferring without any additional overhead! + +## Defer to prod or prefer to slog + +A lot of dbt’s magic relies on the elegance and simplicity of the `{{ ref() }}` function, which is how you can build your lineage graph, and how dbt can be run in different environments — the `{{ ref() }}` functions dynamically compile depending on your environment settings, so that you can run your project in development and production without changing any code. + +Here's how a simple `{{ ref() }}` would compile in different environments: + + + + + + ```sql + -- in models/my_model.sql + select * from {{ ref('model_a') }} + ``` + + + + + ```sql + -- in target/compiled/models/my_model.sql + select * from analytics.dbt_dconnors.model_a + ``` + + + + + ```sql + -- in target/compiled/models/my_model.sql + select * from analytics.analytics.model_a + ``` + + + + +All of that is made possible by the dbt `manifest.json`, [the artifact](https://docs.getdbt.com/reference/artifacts/manifest-json) that is produced each time you run a dbt command, containing the comprehensive and encyclopedic compendium of all things in your project. Each node is assigned a `unique_id` (like `model.my_project.my_model` ) and the manifest stores all the metadata about that model in a dictionary associated to that id. This includes the data warehouse location that gets returned when you write `{{ ref('my_model') }}` in SQL. Different runs of your project in different environments result in different metadata written to the manifest. + +Let’s think back to the hypothetical above — what if we made use of the production metadata to read in data from production, so that I don’t have to rebuild *everything* upstream of the model I’m changing? That’s exactly what `defer` does! When you supply dbt with a production version of the `manifest.json` artifact, and pass the `--defer` flag to your dbt command, dbt will resolve the `{{ ref() }}` functions for any resource upstream of your selected models with the *production metadata* — no need to rebuild anything you don’t have to! + +Let’s take a look at a simplified example — let’s say your project looks like this in production: + + + +And you’re tasked with making changes to `model_f`. 
Without defer, you would need to make sure to at minimum execute a `dbt run -s +model_f` to ensure all the upstream dependencies of `model_f` are present in your development schema so that you can start to run `model_f`.* You just spent a whole bunch of time and money duplicating your models, and now your warehouse looks like this: + + + +With defer, we should not build anything other than the models that have changed, and are now different from their production counterparts! Let’s tell dbt to use production metadata to resolve our refs, and only build the model I have changed — that command would be `dbt run -s model_f --defer` .** + + + +This results in a *much slimmer build* — we read data in directly from the production version of `model_b` and `model_c`, and don’t have to worry about building anything other than what we selected! + +\* [Another option](https://docs.getdbt.com/reference/commands/clone) is to run `dbt clone -s +model_f` , which will make clones of your production models into your development schema, making use of zero copy cloning where available. Check out this [great dev blog](https://docs.getdbt.com/blog/to-defer-or-to-clone) from Doug and Kshitij on when to use `clone` vs `defer`! + +** in dbt Core, you also have to tell dbt where to find the production artifacts! Otherwise it doesn’t know what to defer to. You can either use the `--state path/to/artifact/folder` option, or set a `DBT_STATE` environment variable. + +### Batteries included deferral in dbt Cloud + +dbt Cloud offers a seamless deferral experience in both the dbt Cloud IDE and the dbt Cloud CLI — dbt Cloud ***always*** has the latest run artifacts from your production environment. Rather than having to go through the painful process of somehow getting a copy of your latest production `manifest.json` into your local filesystem to defer to, and building a pipeline to always keep it fresh, dbt Cloud does all that work for you. When developing in dbt Cloud, the latest artifact is automatically provided to you under the hood, and dbt Cloud handles the `--defer` flag for you when you run commands in “defer mode”. dbt Cloud will use the artifacts from the deployment environment in your project marked as `Production` in the [environments settings](https://docs.getdbt.com/docs/deploy/deploy-environments#set-as-production-environment) in both the IDE and the Cloud CLI. Be sure to configure a production environment to unlock this feature! + +In the dbt Cloud IDE, there’s as simple toggle switch labeled `Defer to production`. Simply enabling this toggle will defer your command to the production environment when you run any dbt command in the IDE! + + + +The cloud CLI has this setting *on by default* — there’s nothing else you need to do to set this up! If you prefer not to defer, you can pass the `--no-defer` flag to override this behavior. You can also set an environment other than your production environment as the deferred to environment in your `dbt-cloud` settings in your `dbt_project.yml` : + +```yaml +dbt-cloud: + project-id: + defer-env-id: +``` + +When you’re developing with dbt Cloud, you can defer right away, and completely avoid unnecessary model builds in development! + +### Other things to to know about defer + +**Favoring state** + +One of the major gotchas in the defer workflow is that when you’re in defer mode, dbt assumes that all the objects in your development schema are part of your current work stream, and will prioritize those objects over the production objects when possible. 
+ +Let’s take a look at that example above again, and pretend that some time before we went to make this edit, we did some work on `model_c`, and we have a local copy of `model_c` hanging out in our development schema: + + + +When you run `dbt run -s model_f --defer` , dbt will detect the development copy of `model_c` and say “Hey, y’know, I bet Dave is working on that model too, and he probably wants to make sure his changes to `model_c` work together with his changes to `model_f` . Because I am a kind and benevolent data transformation tool, i’ll make sure his `{{ ref('model_c') }]` function compiles to his development changes!” Thanks dbt! + +As a result, we’ll effectively see this behavior when we run our command: + + + +Where our code would compile from + +```sql +# in models/model_f.sql +with + +model_b as ( + select * from {{ ref('model_b') }} +), + +model_c as ( + select * from {{ ref('model_c') }} +), + +... +``` + +to + +```sql +# in target/compiled/models/model_f.sql +with + +model_b as ( + select * from analytics.analytics.model_b +), + +model_c as ( + select * from analytics.dbt_dconnors.model_b +), + +... +``` + +A mix of prod and dev models may not be what we want! To avoid this, we have a couple options: + +1. **Start fresh every time:** The simplest way to avoid this issue is to make sure you are always drop your development schema at the start of a new development session. That way, the only things that show up in your development schema are the things you intentionally selected with your commands! +2. **Favor state:** Passing the `--favor-state` flag to your command tells dbt “Hey benevolent tool, go ahead and use what you find in the production manifest no matter what you find in my development schema” so that both `{{ ref() }}` functions in the example above point to the production schema, even if `model_c` was hanging around in there. + +In this example, `model_c` is a relic of a previous development cycle, but I should be clear here that defaulting to using dev relations is *usually the right course of action* — generally, a dbt PR spans a few models, and you want to coordinate your changes across those models together. This behavior can just get a bit confusing if you’re encountering it for the first time! + +**When should I *not* defer to prod** + +While defer is a faster and cheaper option for most folks in most situations, defer to prod does not support all projects. The most common reason you should not use defer is regulatory — defer to prod makes the assumption that data is shared between your production and development environments, so reading between these environments is not an issue. For some organizations, like healthcare companies, have restrictions around the data access and sharing that precludes the basic defer structure presented here. + +### Call me Willem Defer + + + +Defer to prod is a powerful way to improve your development velocity with dbt, and dbt Cloud makes it easier than ever to make use of this feature! You too could look this cool while you’re saving time and money developing on your dbt projects! 
diff --git a/website/blog/authors.yml b/website/blog/authors.yml index 82cc300bdc8..4aa33773988 100644 --- a/website/blog/authors.yml +++ b/website/blog/authors.yml @@ -1,6 +1,6 @@ amy_chen: image_url: /img/blog/authors/achen.png - job_title: Product Partnerships Manager + job_title: Product Ecosystem Manager links: - icon: fa-linkedin url: https://www.linkedin.com/in/yuanamychen/ @@ -187,6 +187,16 @@ emily_riederer: - icon: fa-readme url: https://emilyriederer.com +euan_johnston: + image_url: /img/blog/authors/ejohnston.png + job_title: Freelance Business Intelligence manager + name: Euan Johnston + links: + - icon: fa-linkedin + url: https://www.linkedin.com/in/euan-johnston-610a05a8/ + - icon: fa-github + url: https://github.com/euanjohnston-dev + grace_goheen: image_url: /img/blog/authors/grace-goheen.jpeg job_title: Analytics Engineer diff --git a/website/docs/best-practices/best-practice-workflows.md b/website/docs/best-practices/best-practice-workflows.md index 9b79c244901..4381906361e 100644 --- a/website/docs/best-practices/best-practice-workflows.md +++ b/website/docs/best-practices/best-practice-workflows.md @@ -39,7 +39,7 @@ Your dbt project will depend on raw data stored in your database. Since this dat :::info Using sources for raw data references -As of v0.13.0, we recommend defining your raw data as [sources](/docs/build/sources), and selecting from the source rather than using the direct relation reference. Our dbt projects no longer contain any direct relation references in any models. +We recommend defining your raw data as [sources](/docs/build/sources), and selecting from the source rather than using the direct relation reference. Our dbt projects don't contain any direct relation references in any models. ::: diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md index ee3d4262882..e50542a446c 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md @@ -2,6 +2,8 @@ title: "Intro to MetricFlow" description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow +pagination_next: "best-practices/how-we-build-our-metrics/semantic-layer-2-setup" +pagination_prev: null --- Flying cars, hoverboards, and true self-service analytics: this is the future we were promised. The first two might still be a few years out, but real self-service analytics is here today. With dbt Cloud's Semantic Layer, you can resolve the tension between accuracy and flexibility that has hampered analytics tools for years, empowering everybody in your organization to explore a shared reality of metrics. Best of all for analytics engineers, building with these new tools will significantly [DRY](https://docs.getdbt.com/terms/dry) up and simplify your codebase. As you'll see, the deep interaction between your dbt models and the Semantic Layer make your dbt project the ideal place to craft your metrics. 
diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-2-setup.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-2-setup.md index 6e9153a3780..470445891dc 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-2-setup.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-2-setup.md @@ -2,6 +2,7 @@ title: "Set up MetricFlow" description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow +pagination_next: "best-practices/how-we-build-our-metrics/semantic-layer-3-build-semantic-models" --- ## Getting started @@ -13,9 +14,23 @@ git clone git@github.com:dbt-labs/jaffle-sl-template.git cd path/to/project ``` -Next, before you start writing code, you need to install MetricFlow as an extension of a dbt adapter from PyPI (dbt Core users only). The MetricFlow is compatible with Python versions 3.8 through 3.11. +Next, before you start writing code, you need to install MetricFlow: -We'll use pip to install MetricFlow and our dbt adapter: + + + + +- [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) — MetricFlow commands are embedded in the dbt Cloud CLI. You can immediately run them once you install the dbt Cloud CLI. Using dbt Cloud means you won't need to manage versioning — your dbt Cloud account will automatically manage the versioning. + +- [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) — You can create metrics using MetricFlow in the dbt Cloud IDE. However, support for running MetricFlow commands in the IDE will be available soon. + + + + + +- Download MetricFlow as an extension of a dbt adapter from PyPI (dbt Core users only). The MetricFlow is compatible with Python versions 3.8 through 3.11. + - **Note**: You'll need to manage versioning between dbt Core, your adapter, and MetricFlow. +- We'll use pip to install MetricFlow and our dbt adapter: ```shell # activate a virtual environment for your project, @@ -27,13 +42,16 @@ python -m pip install "dbt-metricflow[adapter name]" # e.g. python -m pip install "dbt-metricflow[snowflake]" ``` -Lastly, to get to the pre-Semantic Layer starting state, checkout the `start-here` branch. + + + +- Now that you're ready to use MetricFlow, get to the pre-Semantic Layer starting state by checking out the `start-here` branch: ```shell git checkout start-here ``` -For more information, refer to the [MetricFlow commands](/docs/build/metricflow-commands) or a [quickstart](/guides) to get more familiar with setting up a dbt project. +For more information, refer to the [MetricFlow commands](/docs/build/metricflow-commands) or the [quickstart guides](/guides) to get more familiar with setting up a dbt project. 
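If you took the dbt Core path above, a quick sanity check before moving on is to call the MetricFlow CLI's built-in help. This is a minimal sketch, assuming the pip install above completed in your active virtual environment:

```shell
# the dbt-metricflow package provides the `mf` entrypoint
mf --help
```

dbt Cloud CLI and IDE users can skip this step, since MetricFlow is already embedded in dbt Cloud.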
## Basic commands diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-3-build-semantic-models.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-3-build-semantic-models.md index a2dc55e37ae..9c710b286ef 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-3-build-semantic-models.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-3-build-semantic-models.md @@ -2,6 +2,7 @@ title: "Building semantic models" description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow +pagination_next: "best-practices/how-we-build-our-metrics/semantic-layer-4-build-metrics" --- ## How to build a semantic model diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-4-build-metrics.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-4-build-metrics.md index da83adbdc69..003eff9de40 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-4-build-metrics.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-4-build-metrics.md @@ -2,6 +2,7 @@ title: "Building metrics" description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow +pagination_next: "best-practices/how-we-build-our-metrics/semantic-layer-5-refactor-a-mart" --- ## How to build metrics diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-5-refactor-a-mart.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-5-refactor-a-mart.md index dfdba2941e9..9ae80cbcd29 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-5-refactor-a-mart.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-5-refactor-a-mart.md @@ -2,6 +2,7 @@ title: "Refactor an existing mart" description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow +pagination_next: "best-practices/how-we-build-our-metrics/semantic-layer-6-advanced-metrics" --- ## A new approach diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-6-advanced-metrics.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-6-advanced-metrics.md index fe7438b5800..e5c6e452dac 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-6-advanced-metrics.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-6-advanced-metrics.md @@ -2,6 +2,7 @@ title: "More advanced metrics" description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow +pagination_next: "best-practices/how-we-build-our-metrics/semantic-layer-7-conclusion" --- ## More advanced metric types diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-conclusion.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-conclusion.md index a1062721177..1870b6b77e4 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-conclusion.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-7-conclusion.md @@ -2,6 +2,7 @@ title: "Best practices" description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow +pagination_next: null --- ## Putting it all together diff --git 
a/website/docs/best-practices/how-we-mesh/mesh-1-intro.md b/website/docs/best-practices/how-we-mesh/mesh-1-intro.md index ba1660a8d82..819a9e04111 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-1-intro.md +++ b/website/docs/best-practices/how-we-mesh/mesh-1-intro.md @@ -6,19 +6,21 @@ hoverSnippet: Learn how to get started with dbt Mesh ## What is dbt Mesh? -Organizations of all sizes rely upon dbt to manage their data transformations, from small startups to large enterprises. At scale, it can be challenging to coordinate all the organizational and technical requirements demanded by your stakeholders within the scope of a single dbt project. To date, there also hasn't been a first-class way to effectively manage the dependencies, governance, and workflows between multiple dbt projects. +Organizations of all sizes rely upon dbt to manage their data transformations, from small startups to large enterprises. At scale, it can be challenging to coordinate all the organizational and technical requirements demanded by your stakeholders within the scope of a single dbt project. -Regardless of your organization's size and complexity, dbt should empower data teams to work independently and collaboratively; sharing data, code, and best practices without sacrificing security or autonomy. dbt Mesh provides the tooling for teams to finally achieve this. +To date, there also hasn't been a first-class way to effectively manage the dependencies, governance, and workflows between multiple dbt projects. -dbt Mesh is not a single product: it is a pattern enabled by a convergence of several features in dbt: +That's where **dbt Mesh** comes in - empowering data teams to work *independently and collaboratively*; sharing data, code, and best practices without sacrificing security or autonomy. -- **[Cross-project references](/docs/collaborate/govern/project-dependencies#how-to-use-ref)** - this is the foundational feature that enables the multi-project deployments. `{{ ref() }}`s now work across dbt Cloud projects on Enterprise plans. +This guide will walk you through the concepts and implementation details needed to get started. dbt Mesh is not a single product - it is a pattern enabled by a convergence of several features in dbt: + +- **[Cross-project references](/docs/collaborate/govern/project-dependencies#how-to-write-cross-project-ref)** - this is the foundational feature that enables the multi-project deployments. `{{ ref() }}`s now work across dbt Cloud projects on Enterprise plans. - **[dbt Explorer](/docs/collaborate/explore-projects)** - dbt Cloud's metadata-powered documentation platform, complete with full, cross-project lineage. -- **Governance** - dbt's new governance features allow you to manage access to your dbt models both within and across projects. - - **[Groups](/docs/collaborate/govern/model-access#groups)** - groups allow you to assign models to subsets within a project. +- **Governance** - dbt's governance features allow you to manage access to your dbt models both within and across projects. + - **[Groups](/docs/collaborate/govern/model-access#groups)** - With groups, you can organize nodes in your dbt DAG that share a logical connection (for example, by functional area) and assign an owner to the entire group. - **[Access](/docs/collaborate/govern/model-access#access-modifiers)** - access configs allow you to control who can reference models. 
-- **[Model Versions](/docs/collaborate/govern/model-versions)** - when coordinating across projects and teams, we recommend treating your data models as stable APIs. Model versioning is the mechanism to allow graceful adoption and deprecation of models as they evolve. -- **[Model Contracts](/docs/collaborate/govern/model-contracts)** - data contracts set explicit expectations on the shape of the data to ensure data changes upstream of dbt or within a project's logic don't break downstream consumers' data products. + - **[Model Versions](/docs/collaborate/govern/model-versions)** - when coordinating across projects and teams, we recommend treating your data models as stable APIs. Model versioning is the mechanism to allow graceful adoption and deprecation of models as they evolve. + - **[Model Contracts](/docs/collaborate/govern/model-contracts)** - data contracts set explicit expectations on the shape of the data to ensure data changes upstream of dbt or within a project's logic don't break downstream consumers' data products. ## Who is dbt Mesh for? @@ -32,6 +34,8 @@ dbt Cloud is designed to coordinate the features above and simplify the complexi If you're just starting your dbt journey, don't worry about building a multi-project architecture right away. You can _incrementally_ adopt the features in this guide as you scale. The collection of features work effectively as independent tools. Familiarizing yourself with the tooling and features that make up a multi-project architecture, and how they can apply to your organization will help you make better decisions as you grow. +For additional information, refer to the [dbt Mesh FAQs](/best-practices/how-we-mesh/mesh-4-faqs). + ## Learning goals - Understand the **purpose and tradeoffs** of building a multi-project architecture. diff --git a/website/docs/best-practices/how-we-mesh/mesh-2-structures.md b/website/docs/best-practices/how-we-mesh/mesh-2-structures.md index 9ab633c50ad..345ef22c62d 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-2-structures.md +++ b/website/docs/best-practices/how-we-mesh/mesh-2-structures.md @@ -20,7 +20,7 @@ At a high level, you’ll need to decide: ### Cycle detection -Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops), which lead to issues with your data workflows. For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. +Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops), which lead to issues with your data workflows. For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-write-cross-project-ref) for more information. 
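To make cross-project references concrete, here's a minimal sketch of how a downstream project declares an upstream project and then references one of its public models. The project and model names (`jaffle_finance`, `monthly_revenue`) are placeholders for illustration:

```yaml
# dependencies.yml in the downstream project
projects:
  - name: jaffle_finance
```

```sql
-- any model in the downstream project
select * from {{ ref('jaffle_finance', 'monthly_revenue') }}
```

Only models marked `access: public` in the upstream project can be referenced this way, and because project dependencies are acyclic, the upstream project cannot in turn `ref` models from this downstream project.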
## Define your project interfaces by splitting your DAG diff --git a/website/docs/best-practices/how-we-mesh/mesh-3-implementation.md b/website/docs/best-practices/how-we-mesh/mesh-3-implementation.md index 65ed5d7935b..5934c1625a3 100644 --- a/website/docs/best-practices/how-we-mesh/mesh-3-implementation.md +++ b/website/docs/best-practices/how-we-mesh/mesh-3-implementation.md @@ -127,4 +127,4 @@ We've provided a set of example projects you can use to explore the topics cover ### dbt-meshify -We recommend using the `dbt-meshify` [command line tool]() to help you do this. This comes with CLI operations to automate most of the above steps. +We recommend using the `dbt-meshify` [command line tool]() to help you do this. This comes with CLI operations to automate most of the above steps. diff --git a/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md new file mode 100644 index 00000000000..2b11c3563eb --- /dev/null +++ b/website/docs/best-practices/how-we-mesh/mesh-4-faqs.md @@ -0,0 +1,317 @@ +--- +title: "dbt Mesh FAQs" +description: "Read the FAQs to learn more about dbt Mesh, how it works, compatibility, and more." +hoverSnippet: "dbt Mesh FAQs" +sidebar_label: "dbt Mesh FAQs" +--- + +dbt Mesh is a new architecture enabled by dbt Cloud. It allows you to better manage complexity by deploying multiple interconnected dbt projects instead of a single large, monolithic project. It’s designed to accelerate development, without compromising governance. + +## Overview of Mesh + + + +Here are some benefits of implementing dbt Mesh: + +* **Ship data products faster**: With a more modular architecture, teams can make changes rapidly and independently in specific areas without impacting the entire system, leading to faster development cycles. +* **Improve trust in data:** Adopting dbt Mesh helps ensure that changes in one domain's data models do not unexpectedly break dependencies in other domain areas, leading to a more secure and predictable data environment. +* **Reduce complexity**: By organizing transformation logic into distinct domains, dbt Mesh reduces the complexity inherent in large, monolithic projects, making them easier to manage and understand. +* **Improve collaboration**: Teams are able to share and build upon each other's work without duplicating efforts. + +Most importantly, all this can be accomplished without the central data team losing the ability to see lineage across the entire organization, or compromising on governance mechanisms. + + + + + +dbt [model contracts](/docs/collaborate/govern/model-contracts) serve as a governance tool enabling the definition and enforcement of data structure standards in your dbt models. They allow you to specify and uphold data model guarantees, including column data types, allowing for the stability of dependent models. Should a model fail to adhere to its established contracts, it will not build successfully. + + + + + +dbt [model versions](https://docs.getdbt.com/docs/collaborate/govern/model-versions) are iterations of your dbt models made over time. In many cases, you might knowingly choose to change a model’s structure in a way that “breaks” the previous model contract, and may break downstream queries depending on that model’s structure. When you do so, creating a new version of the model is useful to signify this change. + +You can use model versions to: + +- Test "prerelease" changes (in production, in downstream systems). 
+- Bump the latest version, to be used as the canonical "source of truth." +- Offer a migration window off the "old" version. + + + + + +A [model access modifier](/docs/collaborate/govern/model-access) in dbt determines if a model is accessible as an input to other dbt models and projects. It specifies where a model can be referenced using [the `ref` function](/reference/dbt-jinja-functions/ref). There are three types of access modifiers: + +* **Private:** A model with a private access modifier is only referenceable by models within the same group. This is intended for models that are implementation details and are meant to be used only within a specific group of related models. +* **Protected:** Models with a protected access modifier can be referenced by any other model within the same dbt project or when the project is installed as a package. This is the default setting for all models, ensuring backward compatibility, especially when groups are assigned to an existing set of models. +* **Public:** A public model can be referenced across different groups, packages, or projects. This is suitable for stable and mature models that serve as interfaces for other teams or projects. + + + + + +A [model group](/docs/collaborate/govern/model-access#groups) in dbt is a concept used to organize models under a common category or ownership. This categorization can be based on various criteria, such as the team responsible for the models or the specific data source they model. + + + + + +This is a new way of working, and the intentionality required to build, and then maintain, cross-project interfaces and dependencies may feel like a slowdown versus what some developers are used to. The intentional friction introduced promotes thoughtful changes, fostering a mindset that values stability and systematic adjustments over rapid transformations. + +Orchestration across multiple projects is also likely to be slightly more challenging for many organizations, although we’re currently developing new functionality that will make this process simpler. + + + + + +dbt Mesh allows you to better _operationalize_ data mesh by enabling decentralized, domain-specific data ownership and collaboration. + +In data mesh, each business domain is responsible for its data as a product. This is the same goal that dbt Mesh facilitates by enabling organizations to break down large, monolithic data projects into smaller, domain-specific dbt projects. Each team or domain can independently develop, maintain, and share its data models, fostering a decentralized data environment. + +dbt Mesh also enhances the interoperability and reusability of data across different domains, a key aspect of the data mesh philosophy. By allowing cross-project references and shared governance through model contracts and access controls, dbt Mesh ensures that while data ownership is decentralized, there is still a governed structure to the overall data architecture. + + + +## How dbt Mesh works + + + +Like resource dependencies, project dependencies are acyclic, meaning they only move in one direction. This prevents `ref` cycles (or loops). For example, if project B depends on project A, a new model in project A could not import and use a public model from project B. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies#how-to-use-ref) for more information. 
+ + + + + +While it’s not currently possible to share sources across projects, it would be possible to have a shared foundational project, with staging models on top of those sources, exposed as “public” models to other teams/projects. + + + + + +This would be a breaking change for downstream consumers of that model. If the maintainers of the upstream project wish to remove the model (or “downgrade” its access modifier, effectively the same thing), they should mark that model for deprecation (using [deprecation_date](/reference/resource-properties/deprecation_date)), which will deliver a warning to all downstream consumers of that model. + +In the future, we plan for dbt Cloud to also be able to proactively flag this scenario in [continuous integration](/docs/deploy/continuous-integration) for the maintainers of the upstream public model. + + + + + +No, unless downstream projects are installed as [packages](/docs/build/packages) (source code). In that case, the models in project installed as a project become “your” models, and you can select or run them. There are cases in which this can be desirable; see docs on [project dependencies](/docs/collaborate/govern/project-dependencies). + + + + + +Yes, as long as they’re in the same data platform (BigQuery, Databricks, Redshift, Snowflake, etc.) and you have configured permissions and sharing in that data platform provider to allow this. + + + + + +Yes, because the cross-project collaboration is done using the `{{ ref() }}` macro, you can use those models from other teams in [singular tests](/docs/build/data-tests#singular-data-tests). + + + + + +Each team defines their connection to the data warehouse, and the default schema names for dbt to use when materializing datasets. + +By default, each project belonging to a team will create: + +- One schema for production runs (for example, `finance`). +- One schema per developer (for example, `dev_jerco`). + +Depending on each team’s needs, this can be customized with model-level [schema configurations](/docs/build/custom-schemas), including the ability to define different rules by environment. + + + + + +No, contracts can only be applied at the [model level](/docs/collaborate/govern/model-contracts). It is a recommended best practice to [define staging models](/best-practices/how-we-structure/2-staging) on top of sources, and it is possible to define contracts on top of those staging models. + + + + + +No. A contract applies to an entire model, including all columns in the model’s output. This is the same set of columns that a consumer would see when viewing the model’s details in Explorer, or when querying the model in the data platform. + +- If you wish to contract only a subset of columns, you can create a separate model (materialized as a view) selecting only that subset. +- If you wish to limit which rows or columns a downstream consumer can see when they query the model’s data, depending on who they are, some data platforms offer advanced capabilities around dynamic row-level access and column-level data masking. + + + + + +No, a [group](/docs/collaborate/govern/model-access#groups) can only be assigned to a single owner. However, the assigned owner can be a _team_, rather than an individual. + + + + + +Not directly, but contracts are [assigned to models](/docs/collaborate/govern/model-contracts) and models can be assigned to individual owners. You can use meta fields for this purpose. + + + + + +This is not currently possible, but something we hope to enable in the near future. 
If you’re interested in this functionality, please reach out to your dbt Labs account team. + + + + + +dbt Cloud will soon offer the capability to trigger jobs on the completion of another job, including a job in a different project. This offers one mechanism for executing a pipeline from start to finish across projects. + + + + + +Yes. In addition to being viewable natively through [dbt Explorer](https://www.getdbt.com/product/dbt-explorer), it is possible to view cross-project lineage connect using partner integrations with data cataloging tools. For a list of available dbt Cloud integrations, refer to the [Integrations page](https://www.getdbt.com/product/integrations). + + + + + +Tests and model contracts in dbt help eliminate the need to restate data in the first place. With these tools, you can incorporate checks at the source and output layers of your dbt projects to assess data quality in the most critical places. When there are changes in transformation logic (for example, the definition of a particular column is changed), restating the data is as easy as merging the updated code and running a dbt Cloud job. + +If a data quality issue does slip through, you also have the option of simply rolling back the git commit, and then re-running the dbt Cloud job with the old code. + + + + + +Yes, all of this metadata is accessible via the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api). This metadata can be fed into a monitoring tool, or used to create reports and dashboards. + +We also expose some of this information in dbt Cloud itself in [jobs](/docs/deploy/jobs), [environments](/docs/environments-in-dbt) and in [dbt Explorer](https://www.getdbt.com/product/dbt-explorer). + + + +## Permissions and access + + + +The existence of projects that have at least one public model will be visible to everyone in the organization with [read-only access](/docs/cloud/manage-access/seats-and-users). + +Private or protected models require a user to have read-only access on the specific project in order to see its existence. + + + + + +There’s model-level access within dbt, role-based access for users and groups in dbt Cloud, and access to the underlying data within the data platform. + +First things first: access to underlying data is always defined and enforced by the underlying data platform (for example, BigQuery, Databricks, Redshift, Snowflake, Starburst, etc.) This access is managed by executing “DCL statements” (namely `grant`). dbt makes it easy to [configure `grants` on models](/reference/resource-configs/grants), which provision data access for other roles/users/groups in the data warehouse. However, dbt does _not_ automatically define or coordinate those grants unless they are configured explicitly. Refer to your organization's system for managing data warehouse permissions. + +[dbt Cloud Enterprise plans](https://www.getdbt.com/pricing) support [role-based access control (RBAC)](/docs/cloud/manage-access/enterprise-permissions#how-to-set-up-rbac-groups-in-dbt-cloud) that manages granular permissions for users and user groups. You can control which users can see or edit all aspects of a dbt Cloud project. A user’s access to dbt Cloud projects also determines whether they can “explore” that project in detail. Roles, users, and groups are defined within the dbt Cloud application via the UI or by integrating with an identity provider. + +[Model access](/docs/collaborate/govern/model-access) defines where models can be referenced. 
It also informs the discoverability of those projects within dbt Explorer. Model `access` is defined in code, just like any other model configuration (`materialized`, `tags`, etc). + +* **Public:** Models with `public` access can be referenced everywhere. These are the “data products” of your organization. + +* **Protected:** Models with `protected` access can only be referenced within the same project. This is the default level of model access. +We are discussing a future extension to `protected` models to allow for their reference in _specific_ downstream projects. Please read [the GitHub issue](https://github.com/dbt-labs/dbt-core/issues/9340), and upvote/comment if you’re interested in this use case. + +* **Private:** Model `groups` enable more-granular control over where `private` models can be referenced. By defining a group, and configuring models to belong to that group, you can restrict other models (not in the same group) from referencing any `private` models the group contains. Groups also provide a standard mechanism for defining the `owner` of all resources it contains. + +Within dbt Explorer, `public` models are discoverable for every user in the dbt Cloud account — every public model is listed in the “multi-project” view. By contrast, `protected` and `private` models in a project are visible only to users who have access to that project (including read-only access). + +Because dbt does not implicitly coordinate data warehouse `grants` with model-level `access`, it is possible for there to be a mismatch between them. For example, a `public` model’s metadata is viewable to all dbt Cloud users, anyone can write a `ref` to that model, but when they actually run or preview, they realize they do not have access to the underlying data in the data warehouse. **This is intentional.** In this way, your organization can retain least-privileged access to underlying data, while providing visibility and discoverability for the wider organization. Armed with the knowledge of which other “data products” (public models) exist — their descriptions, their ownership, which columns they contain — an analyst on another team can prepare a well-informed request for access to the underlying data. + + + + + +Not currently! But this is something we may evaluate for the future. + + + + + +Yes! As long as a user has permissions (at least read-only access) on all projects in a dbt Cloud account, they can navigate across the entirety of the organization’s DAG in dbt Explorer, and see models at all levels of detail. + + + + + +By default, cross-project references resolve to the “Production” deployment environment of the upstream project. If your organization has genuinely different data in production versus non-production environments, this poses an issue. + +For this reason, we will soon roll out a new canonical type of deployment environment: “Staging.” If a project defines both a “Production” environment and a “Staging” environment, then cross-project references from development and “Staging” environments will resolve to “Staging,” whereas only references coming from “Production” environments will resolve to “Production.” In this way, you are guaranteed separation of data environments, without needing to duplicate project configurations. + +If you’re interested in beta access to “Staging” environments, let your dbt Labs account representative know! 
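To ground the access levels described earlier in this section, here is a minimal sketch of how a group and model-level `access` might be declared in a properties YAML file. The group, owner, and model names are illustrative, not part of any real project:

```yaml
groups:
  - name: finance                      # illustrative group name
    owner:
      name: Finance Data Team          # owners can be a team rather than an individual
      email: finance-data@example.com

models:
  - name: fct_payments                 # illustrative model names
    access: private                    # only models in the `finance` group may ref() this
    group: finance

  - name: dim_customers
    access: public                     # any project in the mesh may ref() this "data product"
```

Remember that this only controls where `ref()` is allowed and what is discoverable — the corresponding warehouse `grants` still need to be configured separately, as described above.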
+ + + +## Compatibility with other features + + + +The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dbt Mesh are complementary mechanisms enabled by dbt Cloud that work together to enhance the management, usability, and governance of data in large-scale data environments. + +The Semantic Layer in dbt Cloud allows teams to centrally define business metrics and dimensions. It ensures consistent and reliable metric definitions across various analytics tools and platforms. + +dbt Mesh enables organizations to split their data architecture into multiple domain-specific projects, while retaining the ability to reference “public” models across projects. It is also possible to reference a “public” model from another project for the purpose of defining semantic models and metrics. Your organization can have multiple dbt projects feed into a unified semantic layer, ensuring that metrics and dimensions are consistently defined and understood across these domains. + + + + + +**[dbt Explorer](/docs/collaborate/explore-projects)** is a tool within dbt Cloud that serves as a knowledge base and lineage visualization platform. It provides a comprehensive view of your dbt assets, including models, tests, sources, and their interdependencies. + +Used in conjunction with dbt Mesh, dbt Explorer becomes a powerful tool for visualizing and understanding the relationships and dependencies between models across multiple dbt projects. + + + + + +The [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) allows users to develop and run dbt commands from their preferred development environments, like VS Code, Sublime Text, or terminal interfaces. This flexibility is particularly beneficial in a dbt Mesh setup, where managing multiple projects can be complex. Developers can work in their preferred tools while leveraging the centralized capabilities of dbt Cloud. + + + +## Availability + + + +Yes, your account must be on [at least dbt v1.6](/docs/dbt-versions/upgrade-core-in-cloud) to take advantage of [cross-project dependencies](/docs/collaborate/govern/project-dependencies), one of the most crucial underlying capabilities required to implement a dbt Mesh. + + + + + +While dbt Core defines several of the foundational elements for dbt Mesh, dbt Cloud offers an enhanced experience that leverages these elements for scaled collaboration across multiple teams, facilitated by multi-project discovery in dbt Explorer that’s tailored to each user’s access. + +Several key components that underpin the dbt Mesh pattern, including model contracts, versions, and access modifiers, are defined and implemented in dbt Core. We believe these are components of the core language, which is why their implementations are open source. We want to define a standard pattern that analytics engineers everywhere can adopt, extend, and help us improve. + +To reference models defined in another project, users can also leverage [packages](/docs/build/packages), a longstanding feature of dbt Core. By importing an upstream project as a package, dbt will import all models defined in that project, which enables the resolution of cross-project references to those models. They can be [optionally restricted](/docs/collaborate/govern/model-access#how-do-i-restrict-access-to-models-defined-in-a-package) to just the models with `public` access. + +The major distinction comes with dbt Cloud's metadata service, which is unique to the dbt Cloud platform and allows for the resolution of references to only the public models in a project. 
This service enables users to take dependencies on upstream projects, and reference just their `public` models, *without* needing to load the full complexity of those upstream projects into their local development environment. + + + + + +Yes, a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) plan is required to set up multiple projects and reference models across them. + + + +## Tips on implementing dbt Mesh + + + +Refer to our developer guide on [How we structure our dbt Mesh projects](https://docs.getdbt.com/best-practices/how-we-mesh/mesh-1-intro). You may also be interested in watching the recording of this talk from Coalesce 2023: [Unlocking model governance and multi-project deployments with dbt-meshify](https://www.youtube.com/watch?v=FAsY0Qx8EyU). + + + + + +`dbt-meshify` is a [CLI tool](https://github.com/dbt-labs/dbt-meshify) that automates the creation of model governance and cross-project lineage features introduced in dbt-core v1.5 and v1.6. This package will leverage your dbt project metadata to create and/or edit the files in your project to properly configure the models in your project with these features. + + + + +Let’s say your organization has fewer than 500 models and fewer than a dozen regular contributors to dbt. You're operating at a scale well served by the monolith (a single project), and the larger pattern of dbt Mesh probably won't provide any immediate benefits. + +It’s never too early to think about how you’re organizing models _within_ that project. Use model `groups` to define clear ownership boundaries and `private` access to restrict purpose-built models from becoming load-bearing blocks in an unrelated section of the DAG. Your future selves will thank you for defining these interfaces, especially if you reach a scale where it makes sense to “graduate” the interfaces between `groups` into boundaries between projects. + + diff --git a/website/docs/best-practices/materializations/materializations-guide-2-available-materializations.md b/website/docs/best-practices/materializations/materializations-guide-2-available-materializations.md index 9910e5f8269..1096c07cde7 100644 --- a/website/docs/best-practices/materializations/materializations-guide-2-available-materializations.md +++ b/website/docs/best-practices/materializations/materializations-guide-2-available-materializations.md @@ -9,12 +9,12 @@ hoverSnippet: Read this guide to understand the different types of materializati Views and tables and incremental models, oh my! In this section we’ll start getting our hands dirty digging into the three basic materializations that ship with dbt. They are considerably less scary and more helpful than lions, tigers, or bears — although perhaps not as cute (can data be cute? We at dbt Labs think so). We’re going to define, implement, and explore: -- 🔍 **views** -- ⚒️ **tables** -- 📚 **incremental model** +- 🔍 [**views**](/docs/build/materializations#view) +- ⚒️ [**tables**](/docs/build/materializations#table) +- 📚 [**incremental model**](/docs/build/materializations#incremental) :::info -👻 There is a fourth default materialization available in dbt called **ephemeral materialization**. It is less broadly applicable than the other three, and better deployed for specific use cases that require weighing some tradeoffs. We chose to leave it out of this guide and focus on the three materializations that will power 99% of your modeling needs. 
+👻 There is a fourth default materialization available in dbt called [**ephemeral materialization**](/docs/build/materializations#ephemeral). It is less broadly applicable than the other three, and better deployed for specific use cases that require weighing some tradeoffs. We chose to leave it out of this guide and focus on the three materializations that will power 99% of your modeling needs. ::: **Views and Tables are the two basic categories** of object that we can create across warehouses. They exist natively as types of objects in the warehouse, as you can see from this screenshot of Snowflake (depending on your warehouse the interface will look a little different). **Incremental models** and other materializations types are a little bit different. They tell dbt to **construct tables in a special way**. diff --git a/website/docs/docs/build/about-metricflow.md b/website/docs/docs/build/about-metricflow.md index 75fa3ba5262..19d27bc60d2 100644 --- a/website/docs/docs/build/about-metricflow.md +++ b/website/docs/docs/build/about-metricflow.md @@ -63,6 +63,7 @@ Metrics, which is a key concept, are functions that combine measures, constraint MetricFlow supports different metric types: +- [Conversion](/docs/build/conversion) — Helps you track when a base event and a subsequent conversion event occurs for an entity within a set time period. - [Cumulative](/docs/build/cumulative) — Aggregates a measure over a given window. - [Derived](/docs/build/derived) — An expression of other metrics, which allows you to do calculations on top of metrics. - [Ratio](/docs/build/ratio) — Create a ratio out of two measures, like revenue per customer. diff --git a/website/docs/docs/build/conversion-metrics.md b/website/docs/docs/build/conversion-metrics.md new file mode 100644 index 00000000000..39b3d969b27 --- /dev/null +++ b/website/docs/docs/build/conversion-metrics.md @@ -0,0 +1,355 @@ +--- +title: "Conversion metrics" +id: conversion +description: "Use Conversion metrics to measure conversion events." +sidebar_label: Conversion +tags: [Metrics, Semantic Layer] +--- + +Conversion metrics allow you to define when a base event and a subsequent conversion event happen for a specific entity within some time range. + +For example, using conversion metrics allows you to track how often a user (entity) completes a visit (base event) and then makes a purchase (conversion event) within 7 days (time window). You would need to add a time range and an entity to join. + +Conversion metrics are different from [ratio metrics](/docs/build/ratio) because you need to include an entity in the pre-aggregated join. + +## Parameters + +The specification for conversion metrics is as follows: + +| Parameter | Description | Type | Required/Optional | +| --- | --- | --- | --- | +| `name` | The name of the metric. | String | Required | +| `description` | The description of the metric. | String | Optional | +| `type` | The type of metric (such as derived, ratio, and so on.). In this case, set as 'conversion' | String | Required | +| `label` | Displayed value in downstream tools. | String | Required | +| `type_params` | Specific configurations for each metric type. | List | Required | +| `conversion_type_params` | Additional configuration specific to conversion metrics. | List | Required | +| `entity` | The entity for each conversion event. | Entity | Required | +| `calculation` | Method of calculation. Either `conversion_rate` or `conversions`. Defaults to `conversion_rate`. 
| String | Optional | +| `base_measure` | The base conversion event measure. | Measure | Required | +| `conversion_measure` | The conversion event measure. | Measure | Required | +| `window` | The time window for the conversion event, such as 7 days, 1 week, 3 months. Defaults to infinity. | String | Optional | +| `constant_properties` | List of constant properties. | List | Optional | +| `base_property` | The property from the base semantic model that you want to hold constant. | Entity or Dimension | Optional | +| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Entity or Dimension | Optional | +| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | String | Optional | + +Refer to [additional settings](#additional-settings) to learn how to customize conversion metrics with settings for null values, calculation type, and constant properties. + +The following code example displays the complete specification for conversion metrics and details how they're applied: + +```yaml +metrics: + - name: The metric name # Required + description: The metric description # Optional + type: conversion # Required + label: # Required + type_params: # Required + fills_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional + conversion_type_params: # Required + entity: ENTITY # Required + calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come. + base_measure: MEASURE # Required + conversion_measure: MEASURE # Required + window: TIME_WINDOW # Optional. default: infinity. window to join the two events. Follows a similar format as time windows elsewhere (such as 7 days) + constant_properties: # Optional. List of constant properties default: None + - base_property: DIMENSION or ENTITY # Required. A reference to a dimension/entity of the semantic model linked to the base_measure + conversion_property: DIMENSION or ENTITY # Same as base above, but to the semantic model of the conversion_measure +``` + +## Conversion metric example + +The following example will measure conversions from website visits (`VISITS` table) to order completions (`BUYS` table) and calculate a conversion metric for this scenario step by step. + +Suppose you have two semantic models, `VISITS` and `BUYS`: + +- The `VISITS` table represents visits to an e-commerce site. +- The `BUYS` table represents someone completing an order on that site. + +The underlying tables look like the following: + +`VISITS`
+Contains user visits with `USER_ID` and `REFERRER_ID`. + +| DS | USER_ID | REFERRER_ID | +| --- | --- | --- | +| 2020-01-01 | bob | facebook | +| 2020-01-04 | bob | google | +| 2020-01-07 | bob | amazon | + +`BUYS`
+Records completed orders with `USER_ID` and `REFERRER_ID`. + +| DS | USER_ID | REFERRER_ID | +| --- | --- | --- | +| 2020-01-02 | bob | facebook | +| 2020-01-07 | bob | amazon | + +Next, define a conversion metric as follows: + +```yaml +- name: visit_to_buy_conversion_rate_7d + description: "Conversion rate from visiting to transaction in 7 days" + type: conversion + label: Visit to Buy Conversion Rate (7-day window) + type_params: + fills_nulls_with: 0 + conversion_type_params: + base_measure: visits + conversion_measure: sellers + entity: user + window: 7 days +``` + +To calculate the conversion, link the `BUYS` event to the nearest `VISITS` event (or closest base event). The following steps explain this process in more detail: + +### Step 1: Join `VISITS` and `BUYS` + +This step joins the `BUYS` table to the `VISITS` table and gets all combinations of visits-buys events that match the join condition where buys occur within 7 days of the visit (any rows that have the same user and a buy happened at most 7 days after the visit). + +The SQL generated in these steps looks like the following: + +```sql +select + v.ds, + v.user_id, + v.referrer_id, + b.ds, + b.uuid, + 1 as buys +from visits v +inner join ( + select *, uuid_string() as uuid from buys -- Adds a uuid column to uniquely identify the different rows +) b +on +v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 days' +``` + +The dataset returns the following (note that there are two potential conversion events for the first visit): + +| V.DS | V.USER_ID | V.REFERRER_ID | B.DS | UUID | BUYS | +| --- | --- | --- | --- | --- | --- | +| 2020-01-01 | bob | facebook | 2020-01-02 | uuid1 | 1 | +| 2020-01-01 | bob | facebook | 2020-01-07 | uuid2 | 1 | +| 2020-01-04 | bob | google | 2020-01-07 | uuid2 | 1 | +| 2020-01-07 | bob | amazon | 2020-01-07 | uuid2 | 1 | + +### Step 2: Refine with window function + +Instead of returning the raw visit values, use window functions to link conversions to the closest base event. You can partition by the conversion source and get the `first_value` ordered by `visit ds`, descending to get the closest base event from the conversion event: + +```sql +select + first_value(v.ds) over (partition by b.ds, b.user_id, b.uuid order by v.ds desc) as v_ds, + first_value(v.user_id) over (partition by b.ds, b.user_id, b.uuid order by v.ds desc) as user_id, + first_value(v.referrer_id) over (partition by b.ds, b.user_id, b.uuid order by v.ds desc) as referrer_id, + b.ds, + b.uuid, + 1 as buys +from visits v +inner join ( + select *, uuid_string() as uuid from buys +) b +on +v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 day' +``` + +The dataset returns the following: + +| V.DS | V.USER_ID | V.REFERRER_ID | B.DS | UUID | BUYS | +| --- | --- | --- | --- | --- | --- | +| 2020-01-01 | bob | facebook | 2020-01-02 | uuid1 | 1 | +| 2020-01-07 | bob | amazon | 2020-01-07 | uuid2 | 1 | +| 2020-01-07 | bob | amazon | 2020-01-07 | uuid2 | 1 | +| 2020-01-07 | bob | amazon | 2020-01-07 | uuid2 | 1 | + +This workflow links the two conversions to the correct visit events. Due to the join, you end up with multiple combinations, leading to fanout results. After applying the window function, duplicates appear. + +To resolve this and eliminate duplicates, use a distinct select. The UUID also helps identify which conversion is unique. The next steps provide more detail on how to do this. 
+ +### Step 3: Remove duplicates + +Instead of regular select used in the [Step 2](#step-2-refine-with-window-function), use a distinct select to remove the duplicates: + +```sql +select distinct + first_value(v.ds) over (partition by b.ds, b.user_id, b.uuid order by v.ds desc) as v_ds, + first_value(v.user_id) over (partition by b.ds, b.user_id, b.uuid order by v.ds desc) as user_id, + first_value(v.referrer_id) over (partition by b.ds, b.user_id, b.uuid order by v.ds desc) as referrer_id, + b.ds, + b.uuid, + 1 as buys +from visits v +inner join ( + select *, uuid_string() as uuid from buys +) b +on +v.user_id = b.user_id and v.ds <= b.ds and v.ds > b.ds - interval '7 day'; +``` + +The dataset returns the following: + +| V.DS | V.USER_ID | V.REFERRER_ID | B.DS | UUID | BUYS | +| --- | --- | --- | --- | --- | --- | +| 2020-01-01 | bob | facebook | 2020-01-02 | uuid1 | 1 | +| 2020-01-07 | bob | amazon | 2020-01-07 | uuid2 | 1 | + +You now have a dataset where every conversion is connected to a visit event. To proceed: + +1. Sum up the total conversions in the "conversions" table. +2. Combine this table with the "opportunities" table, matching them based on group keys. +3. Calculate the conversion rate. + +### Step 4: Aggregate and calculate + +Now that you’ve tied each conversion event to a visit, you can calculate the aggregated conversions and opportunities measures. Then, you can join them to calculate the actual conversion rate. The SQL to calculate the conversion rate is as follows: + +```sql +select + coalesce(subq_3.metric_time__day, subq_13.metric_time__day) as metric_time__day, + cast(max(subq_13.buys) as double) / cast(nullif(max(subq_3.visits), 0) as double) as visit_to_buy_conversion_rate_7d +from ( -- base measure + select + metric_time__day, + sum(visits) as mqls + from ( + select + date_trunc('day', first_contact_date) as metric_time__day, + 1 as visits + from visits + ) subq_2 + group by + metric_time__day +) subq_3 +full outer join ( -- conversion measure + select + metric_time__day, + sum(buys) as sellers + from ( + -- ... + -- The output of this subquery is the table produced in Step 3. The SQL is hidden for legibility. + -- To see the full SQL output, add --explain to your conversion metric query. + ) subq_10 + group by + metric_time__day +) subq_13 +on + subq_3.metric_time__day = subq_13.metric_time__day +group by + metric_time__day +``` + +### Additional settings + +Use the following additional settings to customize your conversion metrics: + +- **Null conversion values:** Set null conversions to zero using `fill_nulls_with`. +- **Calculation type:** Choose between showing raw conversions or conversion rate. +- **Constant property:** Add conditions for specific scenarios to join conversions on constant properties. + + + + +To return zero in the final data set, you can set the value of a null conversion event to zero instead of null. 
You can add the `fill_nulls_with` parameter to your conversion metric definition like this: + +```yaml +- name: visit_to_buy_conversion_rate_7_day_window + description: "Conversion rate from viewing a page to making a purchase" + type: conversion + label: Visit to Seller Conversion Rate (7 day window) + type_params: + conversion_type_params: + calculation: conversions + base_measure: visits + conversion_measure: + name: buys + fill_nulls_with: 0 + entity: user + window: 7 days + +``` + +This will return the following results: + + + + + + + +Use the conversion calculation parameter to either show the raw number of conversions or the conversion rate. The default value is the conversion rate. + +You can change the default to display the number of conversions by setting the `calculation: conversion` parameter: + +```yaml +- name: visit_to_buy_conversions_1_week_window + description: "Visit to Buy Conversions" + type: conversion + label: Visit to Buy Conversions (1 week window) + type_params: + conversion_type_params: + calculation: conversions + base_measure: visits + conversion_measure: + name: buys + fill_nulls_with: 0 + entity: user + window: 1 week +``` + + + + + +*Refer to [Amplitude's blog posts on constant properties](https://amplitude.com/blog/holding-constant) to learn about this concept.* + +You can add a constant property to a conversion metric to count only those conversions where a specific dimension or entity matches in both the base and conversion events. + +For example, if you're at an e-commerce company and want to answer the following question: +- _How often did visitors convert from `View Item Details` to `Complete Purchase` with the same product in each step?_
+ - This question is tricky to answer because users could have completed these two conversion milestones across many products. For example, they may have viewed a pair of shoes, then a T-shirt, and eventually checked out with a bow tie. This would still count as a conversion, even though the conversion event only happened for the bow tie. + +Back to the initial questions, you want to see how many customers viewed an item detail page and then completed a purchase for the _same_ product. + +In this case, you want to set `product_id` as the constant property. You can specify this in the configs as follows: + +```yaml +- name: view_item_detail_to_purchase_with_same_item + description: "Conversion rate for users who viewed the item detail page and purchased the item" + type: Conversion + label: View Item Detail > Purchase + type_params: + conversion_type_params: + calculation: conversions + base_measure: view_item_detail + conversion_measure: purchase + entity: user + window: 1 week + constant_properties: + - base_property: product + conversion_property: product +``` + +You will add an additional condition to the join to make sure the constant property is the same across conversions. + +```sql +select distinct + first_value(v.ds) over (partition by buy_source.ds, buy_source.user_id, buy_source.session_id order by v.ds desc rows between unbounded preceding and unbounded following) as ds, + first_value(v.user_id) over (partition by buy_source.ds, buy_source.user_id, buy_source.session_id order by v.ds desc rows between unbounded preceding and unbounded following) as user_id, + first_value(v.referrer_id) over (partition by buy_source.ds, buy_source.user_id, buy_source.session_id order by v.ds desc rows between unbounded preceding and unbounded following) as referrer_id, + buy_source.uuid, + 1 as buys +from {{ source_schema }}.fct_view_item_details v +inner join + ( + select *, {{ generate_random_uuid() }} as uuid from {{ source_schema }}.fct_purchases + ) buy_source +on + v.user_id = buy_source.user_id + and v.ds <= buy_source.ds + and v.ds > buy_source.ds - interval '7 day' + and buy_source.product_id = v.product_id --Joining on the constant property product_id +``` + +
+
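The additional settings described above aren't mutually exclusive — a single conversion metric can combine a calculation type, null handling, a time window, and a constant property. The following sketch reuses the hypothetical `view_item_detail` and `purchase` measures from the constant-property example, and is meant to illustrate how the pieces fit together rather than serve as a canonical definition:

```yaml
metrics:
  - name: view_to_purchase_same_item_rate_1w          # illustrative metric name
    description: "Rate at which item detail views convert to a purchase of the same item within 1 week."
    type: conversion
    label: View to Purchase, Same Item (1-week window)
    type_params:
      conversion_type_params:
        calculation: conversion_rate                  # report a rate rather than raw conversion counts
        base_measure: view_item_detail
        conversion_measure:
          name: purchase
          fill_nulls_with: 0                          # return 0 instead of null when nothing converts
        entity: user
        window: 1 week
        constant_properties:
          - base_property: product                    # only count conversions on the same product
            conversion_property: product
```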
diff --git a/website/docs/docs/build/cumulative-metrics.md b/website/docs/docs/build/cumulative-metrics.md index 45a136df751..ec962969c9e 100644 --- a/website/docs/docs/build/cumulative-metrics.md +++ b/website/docs/docs/build/cumulative-metrics.md @@ -20,6 +20,7 @@ This metric is common for calculating things like weekly active users, or month- | `measure` | The measure you are referencing. | Required | | `window` | The accumulation window, such as 1 month, 7 days, 1 year. This can't be used with `grain_to_date`. | Optional | | `grain_to_date` | Sets the accumulation grain, such as month will accumulate data for one month. Then restart at the beginning of the next. This can't be used with `window`. | Optional | +| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero).| Optional | The following displays the complete specification for cumulative metrics, along with an example: @@ -30,13 +31,15 @@ metrics: type: cumulative # Required label: The value that will be displayed in downstream tools # Required type_params: # Required + fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional measure: The measure you are referencing # Required - window: The accumulation window, such as 1 month, 7 days, 1 year. # Optional. Can not be used with window. - grain_to_date: Sets the accumulation grain, such as month will accumulate data for one month, then restart at the beginning of the next. # Optional. Cannot be used with grain_to_date + window: The accumulation window, such as 1 month, 7 days, 1 year. # Optional. Cannot be used with grain_to_date + grain_to_date: Sets the accumulation grain, such as month will accumulate data for one month, then restart at the beginning of the next. # Optional. Cannot be used with window ``` ## Limitations + Cumulative metrics are currently under active development and have the following limitations: - You are required to use [`metric_time` dimension](/docs/build/dimensions#time) when querying cumulative metrics. If you don't use `metric_time` in the query, the cumulative metric will return incorrect results because it won't perform the time spine join. This means you cannot reference time dimensions other than the `metric_time` in the query. 
@@ -59,12 +62,14 @@ metrics: description: The cumulative value of all orders type: cumulative type_params: + fill_nulls_with: 0 measure: order_total - name: cumulative_order_total_l1m label: Cumulative Order total (L1M) description: Trailing 1 month cumulative order amount type: cumulative type_params: + fills_nulls_with: 0 measure: order_total window: 1 month - name: cumulative_order_total_mtd @@ -72,6 +77,7 @@ metrics: description: The month to date value of all orders type: cumulative type_params: + fills_nulls_with: 0 measure: order_total grain_to_date: month ``` @@ -201,16 +207,16 @@ The current method connects the metric table to a timespine table using the prim ``` sql select - count(distinct distinct_users) as weekly_active_users - , metric_time + count(distinct distinct_users) as weekly_active_users, + metric_time from ( select - subq_3.distinct_users as distinct_users - , subq_3.metric_time as metric_time + subq_3.distinct_users as distinct_users, + subq_3.metric_time as metric_time from ( select - subq_2.distinct_users as distinct_users - , subq_1.metric_time as metric_time + subq_2.distinct_users as distinct_users, + subq_1.metric_time as metric_time from ( select metric_time @@ -223,8 +229,8 @@ from ( ) subq_1 inner join ( select - distinct_users as distinct_users - , date_trunc('day', ds) as metric_time + distinct_users as distinct_users, + date_trunc('day', ds) as metric_time from demo_schema.transactions transactions_src_426 where ( (date_trunc('day', ds)) >= cast('1999-12-26' as timestamp) @@ -241,6 +247,7 @@ from ( ) subq_3 ) group by - metric_time -limit 100 + metric_time, +limit 100; + ``` diff --git a/website/docs/docs/build/derived-metrics.md b/website/docs/docs/build/derived-metrics.md index fc7961bbe7f..35adb12cb1a 100644 --- a/website/docs/docs/build/derived-metrics.md +++ b/website/docs/docs/build/derived-metrics.md @@ -21,7 +21,8 @@ In MetricFlow, derived metrics are metrics created by defining an expression usi | `metrics` | The list of metrics used in the derived metrics. | Required | | `alias` | Optional alias for the metric that you can use in the expr. | Optional | | `filter` | Optional filter to apply to the metric. | Optional | -| `offset_window` | Set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. | Required | +| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | +| `offset_window` | Set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. | Optional | The following displays the complete specification for derived metrics, along with an example. @@ -32,12 +33,13 @@ metrics: type: derived # Required label: The value that will be displayed in downstream tools #Required type_params: # Required + fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional expr: the derived expression # Required metrics: # The list of metrics used in the derived metrics # Required - name: the name of the metrics. must reference a metric you have already defined # Required alias: optional alias for the metric that you can use in the expr # Optional filter: optional filter to apply to the metric # Optional - offset_window: set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. # Required + offset_window: set the period for the offset window, such as 1 month. 
This will return the value of the metric one month from the metric time. # Optional ``` ## Derived metrics example @@ -49,6 +51,7 @@ metrics: type: derived label: Order Gross Profit type_params: + fill_nulls_with: 0 expr: revenue - cost metrics: - name: order_total @@ -60,6 +63,7 @@ metrics: description: "The gross profit for each food order." type: derived type_params: + fill_nulls_with: 0 expr: revenue - cost metrics: - name: order_total @@ -96,6 +100,7 @@ The following example displays how you can calculate monthly revenue growth usin description: Percentage of customers that are active now and those active 1 month ago label: customer_retention type_params: + fill_nulls_with: 0 expr: (active_customers/ active_customers_prev_month) metrics: - name: active_customers @@ -115,6 +120,7 @@ You can query any granularity and offset window combination. The following examp type: derived label: d7 Bookings Change type_params: + fill_nulls_with: 0 expr: bookings - bookings_7_days_ago metrics: - name: bookings @@ -126,10 +132,10 @@ You can query any granularity and offset window combination. The following examp When you run the query `dbt sl query --metrics d7_booking_change --group-by metric_time__month` for the metric, here's how it's calculated. For dbt Core, you can use the `mf query` prefix. -1. We retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'. -2. Then, we perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity. +1. Retrieve the raw, unaggregated dataset with the specified measures and dimensions at the smallest level of detail, which is currently 'day'. +2. Then, perform an offset join on the daily dataset, followed by performing a date trunc and aggregation to the requested granularity. For example, to calculate `d7_booking_change` for July 2017: - - First, we sum up all the booking values for each day in July to calculate the bookings metric. + - First, sum up all the booking values for each day in July to calculate the bookings metric. - The following table displays the range of days that make up this monthly aggregation. | | Orders | Metric_time | @@ -139,7 +145,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met | | 78 | 2017-07-01 | | Total | 7438 | 2017-07-01 | -3. Next, we calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24. +3. Calculate July's bookings with a 7-day offset. The following table displays the range of days that make up this monthly aggregation. Note that the month begins 7 days later (offset by 7 days) on 2017-07-24. | | Orders | Metric_time | | - | ---- | -------- | @@ -148,7 +154,7 @@ When you run the query `dbt sl query --metrics d7_booking_change --group-by met | | 83 | 2017-06-24 | | Total | 7252 | 2017-07-01 | -4. Lastly, we calculate the derived metric and return the final result set: +4. Lastly, calculate the derived metric and return the final result set: ```bash bookings - bookings_7_days_ago would be compile as 7438 - 7252 = 186. 
diff --git a/website/docs/docs/build/groups.md b/website/docs/docs/build/groups.md index d4fda045277..62c4e4493d3 100644 --- a/website/docs/docs/build/groups.md +++ b/website/docs/docs/build/groups.md @@ -7,18 +7,6 @@ keywords: - groups access mesh --- -:::info New functionality -This functionality is new in v1.5. -::: - -## Related docs - -* [Model Access](/docs/collaborate/govern/model-access#groups) -* [Group configuration](/reference/resource-configs/group) -* [Group selection](/reference/node-selection/methods#the-group-method) - -## About groups - A group is a collection of nodes within a dbt DAG. Groups are named, and every group has an `owner`. They enable intentional collaboration within and across teams by restricting [access to private](/reference/resource-configs/access) models. Group members may include models, tests, seeds, snapshots, analyses, and metrics. (Not included: sources and exposures.) Each node may belong to only one group. @@ -126,3 +114,9 @@ dbt.exceptions.DbtReferenceError: Parsing Error Node model.jaffle_shop.marketing_model attempted to reference node model.jaffle_shop.finance_model, which is not allowed because the referenced node is private to the finance group. ``` + +## Related docs + +* [Model Access](/docs/collaborate/govern/model-access#groups) +* [Group configuration](/reference/resource-configs/group) +* [Group selection](/reference/node-selection/methods#the-group-method) \ No newline at end of file diff --git a/website/docs/docs/build/incremental-models.md b/website/docs/docs/build/incremental-models.md index cc45290ae15..9f1c206f5fb 100644 --- a/website/docs/docs/build/incremental-models.md +++ b/website/docs/docs/build/incremental-models.md @@ -236,7 +236,7 @@ Instead, whenever the logic of your incremental changes, execute a full-refresh ## About `incremental_strategy` -There are various ways (strategies) to implement the concept of an incremental materializations. The value of each strategy depends on: +There are various ways (strategies) to implement the concept of incremental materializations. 
The value of each strategy depends on: * the volume of data, * the reliability of your `unique_key`, and @@ -450,5 +450,129 @@ The syntax depends on how you configure your `incremental_strategy`: +### Built-in strategies + +Before diving into [custom strategies](#custom-strategies), it's important to understand the built-in incremental strategies in dbt and their corresponding macros: + +| `incremental_strategy` | Corresponding macro | +|------------------------|----------------------------------------| +| `append` | `get_incremental_append_sql` | +| `delete+insert` | `get_incremental_delete_insert_sql` | +| `merge` | `get_incremental_merge_sql` | +| `insert_overwrite` | `get_incremental_insert_overwrite_sql` | + + +For example, a built-in strategy for the `append` can be defined and used with the following files: + + + +```sql +{% macro get_incremental_append_sql(arg_dict) %} + + {% do return(some_custom_macro_with_sql(arg_dict["target_relation"], arg_dict["temp_relation"], arg_dict["unique_key"], arg_dict["dest_columns"], arg_dict["incremental_predicates"])) %} + +{% endmacro %} + + +{% macro some_custom_macro_with_sql(target_relation, temp_relation, unique_key, dest_columns, incremental_predicates) %} + + {%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%} + + insert into {{ target_relation }} ({{ dest_cols_csv }}) + ( + select {{ dest_cols_csv }} + from {{ temp_relation }} + ) + +{% endmacro %} +``` + + +Define a model models/my_model.sql: + +```sql +{{ config( + materialized="incremental", + incremental_strategy="append", +) }} + +select * from {{ ref("some_model") }} +``` + +### Custom strategies + + + +Custom incremental strategies can be defined beginning in dbt v1.2. + + + + + +As an easier alternative to [creating an entirely new materialization](/guides/create-new-materializations), users can define and use their own "custom" user-defined incremental strategies by: + +1. defining a macro named `get_incremental_STRATEGY_sql`. Note that `STRATEGY` is a placeholder and you should replace it with the name of your custom incremental strategy. +2. configuring `incremental_strategy: STRATEGY` within an incremental model + +dbt won't validate user-defined strategies, it will just look for the macro by that name, and raise an error if it can't find one. + +For example, a user-defined strategy named `insert_only` can be defined and used with the following files: + + + +```sql +{% macro get_incremental_insert_only_sql(arg_dict) %} + + {% do return(some_custom_macro_with_sql(arg_dict["target_relation"], arg_dict["temp_relation"], arg_dict["unique_key"], arg_dict["dest_columns"], arg_dict["incremental_predicates"])) %} + +{% endmacro %} + + +{% macro some_custom_macro_with_sql(target_relation, temp_relation, unique_key, dest_columns, incremental_predicates) %} + + {%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%} + + insert into {{ target_relation }} ({{ dest_cols_csv }}) + ( + select {{ dest_cols_csv }} + from {{ temp_relation }} + ) + +{% endmacro %} +``` + + + + + +```sql +{{ config( + materialized="incremental", + incremental_strategy="insert_only", + ... +) }} + +... 
+``` + + + +### Custom strategies from a package + +To use the `merge_null_safe` custom incremental strategy from the `example` package: +- [Install the package](/docs/build/packages#how-do-i-add-a-package-to-my-project) +- Then add the following macro to your project: + + + +```sql +{% macro get_incremental_merge_null_safe_sql(arg_dict) %} + {% do return(example.get_incremental_merge_null_safe_sql(arg_dict)) %} +{% endmacro %} +``` + + + + diff --git a/website/docs/docs/build/materializations.md b/website/docs/docs/build/materializations.md index 67796afdbdb..9ae6021cc71 100644 --- a/website/docs/docs/build/materializations.md +++ b/website/docs/docs/build/materializations.md @@ -120,7 +120,7 @@ required with incremental materializations * `dbt run` on materialized views corresponds to a code deployment, just like views * **Cons:** * Due to the fact that materialized views are more complex database objects, database platforms tend to have -less configuration options available, see your database platform's docs for more details +fewer configuration options available; see your database platform's docs for more details * Materialized views may not be supported by every database platform * **Advice:** * Consider materialized views for use cases where incremental models are sufficient, but you would like the data platform to manage the incremental logic and refresh. diff --git a/website/docs/docs/build/metricflow-commands.md b/website/docs/docs/build/metricflow-commands.md index e3bb93da964..a0964269e68 100644 --- a/website/docs/docs/build/metricflow-commands.md +++ b/website/docs/docs/build/metricflow-commands.md @@ -17,15 +17,16 @@ MetricFlow is compatible with Python versions 3.8, 3.9, 3.10, and 3.11. MetricFlow is a dbt package that allows you to define and query metrics in your dbt project. You can use MetricFlow to query metrics in your dbt project in the dbt Cloud CLI, dbt Cloud IDE, or dbt Core. -**Note** — MetricFlow commands aren't supported in dbt Cloud jobs yet. However, you can add MetricFlow validations with your git provider (such as GitHub Actions) by installing MetricFlow (`python -m pip install metricflow`). This allows you to run MetricFlow commands as part of your continuous integration checks on PRs. +Using MetricFlow with dbt Cloud means you won't need to manage versioning — your dbt Cloud account will automatically manage the versioning. + +**dbt Cloud jobs** — MetricFlow commands aren't supported in dbt Cloud jobs yet. However, you can add MetricFlow validations with your git provider (such as GitHub Actions) by installing MetricFlow (`python -m pip install metricflow`). This allows you to run MetricFlow commands as part of your continuous integration checks on PRs. -MetricFlow commands are embedded in the dbt Cloud CLI, which means you can immediately run them once you install the dbt Cloud CLI. - -A benefit to using the dbt Cloud is that you won't need to manage versioning — your dbt Cloud account will automatically manage the versioning. +- MetricFlow commands are embedded in the dbt Cloud CLI. This means you can immediately run them once you install the dbt Cloud CLI and don't need to install MetricFlow separately. +- You don't need to manage versioning — your dbt Cloud account will automatically manage the versioning for you. @@ -35,7 +36,7 @@ A benefit to using the dbt Cloud is that you won't need to manage versioning &md You can create metrics using MetricFlow in the dbt Cloud IDE. 
However, support for running MetricFlow commands in the IDE will be available soon. ::: -A benefit to using the dbt Cloud is that you won't need to manage versioning — your dbt Cloud account will automatically manage the versioning. + diff --git a/website/docs/docs/build/metrics-overview.md b/website/docs/docs/build/metrics-overview.md index b6ccc1c3b9c..f6844c60498 100644 --- a/website/docs/docs/build/metrics-overview.md +++ b/website/docs/docs/build/metrics-overview.md @@ -9,12 +9,12 @@ pagination_next: "docs/build/cumulative" Once you've created your semantic models, it's time to start adding metrics! Metrics can be defined in the same YAML files as your semantic models, or split into separate YAML files into any other subdirectories (provided that these subdirectories are also within the same dbt project repo) -The keys for metrics definitions are: +The keys for metrics definitions are: | Parameter | Description | Type | | --------- | ----------- | ---- | | `name` | Provide the reference name for the metric. This name must be unique amongst all metrics. | Required | -| `description` | Provide the description for your metric. | Optional | +| `description` | Describe your metric. | Optional | | `type` | Define the type of metric, which can be `simple`, `ratio`, `cumulative`, or `derived`. | Required | | `type_params` | Additional parameters used to configure metrics. `type_params` are different for each metric type. | Required | | `config` | Provide the specific configurations for your metric. | Optional | @@ -22,7 +22,6 @@ The keys for metrics definitions are: | `filter` | You can optionally add a filter string to any metric type, applying filters to dimensions, entities, or time dimensions during metric computation. Consider it as your WHERE clause. | Optional | | `meta` | Additional metadata you want to add to your metric. | Optional | - Here's a complete example of the metrics spec configuration: ```yaml @@ -39,33 +38,51 @@ metrics: null ``` -This page explains the different supported metric types you can add to your dbt project. - +This page explains the different supported metric types you can add to your dbt project. + +### Conversion metrics + +[Conversion metrics](/docs/build/conversion) help you track when a base event and a subsequent conversion event occurs for an entity within a set time period. + +```yaml +metrics: + - name: The metric name # Required + description: The metric description # Optional + type: conversion # Required + label: # Required + type_params: # Required + fills_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional + conversion_type_params: # Required + entity: ENTITY # Required + calculation: CALCULATION_TYPE # Optional. default: conversion_rate. options: conversions(buys) or conversion_rate (buys/visits), and more to come. + base_measure: MEASURE # Required + conversion_measure: MEASURE # Required + window: TIME_WINDOW # Optional. default: infinity. window to join the two events. Follows a similar format as time windows elsewhere (such as 7 days) + constant_properties: # Optional. List of constant properties default: None + - base_property: DIMENSION or ENTITY # Required. A reference to a dimension/entity of the semantic model linked to the base_measure + conversion_property: DIMENSION or ENTITY # Same as base above, but to the semantic model of the conversion_measure +``` ### Cumulative metrics -[Cumulative metrics](/docs/build/cumulative) aggregate a measure over a given window. 
If no window is specified, the window would accumulate the measure over all time. **Note**, you will need to create the [time spine model](/docs/build/metricflow-time-spine) before you add cumulative metrics. +[Cumulative metrics](/docs/build/cumulative) aggregate a measure over a given window. If no window is specified, the window will accumulate the measure over all of the recorded time period. Note that you will need to create the [time spine model](/docs/build/metricflow-time-spine) before you add cumulative metrics. ```yaml -# Cumulative metrics aggregate a measure over a given window. The window is considered infinite if no window parameter is passed (accumulate the measure over all time) +# Cumulative metrics aggregate a measure over a given window. The window is considered infinite if no window parameter is passed (accumulate the measure over all of time) metrics: - name: wau_rolling_7 owners: - support@getdbt.com type: cumulative type_params: + fills_nulls_with: 0 measures: - distinct_users - #Omitting window will accumulate the measure over all time + # Omitting window will accumulate the measure over all time window: 7 days ``` + ### Derived metrics [Derived metrics](/docs/build/derived) are defined as an expression of other metrics. Derived metrics allow you to do calculations on top of metrics. @@ -77,6 +94,7 @@ metrics: type: derived label: Order Gross Profit type_params: + fills_nulls_with: 0 expr: revenue - cost metrics: - name: order_total @@ -104,7 +122,7 @@ metrics: ### Ratio metrics -[Ratio metrics](/docs/build/ratio) involve a numerator metric and a denominator metric. A `constraint` string can be applied, to both numerator and denominator, or applied separately to the numerator or denominator. +[Ratio metrics](/docs/build/ratio) involve a numerator metric and a denominator metric. A `constraint` string can be applied to both the numerator and denominator or separately to the numerator or denominator. ```yaml # Ratio Metric @@ -116,6 +134,7 @@ metrics: # Define the metrics from the semantic manifest as numerator or denominator type: ratio type_params: + fills_nulls_with: 0 numerator: cancellations denominator: transaction_amount filter: | # add optional constraint string. This applies to both the numerator and denominator @@ -134,6 +153,7 @@ metrics: filter: | # add optional constraint string. This applies to both the numerator and denominator {{ Dimension('customer__country') }} = 'MX' ``` + ### Simple metrics [Simple metrics](/docs/build/simple) point directly to a measure. You may think of it as a function that takes only one measure as the input. @@ -148,6 +168,7 @@ metrics: - name: cancellations type: simple type_params: + fills_nulls_with: 0 measure: cancellations_usd # Specify the measure you are creating a proxy for. filter: | {{ Dimension('order__value')}} > 100 and {{Dimension('user__acquisition')}} @@ -164,15 +185,13 @@ filter: | filter: | {{ TimeDimension('time_dimension', 'granularity') }} ``` + ### Further configuration You can set more metadata for your metrics, which can be used by other tools later on. The way this metadata is used will vary based on the specific integration partner - **Description** — Write a detailed description of the metric. 
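For example, a simple metric that carries a description along with extra `meta` fields might be defined like the following minimal sketch — the metric name and the keys under `meta` are illustrative, and how they surface depends on the downstream integration:

```yaml
metrics:
  - name: order_total                  # illustrative metric name
    description: "The total order value, summed across all completed orders."
    label: Order Total
    type: simple
    type_params:
      measure: order_total
    meta:
      owner: finance-analytics         # illustrative metadata passed through to downstream tools
      tier: gold
```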
- - - ## Related docs - [Semantic models](/docs/build/semantic-models) diff --git a/website/docs/docs/build/project-variables.md b/website/docs/docs/build/project-variables.md index 59d6be49b17..a328731c7d4 100644 --- a/website/docs/docs/build/project-variables.md +++ b/website/docs/docs/build/project-variables.md @@ -25,13 +25,6 @@ Jinja is not supported within the `vars` config, and all values will be interpre ::: -:::info New in v0.17.0 - -The syntax for specifying vars in the `dbt_project.yml` file has changed in -dbt v0.17.0. See the [migration guide](/docs/dbt-versions/core-upgrade) -for more information on these changes. - -::: To define variables in a dbt project, add a `vars` config to your `dbt_project.yml` file. These `vars` can be scoped globally, or to a specific package imported in your diff --git a/website/docs/docs/build/ratio-metrics.md b/website/docs/docs/build/ratio-metrics.md index 97efe0f55bf..5de4128c1f5 100644 --- a/website/docs/docs/build/ratio-metrics.md +++ b/website/docs/docs/build/ratio-metrics.md @@ -21,6 +21,7 @@ Ratio allows you to create a ratio between two metrics. You simply specify a num | `denominator` | The name of the metric used for the denominator, or structure of properties. | Required | | `filter` | Optional filter for the numerator or denominator. | Optional | | `alias` | Optional alias for the numerator or denominator. | Optional | +| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | The following displays the complete specification for ratio metrics, along with an example. @@ -31,6 +32,7 @@ metrics: type: ratio # Required label: The value that will be displayed in downstream tools #Required type_params: # Required + fill_nulls_with: Set value instead of null (such as zero) # Optional numerator: The name of the metric used for the numerator, or structure of properties # Required name: Name of metric used for the numerator # Required filter: Filter for the numerator # Optional @@ -50,10 +52,11 @@ metrics: label: Food Order Ratio type: ratio type_params: + fill_nulls_with: 0 numerator: food_orders denominator: orders - ``` + ## Ratio metrics using different semantic models The system will simplify and turn the numerator and denominator in a ratio metric from different semantic models by computing their values in sub-queries. It will then join the result set based on common dimensions to calculate the final ratio. Here's an example of the SQL generated for such a ratio metric. 
@@ -61,16 +64,16 @@ The system will simplify and turn the numerator and denominator in a ratio metri ```sql select - subq_15577.metric_time as metric_time - , cast(subq_15577.mql_queries_created_test as double) / cast(nullif(subq_15582.distinct_query_users, 0) as double) as mql_queries_per_active_user + subq_15577.metric_time as metric_time, + cast(subq_15577.mql_queries_created_test as double) / cast(nullif(subq_15582.distinct_query_users, 0) as double) as mql_queries_per_active_user from ( select - metric_time - , sum(mql_queries_created_test) as mql_queries_created_test + metric_time, + sum(mql_queries_created_test) as mql_queries_created_test from ( select - cast(query_created_at as date) as metric_time - , case when query_status in ('PENDING','MODE') then 1 else 0 end as mql_queries_created_test + cast(query_created_at as date) as metric_time, + case when query_status in ('PENDING','MODE') then 1 else 0 end as mql_queries_created_test from prod_dbt.mql_query_base mql_queries_test_src_2552 ) subq_15576 group by @@ -78,12 +81,12 @@ from ( ) subq_15577 inner join ( select - metric_time - , count(distinct distinct_query_users) as distinct_query_users + metric_time, + count(distinct distinct_query_users) as distinct_query_users from ( select - cast(query_created_at as date) as metric_time - , case when query_status in ('MODE','PENDING') then email else null end as distinct_query_users + cast(query_created_at as date) as metric_time, + case when query_status in ('MODE','PENDING') then email else null end as distinct_query_users from prod_dbt.mql_query_base mql_queries_src_2585 ) subq_15581 group by @@ -115,6 +118,7 @@ metrics: - support@getdbt.com type: ratio type_params: + fill_nulls_with: 0 numerator: name: distinct_purchasers filter: | @@ -124,4 +128,7 @@ metrics: name: distinct_purchasers ``` -Note the `filter` and `alias` parameters for the metric referenced in the numerator. Use the `filter` parameter to apply a filter to the metric it's attached to. The `alias` parameter is used to avoid naming conflicts in the rendered SQL queries when the same metric is used with different filters. If there are no naming conflicts, the `alias` parameter can be left out. +Note the `filter` and `alias` parameters for the metric referenced in the numerator. +- Use the `filter` parameter to apply a filter to the metric it's attached to. +- The `alias` parameter is used to avoid naming conflicts in the rendered SQL queries when the same metric is used with different filters. +- If there are no naming conflicts, the `alias` parameter can be left out. diff --git a/website/docs/docs/build/saved-queries.md b/website/docs/docs/build/saved-queries.md index 9d7ec2060e7..b142437082a 100644 --- a/website/docs/docs/build/saved-queries.md +++ b/website/docs/docs/build/saved-queries.md @@ -22,17 +22,17 @@ All metrics in a saved query need to use the same dimensions in the `group_by` o ```yaml saved_queries: - name: p0_booking - description: Booking-related metrics that are of the highest priority. - query_params: - metrics: - - bookings - - instant_bookings - group_by: - - TimeDimension('metric_time', 'day') - - Dimension('listing__capacity_latest') - where: - - "{{ Dimension('listing__capacity_latest') }} > 3" + - name: p0_booking + description: Booking-related metrics that are of the highest priority. 
+ query_params: + metrics: + - bookings + - instant_bookings + group_by: + - TimeDimension('metric_time', 'day') + - Dimension('listing__capacity_latest') + where: + - "{{ Dimension('listing__capacity_latest') }} > 3" ``` ## Parameters diff --git a/website/docs/docs/build/semantic-models.md b/website/docs/docs/build/semantic-models.md index 5c6883cdcee..afb877db504 100644 --- a/website/docs/docs/build/semantic-models.md +++ b/website/docs/docs/build/semantic-models.md @@ -20,7 +20,7 @@ Semantic models are the foundation for data definition in MetricFlow, which powe -Semantic models have 6 components and this page explains the definitions with some examples: +Here we describe the Semantic model components with examples: | Component | Description | Type | | --------- | ----------- | ---- | diff --git a/website/docs/docs/build/simple.md b/website/docs/docs/build/simple.md index 1803e952a69..fafb770dd04 100644 --- a/website/docs/docs/build/simple.md +++ b/website/docs/docs/build/simple.md @@ -19,6 +19,7 @@ Simple metrics are metrics that directly reference a single measure, without any | `label` | The value that will be displayed in downstream tools. | Required | | `type_params` | The type parameters of the metric. | Required | | `measure` | The measure you're referencing. | Required | +| `fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | The following displays the complete specification for simple metrics, along with an example. @@ -28,9 +29,10 @@ metrics: - name: The metric name # Required description: the metric description # Optional type: simple # Required - label: The value that will be displayed in downstream tools #Required + label: The value that will be displayed in downstream tools # Required type_params: # Required measure: The measure you're referencing # Required + fill_nulls_with: Set value instead of null (such as zero) # Optional ``` @@ -50,13 +52,16 @@ If you've already defined the measure using the `create_metric: true` parameter, type: simple # Pointers to a measure you created in a semantic model label: Count of customers type_params: - measure: customers # The measure youre creating a proxy of. + fills_nulls_with: 0 + measure: customers # The measure you're creating a proxy of. - name: large_orders description: "Order with order values over 20." type: SIMPLE label: Large Orders type_params: + fill_nulls_with: 0 measure: orders filter: | # For any metric you can optionally include a filter on dimension values {{Dimension('customer__order_total_dim')}} >= 20 ``` + diff --git a/website/docs/docs/cloud/cloud-cli-installation.md b/website/docs/docs/cloud/cloud-cli-installation.md index 7d459cdd91d..edf6511d4b8 100644 --- a/website/docs/docs/cloud/cloud-cli-installation.md +++ b/website/docs/docs/cloud/cloud-cli-installation.md @@ -150,7 +150,7 @@ If you already have dbt Core installed, the dbt Cloud CLI may conflict. Here are - **Prevent conflicts**
To use both the dbt Cloud CLI and dbt Core with `pip`, create a separate virtual environment for each installation so the two `dbt` executables don't conflict.

- **Use both dbt Cloud CLI and dbt Core with brew or native installs**
If you use Homebrew, consider aliasing the dbt Cloud CLI as "dbt-cloud" to avoid conflicts, as sketched below. If your operating system experiences path conflicts, check the [FAQs](#faqs) for more details.
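For illustration, here's a minimal sketch of both approaches. The virtual environment paths, the example adapter, and the Homebrew prefix are assumptions that vary by machine, so adjust them to your setup.

```shell
# Sketch: keep dbt Core and the dbt Cloud CLI in separate pip virtual environments
python3 -m venv ~/.venvs/dbt-core
~/.venvs/dbt-core/bin/pip install dbt-core dbt-snowflake   # dbt Core plus an example adapter

python3 -m venv ~/.venvs/dbt-cloud-cli
~/.venvs/dbt-cloud-cli/bin/pip install dbt                 # the dbt Cloud CLI package

# Sketch: if you installed the dbt Cloud CLI with Homebrew, alias it so it doesn't shadow dbt Core on your PATH
alias dbt-cloud="$(brew --prefix)/bin/dbt"
```

Activating one environment or the other (or invoking the executables by their full paths, as above) keeps each `dbt` command unambiguous.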

-- **Reverting back to dbt Core from the dbt Cloud CLI**
+- **Reverting to dbt Core from the dbt Cloud CLI**
If you've already installed the dbt Cloud CLI and need to switch back to dbt Core:
- Uninstall the dbt Cloud CLI using the command: `pip uninstall dbt` - Reinstall dbt Core using the following command, replacing "adapter_name" with the appropriate adapter name: @@ -223,7 +223,7 @@ During the public preview period, we recommend updating before filing a bug repo -To update the dbt Cloud CLI, run `brew upgrade dbt`. (You can also use `brew install dbt`). +To update the dbt Cloud CLI, run `brew update` and then `brew upgrade dbt`. @@ -235,7 +235,7 @@ To update, follow the same process explained in [Windows](/docs/cloud/cloud-cli- -To update, follow the same process explained in [Windows](/docs/cloud/cloud-cli-installation?install=linux#install-dbt-cloud-cli) and replace the existing `dbt` executable with the new one. +To update, follow the same process explained in [Linux](/docs/cloud/cloud-cli-installation?install=linux#install-dbt-cloud-cli) and replace the existing `dbt` executable with the new one. @@ -251,10 +251,14 @@ To update: ## Using VS Code extensions -Visual Studio (VS) Code extensions enhance command line tools by adding extra functionalities. The dbt Cloud CLI is fully compatible with dbt Core, however it doesn't support some dbt Core APIs required by certain tools, for example VS Code extensions. +Visual Studio (VS) Code extensions enhance command line tools by adding extra functionalities. The dbt Cloud CLI is fully compatible with dbt Core, however, it doesn't support some dbt Core APIs required by certain tools, for example, VS Code extensions. -To use these extensions, such as dbt-power-user, with the dbt Cloud CLI, you can install it using Homebrew (along with dbt Core) and create an alias to run the dbt Cloud CLI as `dbt-cloud`. This allows dbt-power-user to continue to invoke dbt Core under the hood, alongside the dbt Cloud CLI. +You can use extensions like [dbt-power-user](https://www.dbt-power-user.com/) with the dbt Cloud CLI by following these steps: +- [Install](/docs/cloud/cloud-cli-installation?install=brew) it using Homebrew along with dbt Core. +- [Create an alias](#faqs) to run the dbt Cloud CLI as `dbt-cloud`. + +This setup allows dbt-power-user to continue to work with dbt Core in the background, alongside the dbt Cloud CLI. ## FAQs diff --git a/website/docs/docs/cloud/configure-cloud-cli.md b/website/docs/docs/cloud/configure-cloud-cli.md index d6fca00cf25..a442a6e6ad1 100644 --- a/website/docs/docs/cloud/configure-cloud-cli.md +++ b/website/docs/docs/cloud/configure-cloud-cli.md @@ -66,9 +66,8 @@ Once you install the dbt Cloud CLI, you need to configure it to connect to a dbt ```yaml # dbt_project.yml name: - version: - ... + # Your project configs... dbt-cloud: project-id: PROJECT_ID @@ -86,6 +85,7 @@ To set environment variables in the dbt Cloud CLI for your dbt project: 2. Then select **Profile Settings**, then **Credentials**. 3. Click on your project and scroll to the **Environment Variables** section. 4. Click **Edit** on the lower right and then set the user-level environment variables. + - Note, when setting up the [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl), using [environment variables](/docs/build/environment-variables) like `{{env_var('DBT_WAREHOUSE')}}` is not supported. You should use the actual credentials instead. 
## Use the dbt Cloud CLI diff --git a/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md b/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md index 5f1c4cae725..c265529fb49 100644 --- a/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md +++ b/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md @@ -42,10 +42,12 @@ alter user jsmith set rsa_public_key='MIIBIjANBgkqh...'; ``` 2. Finally, set the **Private Key** and **Private Key Passphrase** fields in the **Credentials** page to finish configuring dbt Cloud to authenticate with Snowflake using a key pair. - - **Note:** At this time ONLY Encrypted Private Keys are supported by dbt Cloud, and the keys must be of size 4096 or smaller. -3. To successfully fill in the Private Key field, you **must** include commented lines when you add the passphrase. Leaving the **Private Key Passphrase** field empty will return an error. If you're receiving a `Could not deserialize key data` or `JWT token` error, refer to [Troubleshooting](#troubleshooting) for more info. +**Note:** Unencrypted private keys are permitted. Use a passphrase only if needed. +As of dbt version 1.5.0, you can use a `private_key` string in place of `private_key_path`. This `private_key` string can be either Base64-encoded DER format for the key bytes or plain-text PEM format. For more details on key generation, refer to the [Snowflake documentation](https://community.snowflake.com/s/article/How-to-configure-Snowflake-key-pair-authentication-fields-in-dbt-connection). + + +4. To successfully fill in the Private Key field, you _must_ include commented lines. If you receive a `Could not deserialize key data` or `JWT token` error, refer to [Troubleshooting](#troubleshooting) for more info. **Example:** diff --git a/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md b/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md index 2038d4ad64c..8a549e40736 100644 --- a/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md +++ b/website/docs/docs/cloud/dbt-cloud-ide/ide-user-interface.md @@ -10,7 +10,7 @@ The [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) is a tool fo This page offers comprehensive definitions and terminology of user interface elements, allowing you to navigate the IDE landscape with ease. - + ## Basic layout @@ -36,7 +36,7 @@ The IDE streamlines your workflow, and features a popular user interface layout * Added (A) — The IDE detects added files * Deleted (D) — The IDE detects deleted files. - + 5. **Command bar —** The Command bar, located in the lower left of the IDE, is used to invoke [dbt commands](/reference/dbt-commands). When a command is invoked, the associated logs are shown in the Invocation History Drawer. @@ -49,7 +49,7 @@ The IDE streamlines your workflow, and features a popular user interface layout The IDE features some delightful tools and layouts to make it easier for you to write dbt code and collaborate with teammates. - + 1. **File Editor —** The File Editor is where users edit code. Tabs break out the region for each opened file, and unsaved files are marked with a blue dot icon in the tab view. @@ -66,24 +66,24 @@ The IDE features some delightful tools and layouts to make it easier for you to ## Additional editing features - **Minimap —** A Minimap (code outline) gives you a high-level overview of your source code, which is useful for quick navigation and code understanding. A file's minimap is displayed on the upper-right side of the editor. 
To quickly jump to different sections of your file, click the shaded area. - + - **dbt Editor Command Palette —** The dbt Editor Command Palette displays text editing actions and their associated keyboard shortcuts. This can be accessed by pressing `F1` or right-clicking in the text editing area and selecting Command Palette. - + - **Git Diff View —** Clicking on a file in the **Changes** section of the **Version Control Menu** will open the changed file with Git Diff view. The editor will show the previous version on the left and the in-line changes made on the right. - + - **Markdown Preview console tab —** The Markdown Preview console tab shows a preview of your .md file's markdown code in your repository and updates it automatically as you edit your code. - + - **CSV Preview console tab —** The CSV Preview console tab displays the data from your CSV file in a table, which updates automatically as you edit the file in your seed directory. - + ## Console section The console section, located below the File editor, includes various console tabs and buttons to help you with tasks such as previewing, compiling, building, and viewing the . Refer to the following sub-bullets for more details on the console tabs and buttons. - + 1. **Preview button —** When you click on the Preview button, it runs the SQL in the active file editor regardless of whether you have saved it or not and sends the results to the **Results** console tab. You can preview a selected portion of saved or unsaved code by highlighting it and then clicking the **Preview** button. @@ -107,17 +107,17 @@ Starting from dbt v1.6 or higher, when you save changes to a model, you can comp 3. **Format button —** The editor has a **Format** button that can reformat the contents of your files. For SQL files, it uses either `sqlfmt` or `sqlfluff`, and for Python files, it uses `black`. 5. **Results tab —** The Results console tab displays the most recent Preview results in tabular format. - + 6. **Compiled Code tab —** The Compile button triggers a compile invocation that generates compiled code, which is displayed in the Compiled Code tab. - + 7. **Lineage tab —** The Lineage tab in the File Editor displays the active model's lineage or . By default, it shows two degrees of lineage in both directions (`2+model_name+2`), however, you can change it to +model+ (full DAG). - Double-click a node in the DAG to open that file in a new tab - Expand or shrink the DAG using node selection syntax. - Note, the `--exclude` flag isn't supported. - + ## Invocation history @@ -128,7 +128,7 @@ You can open the drawer in multiple ways: - Typing a dbt command and pressing enter - Or pressing Control-backtick (or Ctrl + `) - + 1. **Invocation History list —** The left-hand panel of the Invocation History Drawer displays a list of previous invocations in the IDE, including the command, branch name, command status, and elapsed time. @@ -138,7 +138,7 @@ You can open the drawer in multiple ways: 4. **Command Control button —** Use the Command Control button, located on the right side, to control your invocation and cancel or rerun a selected run. - + 5. **Node Summary tab —** Clicking on the Results Status Tabs will filter the Node Status List based on their corresponding status. The available statuses are Pass (successful invocation of a node), Warn (test executed with a warning), Error (database error or test failure), Skip (nodes not run due to upstream error), and Queued (nodes that have not executed yet). 
@@ -150,25 +150,25 @@ You can open the drawer in multiple ways: ## Modals and Menus Use menus and modals to interact with IDE and access useful options to help your development workflow. -- **Editor tab menu —** To interact with open editor tabs, right-click any tab to access the helpful options in the file tab menu. +- **Editor tab menu —** To interact with open editor tabs, right-click any tab to access the helpful options in the file tab menu. - **File Search —** You can easily search for and navigate between files using the File Navigation menu, which can be accessed by pressing Command-O or Control-O or clicking on the 🔍 icon in the File Explorer. - + - **Global Command Palette—** The Global Command Palette provides helpful shortcuts to interact with the IDE, such as git actions, specialized dbt commands, and compile, and preview actions, among others. To open the menu, use Command-P or Control-P. - + - **IDE Status modal —** The IDE Status modal shows the current error message and debug logs for the server. This also contains an option to restart the IDE. Open this by clicking on the IDE Status button. - + - **Commit Changes modal —** The Commit Changes modal is accessible via the Git Actions button to commit all changes or via the Version Control Options menu to commit individual changes. Once you enter a commit message, you can use the modal to commit and sync the selected changes. - + - **Change Branch modal —** The Change Branch modal allows users to switch git branches in the IDE. It can be accessed through the `Change Branch` link or the Git Actions button in the Version Control menu. - + - **Revert Uncommitted Changes modal —** The Revert Uncommitted Changes modal is how users revert changes in the IDE. This is accessible via the `Revert File` option above the Version Control Options menu, or via the Git Actions button when there are saved, uncommitted changes in the IDE. - + - **IDE Options menu —** The IDE Options menu can be accessed by clicking on the three-dot menu located at the bottom right corner of the IDE. This menu contains global options such as: diff --git a/website/docs/docs/cloud/dbt-cloud-ide/keyboard-shortcuts.md b/website/docs/docs/cloud/dbt-cloud-ide/keyboard-shortcuts.md index 121cab68ce7..61fe47a235a 100644 --- a/website/docs/docs/cloud/dbt-cloud-ide/keyboard-shortcuts.md +++ b/website/docs/docs/cloud/dbt-cloud-ide/keyboard-shortcuts.md @@ -13,14 +13,14 @@ Use this dbt Cloud IDE page to help you quickly reference some common operation |--------|----------------|------------------| | View a full list of editor shortcuts | Fn-F1 | Fn-F1 | | Select a file to open | Command-O | Control-O | -| Open the command palette to invoke dbt commands and actions | Command-P or Command-Shift-P | Control-P or Control-Shift-P | -| Multi-edit by selecting multiple lines | Option-click or Shift-Option-Command | Hold Alt and click | +| Close currently active editor tab | Option-W | Alt-W | | Preview code | Command-Enter | Control-Enter | | Compile code | Command-Shift-Enter | Control-Shift-Enter | -| Reveal a list of dbt functions | Enter two underscores `__` | Enter two underscores `__` | -| Toggle open the [Invocation history drawer](/docs/cloud/dbt-cloud-ide/ide-user-interface#invocation-history) located on the bottom of the IDE. | Control-backtick (or Control + `) | Control-backtick (or Ctrl + `) | -| Add a block comment to selected code. SQL files will use the Jinja syntax `({# #})` rather than the SQL one `(/* */)`.

Markdown files will use the Markdown syntax `()` | Command-Option-/ | Control-Alt-/ | -| Close the currently active editor tab | Option-W | Alt-W | +| Reveal a list of dbt functions in the editor | Enter two underscores `__` | Enter two underscores `__` | +| Open the command palette to invoke dbt commands and actions | Command-P / Command-Shift-P | Control-P / Control-Shift-P | +| Multi-edit in the editor by selecting multiple lines | Option-Click / Shift-Option-Command / Shift-Option-Click | Hold Alt and Click | +| Open the [**Invocation History Drawer**](/docs/cloud/dbt-cloud-ide/ide-user-interface#invocation-history) located at the bottom of the IDE. | Control-backtick (or Control + `) | Control-backtick (or Ctrl + `) | +| Add a block comment to the selected code. SQL files will use the Jinja syntax `({# #})` rather than the SQL one `(/* */)`.

Markdown files will use the Markdown syntax `()` | Command-Option-/ | Control-Alt-/ | ## Related docs diff --git a/website/docs/docs/cloud/dbt-cloud-ide/lint-format.md b/website/docs/docs/cloud/dbt-cloud-ide/lint-format.md index 733ec9dbcfe..37d8c8d814e 100644 --- a/website/docs/docs/cloud/dbt-cloud-ide/lint-format.md +++ b/website/docs/docs/cloud/dbt-cloud-ide/lint-format.md @@ -14,7 +14,7 @@ Linters analyze code for errors, bugs, and style issues, while formatters fix st -In the dbt Cloud IDE, you have the capability to perform linting, auto-fix, and formatting on five different file types: +In the dbt Cloud IDE, you can perform linting, auto-fix, and formatting on five different file types: - SQL — [Lint](#lint) and fix with SQLFluff, and [format](#format) with sqlfmt - YAML, Markdown, and JSON — Format with Prettier @@ -63,7 +63,7 @@ Linting doesn't support ephemeral models in dbt v1.5 and lower. Refer to the [FA - **Fix** button — Automatically fixes linting errors in the **File editor**. When fixing is complete, you'll see a message confirming the outcome. - Use the **Code Quality** tab to view and debug any code errors. - + ### Customize linting @@ -130,7 +130,7 @@ group_by_and_order_by_style = implicit For more info on styling best practices, refer to [How we style our SQL](/best-practices/how-we-style/2-how-we-style-our-sql). ::: - + ## Format @@ -146,7 +146,7 @@ The Cloud IDE formatting integrations take care of manual tasks like code format To format your SQL code, dbt Cloud integrates with [sqlfmt](http://sqlfmt.com/), which is an uncompromising SQL query formatter that provides one way to format the SQL query and Jinja. -By default, the IDE uses sqlfmt rules to format your code, making the **Format** button available and convenient to use right away. However, if you have a file named .sqlfluff in the root directory of your dbt project, the IDE will default to SQLFluff rules instead. +By default, the IDE uses sqlfmt rules to format your code, making the **Format** button available and convenient to use immediately. However, if you have a file named .sqlfluff in the root directory of your dbt project, the IDE will default to SQLFluff rules instead. To enable sqlfmt: @@ -158,7 +158,7 @@ To enable sqlfmt: 6. Once you've selected the **sqlfmt** radio button, go to the console section (located below the **File editor**) to select the **Format** button. 7. The **Format** button auto-formats your code in the **File editor**. Once you've auto-formatted, you'll see a message confirming the outcome. - + ### Format YAML, Markdown, JSON @@ -169,7 +169,7 @@ To format your YAML, Markdown, or JSON code, dbt Cloud integrates with [Prettier 3. In the console section (located below the **File editor**), select the **Format** button to auto-format your code in the **File editor**. Use the **Code Quality** tab to view code errors. 4. Once you've auto-formatted, you'll see a message confirming the outcome. - + You can add a configuration file to customize formatting rules for YAML, Markdown, or JSON files using Prettier. The IDE looks for the configuration file based on an order of precedence. For example, it first checks for a "prettier" key in your `package.json` file. @@ -185,14 +185,12 @@ To format your Python code, dbt Cloud integrates with [Black](https://black.read 3. In the console section (located below the **File editor**), select the **Format** button to auto-format your code in the **File editor**. 4. Once you've auto-formatted, you'll see a message confirming the outcome. - + ## FAQs -
-When should I use SQLFluff and when should I use sqlfmt? - -SQLFluff and sqlfmt are both tools used for formatting SQL code, but there are some differences that may make one preferable to the other depending on your use case.
+ +SQLFluff and sqlfmt are both tools used for formatting SQL code, but some differences may make one preferable to the other depending on your use case.
SQLFluff is a SQL code linter and formatter. It analyzes your code to identify potential issues and bugs and checks that it follows coding standards. It also formats your code according to a set of [customizable](#customize-linting) rules, helping you keep your SQL well-formatted and consistent with styling best practices.
@@ -204,34 +202,37 @@ You can use either SQLFluff or sqlfmt depending on your preference and what work - Use sqlfmt to only have your code well-formatted without analyzing it for errors and bugs. You can use sqlfmt out of the box, making it convenient to use right away without having to configure it. -
+ -
-Can I nest .sqlfluff files? + To ensure optimal code quality, consistent code, and styles — it's highly recommended you have one main `.sqlfluff` configuration file in the root folder of your project. Having multiple files can result in various different SQL styles in your project.

However, you can customize and include an additional child `.sqlfluff` configuration file within specific subfolders of your dbt project.

By nesting a `.sqlfluff` file in a subfolder, SQLFluff will apply the rules defined in that subfolder's configuration file to any files located within it. The rules specified in the parent `.sqlfluff` file will be used for all other files and folders outside of the subfolder. This hierarchical approach allows for tailored linting rules while maintaining consistency throughout your project. Refer to [SQLFluff documentation](https://docs.sqlfluff.com/en/stable/configuration.html#configuration-files) for more info. -
+ -
-Can I run SQLFluff commands from the terminal? + Currently, running SQLFluff commands from the terminal isn't supported. -
+ -
-Why am I unable to see the Lint or Format button? + Make sure you're on a development branch. Formatting or Linting isn't available on "main" or "read-only" branches. -
+ -
-Why is there inconsistent SQLFluff behavior when running outside the dbt Cloud IDE (such as a GitHub Action)? -— Double-check your SQLFluff version matches the one in dbt Cloud IDE (found in the Code Quality tab after a lint operation).

-— If your lint operation passes despite clear rule violations, confirm you're not linting models with ephemeral models. Linting doesn't support ephemeral models in dbt v1.5 and lower. -
+ +- Double-check that your SQLFluff version matches the one in dbt Cloud IDE (found in the Code Quality tab after a lint operation).

+- If your lint operation passes despite clear rule violations, confirm you're not linting models with ephemeral models. Linting doesn't support ephemeral models in dbt v1.5 and lower. +
+ + +Currently, the dbt Cloud IDE can lint or fix files up to a certain size and complexity. If you attempt to lint or fix files that are too large, taking more than 60 seconds for the dbt Cloud backend to process, you will see an 'Unable to complete linting this file' error. + +To avoid this, break up your model into smaller models (files) so that they are less complex to lint or fix. Note that linting is simpler than fixing so there may be cases where a file can be linted but not fixed. + + ## Related docs diff --git a/website/docs/docs/cloud/manage-access/cloud-seats-and-users.md b/website/docs/docs/cloud/manage-access/cloud-seats-and-users.md index 63786f40bd8..e1fe83a24f2 100644 --- a/website/docs/docs/cloud/manage-access/cloud-seats-and-users.md +++ b/website/docs/docs/cloud/manage-access/cloud-seats-and-users.md @@ -11,7 +11,7 @@ In dbt Cloud, _licenses_ are used to allocate users to your account. There are t - **Developer** — Granted access to the Deployment and [Development](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) functionality in dbt Cloud. - **Read-Only** — Intended to view the [artifacts](/docs/deploy/artifacts) created in a dbt Cloud account. Read-Only users can receive job notifications but not configure them. -- **IT** — Can manage users, groups, and licenses, among other permissions. IT users can receive job notifications but not configure them. Available on Enterprise and Team plans only. +- **IT** — Can manage users, groups, and licenses, among other permissions. IT users can receive job notifications but not configure them. Available on Enterprise and Team plans only. In Enterprise plans, the IT license type grants access equivalent to the ['Security admin' and 'Billing admin' roles](/docs/cloud/manage-access/enterprise-permissions#account-permissions-for-account-roles). The user's assigned license determines the specific capabilities they can access in dbt Cloud. @@ -29,6 +29,12 @@ The user's assigned license determines the specific capabilities they can access ## Licenses +:::tip Licenses or Permission sets + +The user's license type always overrides their assigned [Enterprise permission](/docs/cloud/manage-access/enterprise-permissions) set. This means that even if a user belongs to a dbt Cloud group with 'Account Admin' permissions, having a 'Read-Only' license would still prevent them from performing administrative actions on the account. + +::: + Each dbt Cloud plan comes with a base number of Developer, IT, and Read-Only licenses. You can add or remove licenses by modifying the number of users in your account settings. If you have a Developer plan account and want to add more people to your team, you'll need to upgrade to the Team plan. Refer to [dbt Pricing Plans](https://www.getdbt.com/pricing/) for more information about licenses available with each plan. @@ -130,8 +136,7 @@ to allocate for the user. If your account does not have an available license to allocate, you will need to add more licenses to your plan to complete the license change. - + ### Mapped configuration diff --git a/website/docs/docs/cloud/manage-access/enterprise-permissions.md b/website/docs/docs/cloud/manage-access/enterprise-permissions.md index dcacda20deb..4ed7ab228e5 100644 --- a/website/docs/docs/cloud/manage-access/enterprise-permissions.md +++ b/website/docs/docs/cloud/manage-access/enterprise-permissions.md @@ -20,6 +20,11 @@ control (RBAC). The following roles and permission sets are available for assignment in dbt Cloud Enterprise accounts. 
They can be granted to dbt Cloud groups which are then in turn granted to users. A dbt Cloud group can be associated with more than one role and permission set. Roles with more access take precedence. +:::tip Licenses or Permission sets + +The user's [license](/docs/cloud/manage-access/seats-and-users) type always overrides their assigned permission set. This means that even if a user belongs to a dbt Cloud group with 'Account Admin' permissions, having a 'Read-Only' license would still prevent them from performing administrative actions on the account. +::: + ## How to set up RBAC Groups in dbt Cloud diff --git a/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md b/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md index 87018b14d56..f717bf3a5b1 100644 --- a/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md +++ b/website/docs/docs/cloud/manage-access/set-up-bigquery-oauth.md @@ -77,4 +77,5 @@ Select **Allow**. This redirects you back to dbt Cloud. You should now be an aut ## FAQs - + + diff --git a/website/docs/docs/cloud/secure/about-privatelink.md b/website/docs/docs/cloud/secure/about-privatelink.md index 2134ab25cfe..731cef3f019 100644 --- a/website/docs/docs/cloud/secure/about-privatelink.md +++ b/website/docs/docs/cloud/secure/about-privatelink.md @@ -6,10 +6,11 @@ sidebar_label: "About PrivateLink" --- import SetUpPages from '/snippets/_available-tiers-privatelink.md'; +import PrivateLinkHostnameWarning from '/snippets/_privatelink-hostname-restriction.md'; -PrivateLink enables a private connection from any dbt Cloud Multi-Tenant environment to your data platform hosted on AWS using [AWS PrivateLink](https://aws.amazon.com/privatelink/) technology. PrivateLink allows dbt Cloud customers to meet security and compliance controls as it allows connectivity between dbt Cloud and your data platform without traversing the public internet. This feature is supported in most regions across NA, Europe, and Asia, but [contact us](https://www.getdbt.com/contact/) if you have questions about availability. +PrivateLink enables a private connection from any dbt Cloud Multi-Tenant environment to your data platform hosted on AWS using [AWS PrivateLink](https://aws.amazon.com/privatelink/) technology. PrivateLink allows dbt Cloud customers to meet security and compliance controls as it allows connectivity between dbt Cloud and your data platform without traversing the public internet. This feature is supported in most regions across NA, Europe, and Asia, but [contact us](https://www.getdbt.com/contact/) if you have questions about availability. ### Cross-region PrivateLink @@ -24,3 +25,5 @@ dbt Cloud supports the following data platforms for use with the PrivateLink fea - [Redshift](/docs/cloud/secure/redshift-privatelink) - [Postgres](/docs/cloud/secure/postgres-privatelink) - [VCS](/docs/cloud/secure/vcs-privatelink) + + diff --git a/website/docs/docs/collaborate/cloud-build-and-view-your-docs.md b/website/docs/docs/collaborate/cloud-build-and-view-your-docs.md index e104ea8640c..0129b43f305 100644 --- a/website/docs/docs/collaborate/cloud-build-and-view-your-docs.md +++ b/website/docs/docs/collaborate/cloud-build-and-view-your-docs.md @@ -16,7 +16,7 @@ To set up a job to generate docs: 1. In the top left, click **Deploy** and select **Jobs**. 2. Create a new job or select an existing job and click **Settings**. 3. Under "Execution Settings," select **Generate docs on run**. - + 4. Click **Save**. 
Proceed to [configure project documentation](#configure-project-documentation) so your project generates the documentation when this job runs. @@ -44,7 +44,7 @@ You configure project documentation to generate documentation when the job you s 3. Navigate to **Projects** and select the project that needs documentation. 4. Click **Edit**. 5. Under **Artifacts**, select the job that should generate docs when it runs. - + 6. Click **Save**. ## Generating documentation @@ -52,7 +52,7 @@ You configure project documentation to generate documentation when the job you s To generate documentation in the dbt Cloud IDE, run the `dbt docs generate` command in the Command Bar in the dbt Cloud IDE. This command will generate the Docs for your dbt project as it exists in development in your IDE session. - + After generating your documentation, you can click the **Book** icon above the file tree, to see the latest version of your documentation rendered in a new browser window. @@ -65,4 +65,4 @@ These generated docs always show the last fully successful run, which means that The dbt Cloud IDE makes it possible to view [documentation](/docs/collaborate/documentation) for your dbt project while your code is still in development. With this workflow, you can inspect and verify what your project's generated documentation will look like before your changes are released to production. - + diff --git a/website/docs/docs/collaborate/explore-multiple-projects.md b/website/docs/docs/collaborate/explore-multiple-projects.md index 3be35110a37..2ec7f573957 100644 --- a/website/docs/docs/collaborate/explore-multiple-projects.md +++ b/website/docs/docs/collaborate/explore-multiple-projects.md @@ -11,7 +11,7 @@ The resource-level lineage graph for a given project displays the cross-project When you view an upstream (parent) project, its public models display a counter icon in the upper right corner indicating how many downstream (child) projects depend on them. Selecting a model reveals the lineage indicating the projects dependent on that model. These counts include all projects listing the upstream one as a dependency in its `dependencies.yml`, even without a direct `{{ ref() }}`. Selecting a project node from a public model opens its detailed lineage graph, which is subject to your [permission](/docs/cloud/manage-access/enterprise-permissions). - + When viewing a downstream (child) project that imports and refs public models from upstream (parent) projects, public models will show up in the lineage graph and display an icon on the graph edge that indicates what the relationship is to a model from another project. Hovering over this icon indicates the specific dbt Cloud project that produces that model. Double-clicking on a model from another project opens the resource-level lineage graph of the parent project, which is subject to your permissions. @@ -43,4 +43,4 @@ When you select a project node in the graph, a project details panel opens on th - Click **Open Project Lineage** to switch to the project’s lineage graph. - Click the Share icon to copy the project panel link to your clipboard so you can share the graph with someone. 
- \ No newline at end of file + diff --git a/website/docs/docs/collaborate/govern/model-contracts.md b/website/docs/docs/collaborate/govern/model-contracts.md index e3ea1e8c70c..09389036513 100644 --- a/website/docs/docs/collaborate/govern/model-contracts.md +++ b/website/docs/docs/collaborate/govern/model-contracts.md @@ -28,10 +28,18 @@ While this is ideal for quick and iterative development, for some models, consta ## Where are contracts supported? At present, model contracts are supported for: -- SQL models. Contracts are not yet supported for Python models. -- Models materialized as `table`, `view`, and `incremental` (with `on_schema_change: append_new_columns`). Views offer limited support for column names and data types, but not `constraints`. Contracts are not supported for `ephemeral`-materialized models. +- SQL models. +- Models materialized as one of the following: + - `table` + - `view` — Views offer limited support for column names and data types, but not `constraints`. + - `incremental` — with `on_schema_change: append_new_columns` or `on_schema_change: fail`. - Certain data platforms, but the supported and enforced `constraints` vary by platform. +Model contracts are _not_ supported for: +- Python models. +- `ephemeral`-materialized SQL models. + + ## How to define a contract Let's say you have a model with a query like: diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index 569d69a87e6..80dee650698 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -4,16 +4,16 @@ id: project-dependencies sidebar_label: "Project dependencies" description: "Reference public models across dbt projects" pagination_next: null +keyword: dbt mesh, project dependencies, ref, cross project ref, project dependencies --- :::info Available in Public Preview for dbt Cloud Enterprise accounts Project dependencies and cross-project `ref` are features available in [dbt Cloud Enterprise](https://www.getdbt.com/pricing), currently in [Public Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud). -Enterprise users can use these features by designating a [public model](/docs/collaborate/govern/model-access) and adding a [cross-project ref](#how-to-use-ref). +If you have an [Enterprise account](https://www.getdbt.com/pricing), you can unlock these features by designating a [public model](/docs/collaborate/govern/model-access) and adding a [cross-project ref](#how-to-write-cross-project-ref). ::: - For a long time, dbt has supported code reuse and extension by installing other projects as [packages](/docs/build/packages). When you install another project as a package, you are pulling in its full source code, and adding it to your own. This enables you to call macros and run models defined in that other project. While this is a great way to reuse code, share utility macros, and establish a starting point for common transformations, it's not a great way to enable collaboration across teams and at scale, especially at larger organizations. @@ -80,9 +80,9 @@ When you're building on top of another team's work, resolving the references in - You don't need to mirror any conditional configuration of the upstream project such as `vars`, environment variables, or `target.name`. You can reference them directly wherever the Finance team is building their models in production. 
Even if the Finance team makes changes like renaming the model, changing the name of its schema, or [bumping its version](/docs/collaborate/govern/model-versions), your `ref` would still resolve successfully. - You eliminate the risk of accidentally building those models with `dbt run` or `dbt build`. While you can select those models, you can't actually build them. This prevents unexpected warehouse costs and permissions issues. This also ensures proper ownership and cost allocation for each team's models. -### How to use ref +### How to write cross-project ref -**Writing `ref`:** Models referenced from a `project`-type dependency must use [two-argument `ref`](/reference/dbt-jinja-functions/ref#two-argument-variant), including the project name: +**Writing `ref`:** Models referenced from a `project`-type dependency must use [two-argument `ref`](/reference/dbt-jinja-functions/ref#ref-project-specific-models), including the project name: diff --git a/website/docs/docs/collaborate/model-performance.md b/website/docs/docs/collaborate/model-performance.md index 7ef675b4e1e..5b3b4228210 100644 --- a/website/docs/docs/collaborate/model-performance.md +++ b/website/docs/docs/collaborate/model-performance.md @@ -27,7 +27,7 @@ Each data point links to individual models in Explorer. You can view historical metadata for up to the past three months. Select the time horizon using the filter, which defaults to a two-week lookback. - + ## The Model performance tab @@ -38,4 +38,4 @@ You can view trends in execution times, counts, and failures by using the Model Clicking on a data point reveals a table listing all job runs for that day, with each row providing a direct link to the details of a specific run. - \ No newline at end of file + diff --git a/website/docs/docs/community-adapters.md b/website/docs/docs/community-adapters.md index d1e63f03128..1faf2fd9e25 100644 --- a/website/docs/docs/community-adapters.md +++ b/website/docs/docs/community-adapters.md @@ -17,4 +17,4 @@ Community adapters are adapter plugins contributed and maintained by members of | [TiDB](/docs/core/connect-data-platform/tidb-setup) | [Firebolt](/docs/core/connect-data-platform/firebolt-setup) | [MindsDB](/docs/core/connect-data-platform/mindsdb-setup) | [Vertica](/docs/core/connect-data-platform/vertica-setup) | [AWS Glue](/docs/core/connect-data-platform/glue-setup) | [MySQL](/docs/core/connect-data-platform/mysql-setup) | | [Upsolver](/docs/core/connect-data-platform/upsolver-setup) | [Databend Cloud](/docs/core/connect-data-platform/databend-setup) | [fal - Python models](/docs/core/connect-data-platform/fal-setup) | -| [TimescaleDB](https://dbt-timescaledb.debruyn.dev/) | | | +| [TimescaleDB](https://dbt-timescaledb.debruyn.dev/) | [Extrica](/docs/core/connect-data-platform/extrica-setup) | | diff --git a/website/docs/docs/core/connect-data-platform/connection-profiles.md b/website/docs/docs/core/connect-data-platform/connection-profiles.md index 8088ff1dfa7..32e60c8cc18 100644 --- a/website/docs/docs/core/connect-data-platform/connection-profiles.md +++ b/website/docs/docs/core/connect-data-platform/connection-profiles.md @@ -83,11 +83,8 @@ To set up your profile, copy the correct sample profile for your warehouse into You can find more information on which values to use in your targets below. -:::info Validating your warehouse credentials +Use the [debug](/reference/dbt-jinja-functions/debug-method) command to validate your warehouse connection. Run `dbt debug` from within a dbt project to test your connection. 
-Use the [debug](/reference/dbt-jinja-functions/debug-method) command to check whether you can successfully connect to your warehouse. Simply run `dbt debug` from within a dbt project to test your connection. - -::: ## Understanding targets in profiles diff --git a/website/docs/docs/core/connect-data-platform/dremio-setup.md b/website/docs/docs/core/connect-data-platform/dremio-setup.md index 839dd8cffa8..21d0ee2956b 100644 --- a/website/docs/docs/core/connect-data-platform/dremio-setup.md +++ b/website/docs/docs/core/connect-data-platform/dremio-setup.md @@ -15,12 +15,6 @@ meta: config_page: '/reference/resource-configs/no-configs' --- -:::info Vendor plugin - -Some core functionality may be limited. If you're interested in contributing, check out the source code for each repository listed below. - -::: - import SetUpPages from '/snippets/_setup-pages-intro.md'; diff --git a/website/docs/docs/core/connect-data-platform/extrica-setup.md b/website/docs/docs/core/connect-data-platform/extrica-setup.md new file mode 100644 index 00000000000..8125e6e3749 --- /dev/null +++ b/website/docs/docs/core/connect-data-platform/extrica-setup.md @@ -0,0 +1,80 @@ +--- +title: "Extrica Setup" +description: "Read this guide to learn about the Extrica Trino Query Engine setup in dbt." +id: "extrica-setup" +meta: + maintained_by: Extrica, Trianz + authors: Gaurav Mittal, Viney Kumar, Mohammed Feroz, and Mrinal Mayank + github_repo: 'extricatrianz/dbt-extrica' + pypi_package: 'dbt-extrica' + min_core_version: 'v1.7.2' + cloud_support: 'Not Supported' + min_supported_version: 'n/a' + platform_name: 'Extrica' +--- +

+## Overview of {frontMatter.meta.pypi_package}
+
+- Maintained by: {frontMatter.meta.maintained_by}
+- Authors: {frontMatter.meta.authors}
+- GitHub repo: {frontMatter.meta.github_repo}
+- PyPI package: {frontMatter.meta.pypi_package}
+- Supported dbt Core version: {frontMatter.meta.min_core_version} and newer
+- dbt Cloud support: {frontMatter.meta.cloud_support}
+- Minimum data platform version: {frontMatter.meta.min_supported_version}
+

+## Installing {frontMatter.meta.pypi_package}

+
+Use `pip` to install the adapter, which automatically installs `dbt-core` and any additional dependencies. Use the following command for installation:
+
+`python -m pip install {frontMatter.meta.pypi_package}`
+

+## Connecting to {frontMatter.meta.platform_name}

+ +#### Example profiles.yml +Here is an example of dbt-extrica profiles. At a minimum, you need to specify `type`, `method`, `username`, `password` `host`, `port`, `schema`, `catalog` and `threads`. + + +```yaml +: + outputs: + dev: + type: extrica + method: jwt + username: [username for jwt auth] + password: [password for jwt auth] + host: [extrica hostname] + port: [port number] + schema: [dev_schema] + catalog: [catalog_name] + threads: [1 or more] + + prod: + type: extrica + method: jwt + username: [username for jwt auth] + password: [password for jwt auth] + host: [extrica hostname] + port: [port number] + schema: [dev_schema] + catalog: [catalog_name] + threads: [1 or more] + target: dev + +``` + + +#### Description of Extrica Profile Fields + +| Parameter | Type | Description | +|------------|----------|------------------------------------------| +| type | string | Specifies the type of dbt adapter (Extrica). | +| method | jwt | Authentication method for JWT authentication. | +| username | string | Username for JWT authentication. The obtained JWT token is used to initialize a trino.auth.JWTAuthentication object. | +| password | string | Password for JWT authentication. The obtained JWT token is used to initialize a trino.auth.JWTAuthentication object. | +| host | string | The host parameter specifies the hostname or IP address of the Extrica's Trino server. | +| port | integer | The port parameter specifies the port number on which the Extrica's Trino server is listening. | +| schema | string | Schema or database name for the connection. | +| catalog | string | Name of the catalog representing the data source. | +| threads | integer | Number of threads for parallel execution of queries. (1 or more) | diff --git a/website/docs/docs/core/connect-data-platform/fabric-setup.md b/website/docs/docs/core/connect-data-platform/fabric-setup.md index deef1e04b22..5180d65ebb9 100644 --- a/website/docs/docs/core/connect-data-platform/fabric-setup.md +++ b/website/docs/docs/core/connect-data-platform/fabric-setup.md @@ -39,12 +39,15 @@ If you already have ODBC Driver 17 installed, then that one will work as well. #### Supported configurations -* The adapter is tested with Microsoft Fabric Synapse Data Warehouse. +* The adapter is tested with Microsoft Fabric Synapse Data Warehouses (also referred to as Warehouses). * We test all combinations with Microsoft ODBC Driver 17 and Microsoft ODBC Driver 18. * The collations we run our tests on are `Latin1_General_100_BIN2_UTF8`. The adapter support is not limited to the matrix of the above configurations. If you notice an issue with any other configuration, let us know by opening an issue on [GitHub](https://github.com/microsoft/dbt-fabric). +##### Unsupported configurations +SQL analytics endpoints are read-only and so are not appropriate for Transformation workloads, use a Warehouse instead. 
+ ## Authentication methods & profile configuration ### Common configuration diff --git a/website/docs/docs/core/connect-data-platform/profiles.yml.md b/website/docs/docs/core/connect-data-platform/profiles.yml.md index f8acb65f3d2..c9a66010a50 100644 --- a/website/docs/docs/core/connect-data-platform/profiles.yml.md +++ b/website/docs/docs/core/connect-data-platform/profiles.yml.md @@ -31,6 +31,9 @@ This section identifies the parts of your `profiles.yml` that aren't specific to [fail_fast](/reference/global-configs/failing-fast): [use_experimental_parser](/reference/global-configs/parsing): [static_parser](/reference/global-configs/parsing): + [cache_selected_only](/reference/global-configs/cache): + [printer_width](/reference/global-configs/print-output#printer-width): + [log_format](/reference/global-configs/logs): : target: # this is the default target diff --git a/website/docs/docs/core/connect-data-platform/snowflake-setup.md b/website/docs/docs/core/connect-data-platform/snowflake-setup.md index 2b426ef667b..098b09d0219 100644 --- a/website/docs/docs/core/connect-data-platform/snowflake-setup.md +++ b/website/docs/docs/core/connect-data-platform/snowflake-setup.md @@ -98,9 +98,10 @@ Along with adding the `authenticator` parameter, be sure to run `alter account s ### Key Pair Authentication -To use key pair authentication, omit a `password` and instead provide a `private_key_path` and, optionally, a `private_key_passphrase` in your target. **Note:** Versions of dbt before 0.16.0 required that private keys were encrypted and a `private_key_passphrase` was provided. This behavior was changed in dbt v0.16.0. +To use key pair authentication, skip the `password` and provide a `private_key_path`. If needed, you can also add a `private_key_passphrase`. +**Note**: Unencrypted private keys are accepted, so add a passphrase only if necessary. -Starting from [dbt v1.5.0](/docs/dbt-versions/core), you have the option to use a `private_key` string instead of a `private_key_path`. The `private_key` string should be in either Base64-encoded DER format, representing the key bytes, or a plain-text PEM format. Refer to [Snowflake documentation](https://docs.snowflake.com/developer-guide/python-connector/python-connector-example#using-key-pair-authentication-key-pair-rotation) for more info on how they generate the key. +Starting from [dbt v1.5.0](/docs/dbt-versions/core), you have the option to use a `private_key` string instead of a `private_key_path`. The `private_key` string should be in either Base64-encoded DER format, representing the key bytes, or a plain-text PEM format. Refer to [Snowflake documentation](https://docs.snowflake.com/en/user-guide/key-pair-auth) for more info on how they generate the key. 
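As an illustration (not part of the changed files), a key pair target in `profiles.yml` for dbt Core might look like the following sketch. The account, user, role, warehouse, and key path are placeholders.

```yaml
my_snowflake_profile:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: abc12345.us-east-1        # placeholder account identifier
      user: dbt_user
      role: transformer
      database: analytics
      warehouse: transforming
      schema: dbt_dev
      threads: 4
      private_key_path: /path/to/rsa_key.p8
      # Optional: only needed if the private key itself is encrypted
      private_key_passphrase: "{{ env_var('SNOWFLAKE_PRIVATE_KEY_PASSPHRASE') }}"
```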
diff --git a/website/docs/docs/core/connect-data-platform/teradata-setup.md b/website/docs/docs/core/connect-data-platform/teradata-setup.md index 1a30a1a4a54..4f467968716 100644 --- a/website/docs/docs/core/connect-data-platform/teradata-setup.md +++ b/website/docs/docs/core/connect-data-platform/teradata-setup.md @@ -38,6 +38,7 @@ import SetUpPages from '/snippets/_setup-pages-intro.md'; |1.4.x.x | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |1.5.x | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ |1.6.x | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ +|1.7.x | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ ## dbt dependent packages version compatibility @@ -45,6 +46,7 @@ import SetUpPages from '/snippets/_setup-pages-intro.md'; |--------------|------------|-------------------|----------------| | 1.2.x | 1.2.x | 0.1.0 | 0.9.x or below | | 1.6.7 | 1.6.7 | 1.1.1 | 1.1.1 | +| 1.7.0 | 1.7.3 | 1.1.1 | 1.1.1 | ### Connecting to Teradata @@ -172,6 +174,8 @@ For using cross DB macros, teradata-utils as a macro namespace will not be used, | Cross-database macros | type_string | :white_check_mark: | custom macro provided | | Cross-database macros | last_day | :white_check_mark: | no customization needed, see [compatibility note](#last_day) | | Cross-database macros | width_bucket | :white_check_mark: | no customization +| Cross-database macros | generate_series | :white_check_mark: | custom macro provided +| Cross-database macros | date_spine | :white_check_mark: | no customization #### examples for cross DB macros diff --git a/website/docs/docs/core/connect-data-platform/trino-setup.md b/website/docs/docs/core/connect-data-platform/trino-setup.md index bb36bb11a01..4caa56dcb00 100644 --- a/website/docs/docs/core/connect-data-platform/trino-setup.md +++ b/website/docs/docs/core/connect-data-platform/trino-setup.md @@ -30,7 +30,7 @@ The parameters for setting up a connection are for Starburst Enterprise, Starbur ## Host parameters -The following profile fields are always required except for `user`, which is also required unless you're using the `oauth`, `cert`, or `jwt` authentication methods. +The following profile fields are always required except for `user`, which is also required unless you're using the `oauth`, `oauth_console`, `cert`, or `jwt` authentication methods. | Field | Example | Description | | --------- | ------- | ----------- | @@ -71,6 +71,7 @@ The authentication methods that dbt Core supports are: - `jwt` — JSON Web Token (JWT) - `certificate` — Certificate-based authentication - `oauth` — Open Authentication (OAuth) +- `oauth_console` — Open Authentication (OAuth) with authentication URL printed to the console - `none` — None, no authentication Set the `method` field to the authentication method you intend to use for the connection. For a high-level introduction to authentication in Trino, see [Trino Security: Authentication types](https://trino.io/docs/current/security/authentication-types.html). @@ -85,6 +86,7 @@ Click on one of these authentication methods for further details on how to confi {label: 'JWT', value: 'jwt'}, {label: 'Certificate', value: 'certificate'}, {label: 'OAuth', value: 'oauth'}, + {label: 'OAuth (console)', value: 'oauth_console'}, {label: 'None', value: 'none'}, ]} > @@ -269,7 +271,36 @@ sandbox-galaxy: host: bunbundersders.trino.galaxy-dev.io catalog: dbt_target schema: dataders - port: 433 + port: 443 +``` + + + + + +The only authentication parameter to set for OAuth 2.0 is `method: oauth_console`. If you're using Starburst Enterprise or Starburst Galaxy, you must enable OAuth 2.0 in Starburst before you can use this authentication method. 
+ +For more information, refer to both [OAuth 2.0 authentication](https://trino.io/docs/current/security/oauth2.html) in the Trino docs and the [README](https://github.com/trinodb/trino-python-client#oauth2-authentication) for the Trino Python client. + +The only difference between `oauth_console` and `oauth` is: +- `oauth` — An authentication URL automatically opens in a browser. +- `oauth_console` — A URL is printed to the console. + +It's recommended that you install `keyring` to cache the OAuth 2.0 token over multiple dbt invocations by running `python -m pip install 'trino[external-authentication-token-cache]'`. The `keyring` package is not installed by default. + +#### Example profiles.yml for OAuth + +```yaml +sandbox-galaxy: + target: oauth_console + outputs: + oauth: + type: trino + method: oauth_console + host: bunbundersders.trino.galaxy-dev.io + catalog: dbt_target + schema: dataders + port: 443 ``` diff --git a/website/docs/docs/core/connect-data-platform/vertica-setup.md b/website/docs/docs/core/connect-data-platform/vertica-setup.md index b1424289137..8e499d68b3e 100644 --- a/website/docs/docs/core/connect-data-platform/vertica-setup.md +++ b/website/docs/docs/core/connect-data-platform/vertica-setup.md @@ -6,7 +6,7 @@ meta: authors: 'Vertica (Former authors: Matthew Carter, Andy Regan, Andrew Hedengren)' github_repo: 'vertica/dbt-vertica' pypi_package: 'dbt-vertica' - min_core_version: 'v1.6.0 and newer' + min_core_version: 'v1.7.0' cloud_support: 'Not Supported' min_supported_version: 'Vertica 23.4.0' slack_channel_name: 'n/a' @@ -46,10 +46,12 @@ your-profile: username: [your username] password: [your password] database: [database name] + oauth_access_token: [access token] schema: [dbt schema] connection_load_balance: True backup_server_node: [list of backup hostnames or IPs] retries: [1 or more] + threads: [1 or more] target: dev ``` @@ -70,6 +72,7 @@ your-profile: | username | The username to use to connect to the server. | Yes | None | dbadmin| password |The password to use for authenticating to the server. |Yes|None|my_password| database |The name of the database running on the server. |Yes | None | my_db | +| oauth_access_token | To authenticate via OAuth, provide an OAuth Access Token that authorizes a user to the database. | No | "" | Default: "" | schema| The schema to build models into.| No| None |VMart| connection_load_balance| A Boolean value that indicates whether the connection can be redirected to a host in the database other than host.| No| True |True| backup_server_node| List of hosts to connect to if the primary host specified in the connection (host, port) is unreachable. Each item in the list should be either a host string (using default port 5433) or a (host, port) tuple. A host can be a host name or an IP address.| No| None |['123.123.123.123','www.abc.com',('123.123.123.124',5433)]| diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx index a82bba6576d..a89d8f31962 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-environment.mdx @@ -18,13 +18,6 @@ When querying for `environment`, you can use the following arguments. -:::caution - -dbt Labs is making changes to the Discovery API. These changes will take effect on August 15, 2023. - -The data type `Int` for `id` is being deprecated and will be replaced with `BigInt`. 
When the time comes, you will need to update your API call accordingly to avoid errors. -::: - ### Example queries You can use your production environment's `id`: diff --git a/website/docs/docs/dbt-cloud-apis/sl-api-overview.md b/website/docs/docs/dbt-cloud-apis/sl-api-overview.md index 6644d3e4b8b..0ddbc6888db 100644 --- a/website/docs/docs/dbt-cloud-apis/sl-api-overview.md +++ b/website/docs/docs/dbt-cloud-apis/sl-api-overview.md @@ -15,7 +15,7 @@ import DeprecationNotice from '/snippets/_sl-deprecation-notice.md'; -The rapid growth of different tools in the modern data stack has helped data professionals address the diverse needs of different teams. The downside of this growth is the fragmentation of business logic across teams, tools, and workloads. +The rapid growth of different tools in the modern data stack has helped data professionals address the diverse needs of different teams. The downside of this growth is the fragmentation of business logic across teams, tools, and workloads.
The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) allows you to define metrics in code (with [MetricFlow](/docs/build/about-metricflow)) and dynamically generate and query datasets in downstream tools based on their dbt governed assets, such as metrics and models. Integrating with the dbt Semantic Layer will help organizations that use your product make more efficient and trustworthy decisions with their data. It also helps you to avoid duplicative coding, optimize development workflow, ensure data governance, and guarantee consistency for data consumers. diff --git a/website/docs/docs/dbt-cloud-apis/sl-graphql.md b/website/docs/docs/dbt-cloud-apis/sl-graphql.md index 3555b211f4f..f26a19a1930 100644 --- a/website/docs/docs/dbt-cloud-apis/sl-graphql.md +++ b/website/docs/docs/dbt-cloud-apis/sl-graphql.md @@ -22,10 +22,25 @@ GraphQL has several advantages, such as self-documenting, having a strong typing ## dbt Semantic Layer GraphQL API -The dbt Semantic Layer GraphQL API allows you to explore and query metrics and dimensions. Due to its self-documenting nature, you can explore the calls conveniently through the [schema explorer](https://semantic-layer.cloud.getdbt.com/api/graphql). +The dbt Semantic Layer GraphQL API allows you to explore and query metrics and dimensions. Due to its self-documenting nature, you can explore the calls conveniently through a schema explorer. + +The schema explorer URLs vary depending on your [deployment region](/docs/cloud/about-cloud/regions-ip-addresses). Use the following table to find the right link for your region: + +| Deployment type | Schema explorer URL | +| --------------- | ------------------- | +| North America multi-tenant | https://semantic-layer.cloud.getdbt.com/api/graphql | +| EMEA multi-tenant | https://semantic-layer.emea.dbt.com/api/graphql | +| APAC multi-tenant | https://semantic-layer.au.dbt.com/api/graphql | +| Single tenant | `https://YOUR_ACCESS_URL.semantic-layer/api/graphql`

Replace `YOUR_ACCESS_URL` with the appropriate Access URL for your region and plan.| +| Multi-cell | `https://YOUR_ACCOUNT_PREFIX.semantic-layer.REGION.dbt.com/api/graphql`

Replace `YOUR_ACCOUNT_PREFIX` with your account identifier and `REGION` with your region, for example `us1.dbt.com`. |
+ +**Example** +- If your Single tenant access URL is `ABC123.getdbt.com`, your schema explorer URL will be `https://ABC123.getdbt.com.semantic-layer/api/graphql`. dbt Partners can use the Semantic Layer GraphQL API to build an integration with the dbt Semantic Layer. +Note that the dbt Semantic Layer API doesn't support `ref` to call dbt objects. Instead, use the complete qualified table name. If you're using dbt macros at query time to calculate your metrics, you should move those calculations into your Semantic Layer metric definitions as code. + ## Requirements to use the GraphQL API - A dbt Cloud project on dbt v1.6 or higher - Metrics are defined and configured @@ -204,6 +219,8 @@ DimensionType = [CATEGORICAL, TIME] ### Querying +When querying for data, _either_ a `groupBy` _or_ a `metrics` selection is required. + **Create Dimension Values query** ```graphql @@ -428,22 +445,35 @@ mutation { } ``` +**Query a categorical dimension on its own** + +```graphql +mutation { + createQuery( + environmentId: 123456 + groupBy: [{name: "customer__customer_type"}] + ) { + queryId + } +} +``` + **Query with a where filter** The `where` filter takes a list argument (or a string for a single input). Depending on the object you are filtering, there are a couple of parameters: - - `Dimension()` — Used for any categorical or time dimensions. If used for a time dimension, granularity is required. For example, `Dimension('metric_time').grain('week')` or `Dimension('customer__country')`. + - `Dimension()` — Used for any categorical or time dimensions. For example, `Dimension('metric_time').grain('week')` or `Dimension('customer__country')`. - `Entity()` — Used for entities like primary and foreign keys, such as `Entity('order_id')`. -Note: If you prefer a more strongly typed `where` clause, you can optionally use `TimeDimension()` to separate out categorical dimensions from time ones. The `TimeDimension` input takes the time dimension name and also requires granularity. For example, `TimeDimension('metric_time', 'MONTH')`. +Note: If you prefer a `where` clause with a more explicit path, you can optionally use `TimeDimension()` to separate categorical dimensions from time ones. The `TimeDimension` input takes the time dimension and optionally the granularity level. `TimeDimension('metric_time', 'month')`. ```graphql mutation { createQuery( environmentId: BigInt! metrics:[{name: "order_total"}] - groupBy:[{name: "customer__customer_type"}, {name: "metric_time", grain: MONTH}] + groupBy:[{name: "customer__customer_type"}, {name: "metric_time", grain: month}] where:[{sql: "{{ Dimension('customer__customer_type') }} = 'new'"}, {sql:"{{ Dimension('metric_time').grain('month') }} > '2022-10-01'"}] ) { queryId @@ -451,6 +481,55 @@ mutation { } ``` +For both `TimeDimension()`, the grain is only required in the WHERE filter if the aggregation time dimensions for the measures and metrics associated with the where filter have different grains. + +For example, consider this Semantic model and Metric configuration, which contains two metrics that are aggregated across different time grains. This example shows a single semantic model, but the same goes for metrics across more than one semantic model. 
+ +```yaml +semantic_model: + name: my_model_source + +defaults: + agg_time_dimension: created_month + measures: + - name: measure_0 + agg: sum + - name: measure_1 + agg: sum + agg_time_dimension: order_year + dimensions: + - name: created_month + type: time + type_params: + time_granularity: month + - name: order_year + type: time + type_params: + time_granularity: year + +metrics: + - name: metric_0 + description: A metric with a month grain. + type: simple + type_params: + measure: measure_0 + - name: metric_1 + description: A metric with a year grain. + type: simple + type_params: + measure: measure_1 +``` + +Assuming the user is querying `metric_0` and `metric_1` together, a valid filter would be: + + * `"{{ TimeDimension('metric_time', 'year') }} > '2020-01-01'"` + +Invalid filters would be: + + * ` "{{ TimeDimension('metric_time') }} > '2020-01-01'"` — metrics in the query are defined based on measures with different grains. + + * `"{{ TimeDimension('metric_time', 'month') }} > '2020-01-01'"` — `metric_1` is not available at a month grain. + **Query with Order** ```graphql diff --git a/website/docs/docs/dbt-cloud-apis/sl-jdbc.md b/website/docs/docs/dbt-cloud-apis/sl-jdbc.md index 345be39635e..2e928db6af2 100644 --- a/website/docs/docs/dbt-cloud-apis/sl-jdbc.md +++ b/website/docs/docs/dbt-cloud-apis/sl-jdbc.md @@ -33,6 +33,8 @@ You *may* be able to use our JDBC API with tools that do not have an official in Refer to [Get started with the dbt Semantic Layer](/docs/use-dbt-semantic-layer/quickstart-sl) for more info. +Note that the dbt Semantic Layer API doesn't support `ref` to call dbt objects. Instead, use the complete qualified table name. If you're using dbt macros at query time to calculate your metrics, you should move those calculations into your Semantic Layer metric definitions as code. + ## Authentication dbt Cloud authorizes requests to the dbt Semantic Layer API. You need to provide an environment ID, host, and [service account tokens](/docs/dbt-cloud-apis/service-tokens). @@ -90,9 +92,9 @@ select * from {{ -Use this query to fetch dimension values for one or multiple metrics and single dimension. +Use this query to fetch dimension values for one or multiple metrics and a single dimension. -Note, `metrics` is a required argument that lists one or multiple metrics in it, and a single dimension. +Note, `metrics` is a required argument that lists one or multiple metrics, and a single dimension. ```bash select * from {{ @@ -103,9 +105,9 @@ semantic_layer.dimension_values(metrics=['food_order_amount'], group_by=['custom -Use this query to fetch queryable granularities for a list of metrics. This API request allows you to only show the time granularities that make sense for the primary time dimension of the metrics (such as `metric_time`), but if you want queryable granularities for other time dimensions, you can use the `dimensions()` call, and find the column queryable_granularities. +You can use this query to fetch queryable granularities for a list of metrics. This API request allows you to only show the time granularities that make sense for the primary time dimension of the metrics (such as `metric_time`), but if you want queryable granularities for other time dimensions, you can use the `dimensions()` call, and find the column queryable_granularities. -Note, `metrics` is a required argument that lists one or multiple metrics in it. +Note, `metrics` is a required argument that lists one or multiple metrics. 
```bash select * from {{ @@ -122,7 +124,7 @@ select * from {{ Use this query to fetch available metrics given dimensions. This command is essentially the opposite of getting dimensions given a list of metrics. -Note, `group_by` is a required argument that lists one or multiple dimensions in it. +Note, `group_by` is a required argument that lists one or multiple dimensions. ```bash select * from {{ @@ -135,7 +137,7 @@ select * from {{ -Use this example query to fetch available granularities for all time dimesensions (the similar queryable granularities API call only returns granularities for the primary time dimensions for metrics). The following call is a derivative of the `dimensions()` call and specifically selects the granularities field. +You can use this example query to fetch available granularities for all time dimensions (the similar queryable granularities API call only returns granularities for the primary time dimensions for metrics). The following call is a derivative of the `dimensions()` call and specifically selects the granularity field. ```bash select NAME, QUERYABLE_GRANULARITIES from {{ @@ -177,8 +179,6 @@ To query metric values, here are the following parameters that are available. Yo |`order` | Order the data returned by a particular field | `order_by=['order_gross_profit']`, use `-` for descending, or full object notation if the object is operated on: `order_by=[Metric('order_gross_profit').descending(True)`] | | `compile` | If true, returns generated SQL for the data platform but does not execute | `compile=True` | - - ## Note on time dimensions and `metric_time` You will notice that in the list of dimensions for all metrics, there is a dimension called `metric_time`. `Metric_time` is a reserved keyword for the measure-specific aggregation time dimensions. For any time-series metric, the `metric_time` keyword should always be available for use in queries. This is a common dimension across *all* metrics in a semantic graph. @@ -264,11 +264,62 @@ Where filters in API allow for a filter list or string. We recommend using the f Where Filters have a few objects that you can use: -- `Dimension()` - Used for any categorical or time dimensions. If used for a time dimension, granularity is required - `Dimension('metric_time').grain('week')` or `Dimension('customer__country')` +- `Dimension()` — Used for any categorical or time dimensions. `Dimension('metric_time').grain('week')` or `Dimension('customer__country')`. + +- `TimeDimension()` — Used as a more explicit definition for time dimensions, optionally takes in a granularity `TimeDimension('metric_time', 'month')`. + +- `Entity()` — Used for entities like primary and foreign keys - `Entity('order_id')`. + + +For `TimeDimension()`, the grain is only required in the `WHERE` filter if the aggregation time dimensions for the measures and metrics associated with the where filter have different grains. + +For example, consider this Semantic model and Metric config, which contains two metrics that are aggregated across different time grains. This example shows a single semantic model, but the same goes for metrics across more than one semantic model. 
+ +```yaml +semantic_model: + name: my_model_source + +defaults: + agg_time_dimension: created_month + measures: + - name: measure_0 + agg: sum + - name: measure_1 + agg: sum + agg_time_dimension: order_year + dimensions: + - name: created_month + type: time + type_params: + time_granularity: month + - name: order_year + type: time + type_params: + time_granularity: year + +metrics: + - name: metric_0 + description: A metric with a month grain. + type: simple + type_params: + measure: measure_0 + - name: metric_1 + description: A metric with a year grain. + type: simple + type_params: + measure: measure_1 + +``` + +Assuming the user is querying `metric_0` and `metric_1` together in a single request, a valid `WHERE` filter would be: + + * `"{{ TimeDimension('metric_time', 'year') }} > '2020-01-01'"` -- `Entity()` - Used for entities like primary and foreign keys - `Entity('order_id')` +Invalid filters would be: -Note: If you prefer a more explicit path to create the `where` clause, you can optionally use the `TimeDimension` feature. This helps separate out categorical dimensions from time-related ones. The `TimeDimesion` input takes the time dimension name and also requires granularity, like this: `TimeDimension('metric_time', 'MONTH')`. + * `"{{ TimeDimension('metric_time') }} > '2020-01-01'"` — metrics in the query are defined based on measures with different grains. + + * `"{{ TimeDimension('metric_time', 'month') }} > '2020-01-01'"` — `metric_1` is not available at a month grain. - Use the following example to query using a `where` filter with the string format: @@ -287,13 +338,13 @@ where="{{ Dimension('metric_time').grain('month') }} >= '2017-03-09' AND {{ Dim select * from {{ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], group_by=[Dimension('metric_time').grain('month'),'customer__customer_type'], -where=["{{ Dimension('metric_time').grain('month') }} >= '2017-03-09'", "{{ Dimension('customer__customer_type' }} in ('new')", "{{ Entity('order_id') }} = 10"] +where=["{{ Dimension('metric_time').grain('month') }} >= '2017-03-09'", "{{ Dimension('customer__customer_type') }} in ('new')", "{{ Entity('order_id') }} = 10"]) }} ``` ### Query with a limit -Use the following example to query using a `limit` or `order_by` clauses: +Use the following example to query using a `limit` or `order_by` clause: ```bash select * from {{ @@ -301,38 +352,40 @@ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], group_by=[Dimension('metric_time')], limit=10) }} -``` +``` + ### Query with Order By Examples -Order By can take a basic string that's a Dimension, Metric, or Entity and this will default to ascending order +Order By can take a basic string that's a Dimension, Metric, or Entity, and this will default to ascending order ```bash select * from {{ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], group_by=[Dimension('metric_time')], limit=10, - order_by=['order_gross_profit'] + order_by=['order_gross_profit']) }} ``` -For descending order, you can add a `-` sign in front of the object. However, you can only use this short hand notation if you aren't operating on the object or using the full object notation. +For descending order, you can add a `-` sign in front of the object. However, you can only use this short-hand notation if you aren't operating on the object or using the full object notation. 
```bash select * from {{ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], group_by=[Dimension('metric_time')], limit=10, - order_by=[-'order_gross_profit'] + order_by=[-'order_gross_profit']) }} -``` -If you are ordering by an object that's been operated on (e.g., change granularity), or you are using the full object notation, descending order must look like: +``` + +If you are ordering by an object that's been operated on (for example, you changed the granularity of the time dimension), or you are using the full object notation, descending order must look like: ```bash select * from {{ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], group_by=[Dimension('metric_time').grain('week')], limit=10, - order_by=[Metric('order_gross_profit').descending(True), Dimension('metric_time').grain('week').descending(True) ] + order_by=[Metric('order_gross_profit').descending(True), Dimension('metric_time').grain('week').descending(True) ]) }} ``` @@ -343,7 +396,7 @@ select * from {{ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], group_by=[Dimension('metric_time').grain('week')], limit=10, - order_by=[Metric('order_gross_profit'), Dimension('metric_time').grain('week')] + order_by=[Metric('order_gross_profit'), Dimension('metric_time').grain('week')]) }} ``` @@ -364,14 +417,24 @@ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], -- **Why do some dimensions use different syntax, like `metric_time` versus `[Dimension('metric_time')`?**
- When you select a dimension on its own, such as `metric_time` you can use the shorthand method which doesn't need the “Dimension” syntax. However, when you perform operations on the dimension, such as adding granularity, the object syntax `[Dimension('metric_time')` is required. + +When you select a dimension on its own, such as `metric_time` you can use the shorthand method which doesn't need the “Dimension” syntax. + +However, when you perform operations on the dimension, such as adding granularity, the object syntax `[Dimension('metric_time')` is required. + + + + +The double underscore `"__"` syntax indicates a mapping from an entity to a dimension, as well as where the dimension is located. For example, `user__country` means someone is looking at the `country` dimension from the `user` table. + + + + +The default output follows the format `{{time_dimension_name}__{granularity_level}}`. -- **What does the double underscore `"__"` syntax in dimensions mean?**
- The double underscore `"__"` syntax indicates a mapping from an entity to a dimension, as well as where the dimension is located. For example, `user__country` means someone is looking at the `country` dimension from the `user` table. +So for example, if the `time_dimension_name` is `ds` and the granularity level is yearly, the output is `ds__year`. -- **What is the default output when adding granularity?**
- The default output follows the format `{time_dimension_name}__{granularity_level}`. So for example, if the time dimension name is `ds` and the granularity level is yearly, the output is `ds__year`. +
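To tie these answers together, here's a minimal sketch in the same query style as the examples above, reusing the hypothetical `food_order_amount` metric purely for illustration:

```bash
-- Shorthand: the dimension is selected on its own, so the plain name works
select * from {{
semantic_layer.query(metrics=['food_order_amount'],
group_by=['metric_time'])
}}

-- Object syntax: required once you operate on the dimension (here, changing granularity);
-- per the default naming above, the time column should come back as metric_time__year
select * from {{
semantic_layer.query(metrics=['food_order_amount'],
group_by=[Dimension('metric_time').grain('year')])
}}
```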
## Related docs diff --git a/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md b/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md index af098860e6f..1f40aaa9f40 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md +++ b/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md @@ -5,10 +5,6 @@ description: New features and changes in dbt Core v1.7 displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ## Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/8aaed0e29f9560bc53d9d3e88325a9597318e375/CHANGELOG.md) diff --git a/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md b/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md index 33a038baa9b..a70f220edc8 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md +++ b/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md @@ -5,10 +5,6 @@ id: "upgrading-to-v1.6" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - dbt Core v1.6 has three significant areas of focus: 1. Next milestone of [multi-project deployments](https://github.com/dbt-labs/dbt-core/discussions/6725): improvements to contracts, groups/access, versions; and building blocks for cross-project `ref` 1. Semantic layer re-launch: dbt Core and [MetricFlow](https://docs.getdbt.com/docs/build/about-metricflow) integration @@ -79,7 +75,7 @@ Support for BigQuery coming soon. [**Deprecation date**](/reference/resource-properties/deprecation_date): Models can declare a deprecation date that will warn model producers and downstream consumers. This enables clear migration windows for versioned models, and provides a mechanism to facilitate removal of immature or little-used models, helping to avoid project bloat. -[Model names](/faqs/Models/unique-model-names) can be duplicated across different namespaces (projects/packages), so long as they are unique within each project/package. We strongly encourage using [two-argument `ref`](/reference/dbt-jinja-functions/ref#two-argument-variant) when referencing a model from a different package/project. +[Model names](/faqs/Models/unique-model-names) can be duplicated across different namespaces (projects/packages), so long as they are unique within each project/package. We strongly encourage using [two-argument `ref`](/reference/dbt-jinja-functions/ref#ref-project-specific-models) when referencing a model from a different package/project. More consistency and flexibility around packages. Resources defined in a package will respect variable and global macro definitions within the scope of that package. - `vars` defined in a package's `dbt_project.yml` are now available in the resolution order when compiling nodes in that package, though CLI `--vars` and the root project's `vars` will still take precedence. See ["Variable Precedence"](/docs/build/project-variables#variable-precedence) for details. 
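As a quick sketch of the two-argument form encouraged above (the `other_project` and `customers` names are placeholders, not taken from this release):

```sql
-- Two-argument ref: names the project/package explicitly, avoiding ambiguous references
select * from {{ ref('other_project', 'customers') }}

-- Single-argument ref: resolves within the current namespace
select * from {{ ref('customers') }}
```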
diff --git a/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md b/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md index e739caa477a..589ac162088 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md +++ b/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md @@ -5,10 +5,6 @@ id: "upgrading-to-v1.5" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - dbt Core v1.5 is a feature release, with two significant additions: 1. [**Model governance**](/docs/collaborate/govern/about-model-governance) — access, contracts, versions — the first phase of [multi-project deployments](https://github.com/dbt-labs/dbt-core/discussions/6725) 2. A Python entry point for [**programmatic invocations**](/reference/programmatic-invocations), at parity with the CLI diff --git a/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md b/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md index 229a54627fc..a8bb960c37d 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md @@ -3,10 +3,6 @@ title: "Upgrading to dbt utils v1.0" description: New features and breaking changes to consider as you upgrade to dbt utils v1.0. --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - # Upgrading to dbt utils v1.0 For the first time, [dbt utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) is crossing the major version boundary. From [last month’s blog post](https://www.getdbt.com/blog/announcing-dbt-v1.3-and-utils/): diff --git a/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md b/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md index 240f0b86de3..41e19956690 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md +++ b/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md @@ -5,10 +5,6 @@ id: "upgrading-to-v1.4" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.4.latest/CHANGELOG.md) diff --git a/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md b/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md index 5a381b16928..7febb0bade9 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md +++ b/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md @@ -5,10 +5,6 @@ id: "upgrading-to-v1.3" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.3.latest/CHANGELOG.md) diff --git a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md index cd75e7f411b..17e62c90b43 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md +++ b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md @@ -5,10 +5,6 @@ id: "upgrading-to-v1.2" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.2.latest/CHANGELOG.md) diff --git a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md index 
868f3c7ed04..aee3413e1ad 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md +++ b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md @@ -5,10 +5,6 @@ id: "upgrading-to-v1.1" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.1.latest/CHANGELOG.md) diff --git a/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md b/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md index 0ea66980874..9cbfae50831 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md @@ -5,9 +5,6 @@ id: "upgrading-to-v1.0" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - ### Resources diff --git a/website/docs/docs/dbt-versions/core-upgrade/09-upgrading-to-v0.21.md b/website/docs/docs/dbt-versions/core-upgrade/09-upgrading-to-v0.21.md index d5b429132cd..5575b0cc2af 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/09-upgrading-to-v0.21.md +++ b/website/docs/docs/dbt-versions/core-upgrade/09-upgrading-to-v0.21.md @@ -5,10 +5,6 @@ displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - :::caution Unsupported version dbt Core v0.21 has reached the end of critical support. No new patch versions will be released, and it will stop running in dbt Cloud on June 30, 2022. Read ["About dbt Core versions"](/docs/dbt-versions/core) for more details. diff --git a/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md b/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md index be6054087b3..d95b8d8bacd 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md +++ b/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md @@ -4,10 +4,6 @@ id: "upgrading-to-v0.20" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - :::caution Unsupported version dbt Core v0.20 has reached the end of critical support. No new patch versions will be released, and it will stop running in dbt Cloud on June 30, 2022. Read ["About dbt Core versions"](/docs/dbt-versions/core) for more details. ::: diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-11-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-11-0.md index e91dde4c923..27c0456660f 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-11-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-11-0.md @@ -4,10 +4,6 @@ id: "upgrading-to-0-11-0" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ## Schema.yml v2 syntax dbt v0.11.0 adds an auto-generated docs site to your dbt project. To make effective use of the documentation site, you'll need to use the new "version 2" schema.yml syntax. For a full explanation of the version 2 syntax, check out the [schema.yml Files](/reference/configs-and-properties) section of the documentation. 
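For orientation, a minimal "version 2" schema.yml looks roughly like this (the model and column names are placeholders):

```yaml
version: 2

models:
  - name: customers
    description: "One record per customer."
    columns:
      - name: customer_id
        description: "Primary key."
        tests:
          - unique
          - not_null
```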
diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-12-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-12-0.md index b3d4e9d9bcb..a95ec3b11bd 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-12-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-12-0.md @@ -4,10 +4,6 @@ id: "upgrading-to-0-12-0" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ## End of support Support for the `repositories:` block in `dbt_project.yml` (deprecated in 0.10.0) was removed. diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-13-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-13-0.md index bb15d1a73b0..9875eb3c346 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-13-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-13-0.md @@ -4,10 +4,6 @@ id: "upgrading-to-0-13-0" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ## Breaking changes ### on-run-start and on-run-end diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md index 48aa14a42e5..21cfbe8d3b5 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md @@ -4,10 +4,6 @@ id: "upgrading-to-0-14-0" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - This guide outlines migration instructions for: 1. [Upgrading archives to snapshots](#upgrading-to-snapshot-blocks) diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-1.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-1.md index 215385acf0f..559775644cd 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-1.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-1.md @@ -4,10 +4,6 @@ id: "upgrading-to-0-14-1" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - The dbt v0.14.1 release _does not_ contain any breaking code changes for users upgrading from v0.14.0. If you are upgrading from a version less than 0.14.0, consult the [Upgrading to 0.14.0](upgrading-to-0-14-0) migration guide. The following section contains important information for users of the `check` strategy on Snowflake and BigQuery. Action may be required in your database. ## Changes to the Snapshot "check" algorithm diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-15-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-15-0.md index 5eba212590f..7db64f5940f 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-15-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-15-0.md @@ -4,10 +4,6 @@ id: "upgrading-to-0-15-0" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - The dbt v0.15.0 release contains a handful of breaking code changes for users upgrading from v0.14.0. 
diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-16-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-16-0.md index 076e6fc4e88..d6fc6f9f49a 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-16-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-16-0.md @@ -4,10 +4,6 @@ id: "upgrading-to-0-16-0" displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - dbt v0.16.0 contains many new features, bug fixes, and improvements. This guide covers all of the important information to consider when upgrading from an earlier version of dbt to 0.16.0. diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-17-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-17-0.md index 5b863777df9..b99466e7c9a 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-17-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-17-0.md @@ -5,10 +5,6 @@ displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - dbt v0.17.0 makes compilation more consistent, improves performance, and fixes a number of bugs. ## Articles: @@ -252,8 +248,8 @@ BigQuery: **Core** - [`path:` selectors](/reference/node-selection/methods#the-path-method) -- [`--fail-fast`](/reference/commands/run#failing-fast) -- [as_text Jinja filter](/reference/dbt-jinja-functions/as_text) +- [`--fail-fast` command](/reference/commands/run#failing-fast) +- `as_text` Jinja filter: removed this defunct filter - [accessing nodes in the `graph` object](/reference/dbt-jinja-functions/graph) - [persist_docs](/reference/resource-configs/persist_docs) - [source properties](reference/source-properties) diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-18-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-18-0.md index 545bfd41ac6..f14fd03a534 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-18-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-18-0.md @@ -4,10 +4,6 @@ displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/dev/marian-anderson/CHANGELOG.md) diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-19-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-19-0.md index db825d8af9c..af978f9c6a9 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-19-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-19-0.md @@ -4,10 +4,6 @@ displayed_sidebar: "docs" --- -import UpgradeMove from '/snippets/_upgrade-move.md'; - - - ### Resources - [Discourse](https://discourse.getdbt.com/t/1951) diff --git a/website/docs/docs/dbt-versions/release-notes/73-Jan-2024/partial-parsing.md b/website/docs/docs/dbt-versions/release-notes/73-Jan-2024/partial-parsing.md new file mode 100644 index 00000000000..c0236a30783 --- /dev/null +++ b/website/docs/docs/dbt-versions/release-notes/73-Jan-2024/partial-parsing.md @@ -0,0 +1,15 @@ +--- +title: "New: Native support for partial parsing" +description: "December 2023: For faster run times with your dbt 
invocations, configure dbt Cloud to parse only the changed files in your project." +sidebar_label: "New: Native support for partial parsing" +sidebar_position: 09 +tags: [Jan-2024] +date: 2024-01-03 +--- + +By default, dbt parses all the files in your project at the beginning of every dbt invocation. Depending on the size of your project, this operation can take a long time to complete. With the new partial parsing feature in dbt Cloud, you can reduce the time it takes for dbt to parse your project. When enabled, dbt Cloud parses only the changed files in your project instead of parsing all the project files. As a result, your dbt invocations will take less time to run. + +To learn more, refer to [Partial parsing](/docs/deploy/deploy-environments#partial-parsing). + + + diff --git a/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/dec-sl-updates.md b/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/dec-sl-updates.md new file mode 100644 index 00000000000..401b43fb333 --- /dev/null +++ b/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/dec-sl-updates.md @@ -0,0 +1,22 @@ +--- +title: "dbt Semantic Layer updates for December 2023" +description: "December 2023: Enhanced Tableau integration, BIGINT support, LookML to MetricFlow conversion, and deprecation of legacy features." +sidebar_label: "Update and fixes: dbt Semantic Layer" +sidebar_position: 08 +date: 2023-12-22 +--- +The dbt Labs team continues to work on adding new features, fixing bugs, and increasing reliability for the dbt Semantic Layer. The following list explains the updates and fixes for December 2023 in more detail. + +## Bug fixes + +- Tableau integration — The dbt Semantic Layer integration with Tableau now supports queries that resolve to a "NOT IN" clause. This applies to using "exclude" in the filtering user interface. Previously it wasn’t supported. +- `BIGINT` support — The dbt Semantic Layer can now support `BIGINT` values with precision greater than 18. Previously it would return an error. +- Memory leak — Fixed a memory leak in the JDBC API that would previously lead to intermittent errors when querying it. +- Data conversion support — Added support for converting various Redshift and Postgres-specific data types. Previously, the driver would throw an error when encountering columns with those types. + + +## Improvements + +- Deprecation — We deprecated [dbt Metrics and the legacy dbt Semantic Layer](/docs/dbt-versions/release-notes/Dec-2023/legacy-sl), both supported on dbt version 1.5 or lower. This change came into effect on December 15th, 2023. +- Improved dbt converter tool — The [dbt converter tool](https://github.com/dbt-labs/dbt-converter) can now help automate some of the work in converting from LookML (Looker's modeling language) for those who are migrating. Previously this wasn’t available. + diff --git a/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md index 7c35991e961..eff15e96cfd 100644 --- a/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md +++ b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md @@ -11,4 +11,4 @@ Now available for dbt Cloud Enterprise plans is a new option to enable Git repos To learn more, refer to [Repo caching](/docs/deploy/deploy-environments#git-repository-caching). 
- \ No newline at end of file + \ No newline at end of file diff --git a/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md index a81abec5d42..9d5b91fb191 100644 --- a/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md +++ b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md @@ -8,7 +8,7 @@ tags: [Oct-2023] --- :::important -If you're using the legacy Semantic Layer, we **highly** recommend you [upgrade your dbt version](/docs/dbt-versions/upgrade-core-in-cloud) to dbt v1.6 or higher and [migrate](/guides/sl-migration) to the latest Semantic Layer. +If you're using the legacy Semantic Layer, we _highly_ recommend you [upgrade your dbt version](/docs/dbt-versions/upgrade-core-in-cloud) to dbt v1.6 or higher and [migrate](/guides/sl-migration) to the latest Semantic Layer. ::: dbt Labs is thrilled to announce that the [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) is now generally available. It offers consistent data organization, improved governance, reduced costs, enhanced efficiency, and accessible data for better decision-making and collaboration across organizations. @@ -18,7 +18,7 @@ It aims to bring the best of modeling and semantics to downstream applications b - Brand new [integrations](/docs/use-dbt-semantic-layer/avail-sl-integrations) such as Tableau, Google Sheets, Hex, Mode, and Lightdash. - New [Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview) using GraphQL and JDBC to query metrics and build integrations. - dbt Cloud [multi-tenant regional](/docs/cloud/about-cloud/regions-ip-addresses) support for North America, EMEA, and APAC. Single-tenant support coming soon. -- Use the APIs to call an export (a way to build tables in your data platform), then access them in your preferred BI tool. Starting from dbt v1.7 or higher, you will be able to schedule exports as part of your dbt job. +- Coming soon — Schedule exports (a way to build tables in your data platform) as part of your dbt Cloud job. Use the APIs to call an export, then access them in your preferred BI tool. diff --git a/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md index f44fd57aa4a..ac8e286c783 100644 --- a/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md +++ b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md @@ -8,7 +8,7 @@ sidebar_position: 7 --- :::important -If you're using the legacy Semantic Layer, we **highly** recommend you [upgrade your dbt version](/docs/dbt-versions/upgrade-core-in-cloud) to dbt v1.6 or higher to use the new dbt Semantic Layer. To migrate to the new Semantic Layer, refer to the dedicated [migration guide](/guides/sl-migration) for more info. +If you're using the legacy Semantic Layer, we _highly_ recommend you [upgrade your dbt version](/docs/dbt-versions/upgrade-core-in-cloud) to dbt v1.6 or higher to use the new dbt Semantic Layer. To migrate to the new Semantic Layer, refer to the dedicated [migration guide](/guides/sl-migration) for more info. ::: dbt Labs are thrilled to announce the re-release of the [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl), now available in [public beta](#public-beta). 
It aims to bring the best of modeling and semantics to downstream applications by introducing: diff --git a/website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md b/website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md index ba82234c0b5..5cf1f97ff25 100644 --- a/website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md +++ b/website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md @@ -27,8 +27,12 @@ Jobs scheduled at the top of the hour used to take over 106 seconds to prepare b Our enhanced scheduler offers more durability and empowers users to run jobs effortlessly. -This means Enterprise, multi-tenant accounts can now enjoy the advantages of unlimited job concurrency. Previously limited to a fixed number of run slots, Enterprise accounts now have the freedom to operate without constraints. Single-tenant support will be coming soon. Team plan customers will continue to have only 2 run slots. +This means Enterprise, multi-tenant accounts can now enjoy the advantages of unlimited job concurrency. Previously limited to a fixed number of run slots, Enterprise accounts now have the freedom to operate without constraints. Single-tenant support will be coming soon. -Something to note, each running job occupies a run slot for its duration, and if all slots are occupied, jobs will queue accordingly. +Something to note, each running job occupies a run slot for its duration, and if all slots are occupied, jobs will queue accordingly. For more feature details, refer to the [dbt Cloud pricing page](https://www.getdbt.com/pricing/). + +Note, Team accounts created after July 2023 benefit from unlimited job concurrency: +- Legacy Team accounts have a fixed number of run slots. +- Both Team and Developer plans are limited to one project each. For larger-scale needs, our [Enterprise plan](https://www.getdbt.com/pricing/) offers features such as audit logging, unlimited job concurrency and projects, and more. diff --git a/website/docs/docs/dbt-versions/upgrade-core-in-cloud.md b/website/docs/docs/dbt-versions/upgrade-core-in-cloud.md index e46294029ec..052611f66e6 100644 --- a/website/docs/docs/dbt-versions/upgrade-core-in-cloud.md +++ b/website/docs/docs/dbt-versions/upgrade-core-in-cloud.md @@ -134,12 +134,6 @@ If you believe your project might be affected, read more details in the migratio

-:::info Important - -If you have not already, you must add `config-version: 2` to your dbt_project.yml file. -See **Upgrading to v0.17.latest from v0.16** below for more details. - -:::
diff --git a/website/docs/docs/deploy/ci-jobs.md b/website/docs/docs/deploy/ci-jobs.md index 149a6951fdc..9b96bb4b766 100644 --- a/website/docs/docs/deploy/ci-jobs.md +++ b/website/docs/docs/deploy/ci-jobs.md @@ -11,12 +11,12 @@ You can set up [continuous integration](/docs/deploy/continuous-integration) (CI dbt Labs recommends that you create your CI job in a dedicated dbt Cloud [deployment environment](/docs/deploy/deploy-environments#create-a-deployment-environment) that's connected to a staging database. Having a separate environment dedicated for CI will provide better isolation between your temporary CI schema builds and your production data builds. Additionally, sometimes teams need their CI jobs to be triggered when a PR is made to a branch other than main. If your team maintains a staging branch as part of your release process, having a separate environment will allow you to set a [custom branch](/faqs/environments/custom-branch-settings) and, accordingly, the CI job in that dedicated environment will be triggered only when PRs are made to the specified custom branch. To learn more, refer to [Get started with CI tests](/guides/set-up-ci). + ### Prerequisites -- You have a dbt Cloud account. +- You have a dbt Cloud account. - For the [Concurrent CI checks](/docs/deploy/continuous-integration#concurrent-ci-checks) and [Smart cancellation of stale builds](/docs/deploy/continuous-integration#smart-cancellation) features, your dbt Cloud account must be on the [Team or Enterprise plan](https://www.getdbt.com/pricing/). -- You must be connected using dbt Cloud’s native Git integration with [GitHub](/docs/cloud/git/connect-github), [GitLab](/docs/cloud/git/connect-gitlab), or [Azure DevOps](/docs/cloud/git/connect-azure-devops). - - With GitLab, you need a paid or self-hosted account which includes support for GitLab webhooks and [project access tokens](https://docs.gitlab.com/ee/user/project/settings/project_access_tokens.html). With GitLab Free, merge requests will invoke CI jobs but CI status updates (success or failure of the job) will not be reported back to GitLab. - - If you previously configured your dbt project by providing a generic git URL that clones using SSH, you must reconfigure the project to connect through dbt Cloud's native integration. +- Set up a [connection with your Git provider](/docs/cloud/git/git-configuration-in-dbt-cloud). This integration lets dbt Cloud run jobs on your behalf for job triggering. + - If you're using a native [GitLab](/docs/cloud/git/connect-gitlab) integration, you need a paid or self-hosted account that includes support for GitLab webhooks and [project access tokens](https://docs.gitlab.com/ee/user/project/settings/project_access_tokens.html). If you're using GitLab Free, merge requests will trigger CI jobs but CI job status updates (success or failure of the job) will not be reported back to GitLab. To make CI job creation easier, many options on the **CI job** page are set to default values that dbt Labs recommends that you use. If you don't want to use the defaults, you can change them. @@ -63,12 +63,13 @@ If you're not using dbt Cloud’s native Git integration with [GitHub](/docs/cl 1. Set up a CI job with the [Create Job](/dbt-cloud/api-v2#/operations/Create%20Job) API endpoint using `"job_type": ci` or from the [dbt Cloud UI](#set-up-ci-jobs). -1. Call the [Trigger Job Run](/dbt-cloud/api-v2#/operations/Trigger%20Job%20Run) API endpoint to trigger the CI job. 
You must include these fields to the payload: - - Provide the pull request (PR) ID with one of these fields, even if you're using a different Git provider (like Bitbucket). This can make your code less human-readable but it will _not_ affect dbt functionality. +1. Call the [Trigger Job Run](/dbt-cloud/api-v2#/operations/Trigger%20Job%20Run) API endpoint to trigger the CI job. You must include both of these fields to the payload: + - Provide the pull request (PR) ID using one of these fields: - `github_pull_request_id` - `gitlab_merge_request_id` - - `azure_devops_pull_request_id`  + - `azure_devops_pull_request_id` + - `non_native_pull_request_id` (for example, BitBucket) - Provide the `git_sha` or `git_branch` to target the correct commit or branch to run the job against. ## Example pull requests @@ -110,22 +111,6 @@ If you're experiencing any issues, review some of the common questions and answe -
- Reconnecting your dbt project to use dbt Cloud's native integration with GitHub, GitLab, or Azure DevOps -
-
If your dbt project relies the generic git clone method that clones using SSH and deploy keys to connect to your dbt repo, you need to disconnect your repo and reconnect it using the native GitHub, GitLab, or Azure DevOps integration in order to enable dbt Cloud CI.



- First, make sure you have the native GitHub authentication, native GitLab authentication, or native Azure DevOps authentication set up depending on which git provider you use. After you have gone through those steps, go to Account Settings, select Projects and click on the project you'd like to reconnect through native GitHub, GitLab, or Azure DevOps auth. Then click on the repository link.



- - Once you're in the repository page, select Edit and then Disconnect Repository at the bottom.

- -

- Confirm that you'd like to disconnect your repository. You should then see a new Configure a repository link in your old repository's place. Click through to the configuration page:

- -

- - Select the GitHub, GitLab, or AzureDevOps tab and reselect your repository. That should complete the setup of the project and enable you to set up a dbt Cloud CI job.
-
-
Error messages that refer to schemas from previous PRs
diff --git a/website/docs/docs/deploy/continuous-integration.md b/website/docs/docs/deploy/continuous-integration.md index 0f87965aada..22686c44bd2 100644 --- a/website/docs/docs/deploy/continuous-integration.md +++ b/website/docs/docs/deploy/continuous-integration.md @@ -50,3 +50,6 @@ When you push a new commit to a PR, dbt Cloud enqueues a new CI run for the late +### Run slot treatment + +For accounts on the [Enterprise or Team](https://www.getdbt.com/pricing) plans, CI runs won't consume run slots. This guarantees a CI check will never block a production run. \ No newline at end of file diff --git a/website/docs/docs/deploy/dashboard-status-tiles.md b/website/docs/docs/deploy/dashboard-status-tiles.md index 67aa1a93c33..d9e33fc32d6 100644 --- a/website/docs/docs/deploy/dashboard-status-tiles.md +++ b/website/docs/docs/deploy/dashboard-status-tiles.md @@ -56,7 +56,7 @@ Note that Mode has also built its own [integration](https://mode.com/get-dbt/) w Looker does not allow you to directly embed HTML and instead requires creating a [custom visualization](https://docs.looker.com/admin-options/platform/visualizations). One way to do this for admins is to: - Add a [new visualization](https://fishtown.looker.com/admin/visualizations) on the visualization page for Looker admins. You can use [this URL](https://metadata.cloud.getdbt.com/static/looker-viz.js) to configure a Looker visualization powered by the iFrame. It will look like this: - + - Once you have set up your custom visualization, you can use it on any dashboard! You can configure it with the exposure name, jobID, and token relevant to that dashboard. @@ -79,7 +79,7 @@ https://metadata.cloud.getdbt.com/exposure-tile?name=&jobId= + ### Sigma @@ -99,4 +99,4 @@ https://metadata.au.dbt.com/exposure-tile?name=&jobId=&to ``` ::: - + diff --git a/website/docs/docs/deploy/job-scheduler.md b/website/docs/docs/deploy/job-scheduler.md index fba76f677a7..7a4cd740804 100644 --- a/website/docs/docs/deploy/job-scheduler.md +++ b/website/docs/docs/deploy/job-scheduler.md @@ -31,7 +31,7 @@ Familiarize yourself with these useful terms to help you understand how the job | Over-scheduled job | A situation when a cron-scheduled job's run duration becomes longer than the frequency of the job’s schedule, resulting in a job queue that will grow faster than the scheduler can process the job’s runs. | | Prep time | The time dbt Cloud takes to create a short-lived environment to execute the job commands in the user's cloud data platform. Prep time varies most significantly at the top of the hour when the dbt Cloud Scheduler experiences a lot of run traffic. | | Run | A single, unique execution of a dbt job. | -| Run slot | Run slots control the number of jobs that can run concurrently. Developer and Team plan accounts have a fixed number of run slots, and Enterprise users have [unlimited run slots](/docs/dbt-versions/release-notes/July-2023/faster-run#unlimited-job-concurrency-for-enterprise-accounts). Each running job occupies a run slot for the duration of the run. If you need more jobs to execute in parallel, consider the [Enterprise plan](https://www.getdbt.com/pricing/) | +| Run slot | Run slots control the number of jobs that can run concurrently. Developer plans have a fixed number of run slots, while Enterprise and Team plans have [unlimited run slots](/docs/dbt-versions/release-notes/July-2023/faster-run#unlimited-job-concurrency-for-enterprise-accounts). Each running job occupies a run slot for the duration of the run.

Team and Developer plans are limited to one project each. For additional projects, consider upgrading to the [Enterprise plan](https://www.getdbt.com/pricing/).| | Threads | When dbt builds a project's DAG, it tries to parallelize the execution by using threads. The [thread](/docs/running-a-dbt-project/using-threads) count is the maximum number of paths through the DAG that dbt can work on simultaneously. The default thread count in a job is 4. | | Wait time | Amount of time that dbt Cloud waits before running a job, either because there are no available slots or because a previous run of the same job is still in progress. | diff --git a/website/docs/docs/running-a-dbt-project/using-threads.md b/website/docs/docs/running-a-dbt-project/using-threads.md index 5eede7abc27..af00dd9cc68 100644 --- a/website/docs/docs/running-a-dbt-project/using-threads.md +++ b/website/docs/docs/running-a-dbt-project/using-threads.md @@ -22,5 +22,5 @@ You will define the number of threads in your `profiles.yml` file (for dbt Core ## Related docs -- [About profiles.yml](https://docs.getdbt.com/reference/profiles.yml) +- [About profiles.yml](/docs/core/connect-data-platform/profiles.yml) - [dbt Cloud job scheduler](/docs/deploy/job-scheduler) diff --git a/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md b/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md index 665260ed9f4..11a610805a9 100644 --- a/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md +++ b/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md @@ -34,7 +34,7 @@ Use this guide to fully experience the power of the universal dbt Semantic Layer - [Define metrics](#define-metrics) in dbt using MetricFlow - [Test and query metrics](#test-and-query-metrics) with MetricFlow - [Run a production job](#run-a-production-job) in dbt Cloud -- [Set up dbt Semantic Layer](#setup) in dbt Cloud +- [Set up dbt Semantic Layer](#set-up-dbt-semantic-layer) in dbt Cloud - [Connect and query API](#connect-and-query-api) with dbt Cloud MetricFlow allows you to define metrics in your dbt project and query them whether in dbt Cloud or dbt Core with [MetricFlow commands](/docs/build/metricflow-commands). diff --git a/website/docs/faqs/API/rotate-token.md b/website/docs/faqs/API/rotate-token.md index 144c834ea8a..4470de72d5a 100644 --- a/website/docs/faqs/API/rotate-token.md +++ b/website/docs/faqs/API/rotate-token.md @@ -36,7 +36,7 @@ curl --location --request POST 'https://YOUR_ACCESS_URL/api/v2/users/YOUR_USER_I * Find your `YOUR_CURRENT_TOKEN` by going to **Profile Settings** -> **API Access** and copying the API key. * Find [`YOUR_ACCESS_URL`](/docs/cloud/about-cloud/regions-ip-addresses) for your region and plan. -:::info Example +Example: If `YOUR_USER_ID` = `123`, `YOUR_CURRENT_TOKEN` = `abcf9g`, and your `ACCESS_URL` = `cloud.getdbt.com`, then your curl request will be: @@ -44,7 +44,7 @@ If `YOUR_USER_ID` = `123`, `YOUR_CURRENT_TOKEN` = `abcf9g`, and your `ACCESS_URL curl --location --request POST 'https://cloud.getdbt.com/api/v2/users/123/apikey/' \ --header 'Authorization: Token abcf9g' ``` -::: + 2. Find the new key in the API response or in dbt Cloud. 
diff --git a/website/docs/faqs/Accounts/cloud-upgrade-instructions.md b/website/docs/faqs/Accounts/cloud-upgrade-instructions.md index f8daf393f9b..d16651a944c 100644 --- a/website/docs/faqs/Accounts/cloud-upgrade-instructions.md +++ b/website/docs/faqs/Accounts/cloud-upgrade-instructions.md @@ -6,11 +6,13 @@ description: "Instructions for upgrading a dbt Cloud account after the trial end dbt Cloud offers [several plans](https://www.getdbt.com/pricing/) with different features that meet your needs. This document is for dbt Cloud admins and explains how to select a plan in order to continue using dbt Cloud. -:::tip Before you begin -- You **_must_** be part of the [Owner](/docs/cloud/manage-access/self-service-permissions) user group to make billing changes. Users not included in this group will not see these options. +## Prerequisites + +Before you begin: +- You _must_ be part of the [Owner](/docs/cloud/manage-access/self-service-permissions) user group to make billing changes. Users not included in this group will not see these options. - All amounts shown in dbt Cloud are in U.S. Dollars (USD) - When your trial expires, your account's default plan enrollment will be a Team plan. -::: + ## Select a plan diff --git a/website/docs/faqs/Git/git-migration.md b/website/docs/faqs/Git/git-migration.md index 775ae3679e3..156227d59ae 100644 --- a/website/docs/faqs/Git/git-migration.md +++ b/website/docs/faqs/Git/git-migration.md @@ -16,7 +16,7 @@ To migrate from one git provider to another, refer to the following steps to avo 2. Go back to dbt Cloud and set up your [integration for the new git provider](/docs/cloud/git/connect-github), if needed. 3. Disconnect the old repository in dbt Cloud by going to **Account Settings** and then **Projects**. Click on the **Repository** link, then click **Edit** and **Disconnect**. - + 4. On the same page, connect to the new git provider repository by clicking **Configure Repository** - If you're using the native integration, you may need to OAuth to it. diff --git a/website/docs/faqs/Models/unique-model-names.md b/website/docs/faqs/Models/unique-model-names.md index c721fca7c6e..7878a5a704c 100644 --- a/website/docs/faqs/Models/unique-model-names.md +++ b/website/docs/faqs/Models/unique-model-names.md @@ -10,7 +10,7 @@ id: unique-model-names Within one project: yes! To build dependencies between models, you need to use the `ref` function, and pass in the model name as an argument. dbt uses that model name to uniquely resolve the `ref` to a specific model. As a result, these model names need to be unique, _even if they are in distinct folders_. -A model in one project can have the same name as a model in another project (installed as a dependency). dbt uses the project name to uniquely identify each model. We call this "namespacing." If you `ref` a model with a duplicated name, it will resolve to the model within the same namespace (package or project), or raise an error because of an ambiguous reference. Use [two-argument `ref`](/reference/dbt-jinja-functions/ref#two-argument-variant) to disambiguate references by specifying the namespace. +A model in one project can have the same name as a model in another project (installed as a dependency). dbt uses the project name to uniquely identify each model. We call this "namespacing." If you `ref` a model with a duplicated name, it will resolve to the model within the same namespace (package or project), or raise an error because of an ambiguous reference. 
Use [two-argument `ref`](/reference/dbt-jinja-functions/ref#ref-project-specific-models) to disambiguate references by specifying the namespace. Those models will still need to land in distinct locations in the data warehouse. Read the docs on [custom aliases](/docs/build/custom-aliases) and [custom schemas](/docs/build/custom-schemas) for details on how to achieve this. diff --git a/website/docs/guides/adapter-creation.md b/website/docs/guides/adapter-creation.md index 8bf082b04a0..28e0e8253ad 100644 --- a/website/docs/guides/adapter-creation.md +++ b/website/docs/guides/adapter-creation.md @@ -566,12 +566,6 @@ It should be noted that both of these files are included in the bootstrapped out ## Test your adapter -:::info - -Previously, we offered a packaged suite of tests for dbt adapter functionality: [`pytest-dbt-adapter`](https://github.com/dbt-labs/dbt-adapter-tests). We are deprecating that suite, in favor of the newer testing framework outlined in this document. - -::: - This document has two sections: 1. Refer to "About the testing framework" for a description of the standard framework that we maintain for using pytest together with dbt. It includes an example that shows the anatomy of a simple test case. diff --git a/website/docs/guides/bigquery-qs.md b/website/docs/guides/bigquery-qs.md index 9cf2447fa52..4f461a3cf3a 100644 --- a/website/docs/guides/bigquery-qs.md +++ b/website/docs/guides/bigquery-qs.md @@ -23,7 +23,6 @@ In this quickstart guide, you'll learn how to use dbt Cloud with BigQuery. It wi :::tip Videos for you You can check out [dbt Fundamentals](https://courses.getdbt.com/courses/fundamentals) for free if you're interested in course learning with videos. - ::: ### Prerequisites​ diff --git a/website/docs/guides/create-new-materializations.md b/website/docs/guides/create-new-materializations.md index af2732c0c39..52a8594b0d2 100644 --- a/website/docs/guides/create-new-materializations.md +++ b/website/docs/guides/create-new-materializations.md @@ -13,7 +13,7 @@ recently_updated: true ## Introduction -The model materializations you're familiar with, `table`, `view`, and `incremental` are implemented as macros in a package that's distributed along with dbt. You can check out the [source code for these materializations](https://github.com/dbt-labs/dbt-core/tree/main/core/dbt/include/global_project/macros/materializations). If you need to create your own materializations, reading these files is a good place to start. Continue reading below for a deep-dive into dbt materializations. +The model materializations you're familiar with, `table`, `view`, and `incremental` are implemented as macros in a package that's distributed along with dbt. You can check out the [source code for these materializations](https://github.com/dbt-labs/dbt-core/tree/main/core/dbt/adapters/include/global_project/macros/materializations). If you need to create your own materializations, reading these files is a good place to start. Continue reading below for a deep-dive into dbt materializations. :::caution diff --git a/website/docs/guides/custom-cicd-pipelines.md b/website/docs/guides/custom-cicd-pipelines.md index bd6d7617623..1778098f752 100644 --- a/website/docs/guides/custom-cicd-pipelines.md +++ b/website/docs/guides/custom-cicd-pipelines.md @@ -511,7 +511,7 @@ This section is only for those projects that connect to their git repository usi ::: -The setup for this pipeline will use the same steps as the prior page. 
Before moving on, **follow steps 1-5 from the [prior page](https://docs.getdbt.com/guides/orchestration/custom-cicd-pipelines/3-dbt-cloud-job-on-merge)** +The setup for this pipeline will use the same steps as the prior page. Before moving on, follow steps 1-5 from the [prior page](https://docs.getdbt.com/guides/custom-cicd-pipelines?step=2). ### 1. Create a pipeline job that runs when PRs are created diff --git a/website/docs/guides/databricks-qs.md b/website/docs/guides/databricks-qs.md index 5a0c5536e7f..cb01daec394 100644 --- a/website/docs/guides/databricks-qs.md +++ b/website/docs/guides/databricks-qs.md @@ -21,7 +21,6 @@ In this quickstart guide, you'll learn how to use dbt Cloud with Databricks. It :::tip Videos for you You can check out [dbt Fundamentals](https://courses.getdbt.com/courses/fundamentals) for free if you're interested in course learning with videos. - ::: ### Prerequisites​ diff --git a/website/docs/guides/debug-schema-names.md b/website/docs/guides/debug-schema-names.md index c7bf1a195b1..24b7984adf5 100644 --- a/website/docs/guides/debug-schema-names.md +++ b/website/docs/guides/debug-schema-names.md @@ -14,11 +14,8 @@ recently_updated: true ## Introduction -If a model uses the [`schema` config](/reference/resource-properties/schema) but builds under an unexpected schema, here are some steps for debugging the issue. +If a model uses the [`schema` config](/reference/resource-properties/schema) but builds under an unexpected schema, here are some steps for debugging the issue. The full explanation on custom schemas can be found [here](/docs/build/custom-schemas). -:::info -The full explanation on custom schemas can be found [here](/docs/build/custom-schemas). -::: You can also follow along via this video: @@ -94,9 +91,7 @@ Now, re-read through the logic of your `generate_schema_name` macro, and mentall You should find that the schema dbt is constructing for your model matches the output of your `generate_schema_name` macro. -:::info -Note that snapshots do not follow this behavior, check out the docs on [target_schema](/reference/resource-configs/target_schema) instead. -::: +Be careful. Snapshots do not follow this behavior, check out the docs on [target_schema](/reference/resource-configs/target_schema) instead. ## Adjust as necessary diff --git a/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md b/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md index cb3a6804247..a2967ccbe15 100644 --- a/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md +++ b/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md @@ -128,15 +128,14 @@ if __name__ == '__main__': 4. Replace **``** and **``** with the correct values of your environment and [Access URL](/docs/cloud/about-cloud/regions-ip-addresses) for your region and plan. -:::tip - To find these values, navigate to **dbt Cloud**, select **Deploy -> Jobs**. Select the Job you want to run and copy the URL. For example: `https://cloud.getdbt.com/deploy/000000/projects/111111/jobs/222222` -and therefore valid code would be: + * To find these values, navigate to **dbt Cloud**, select **Deploy -> Jobs**. Select the Job you want to run and copy the URL. 
For example: `https://cloud.getdbt.com/deploy/000000/projects/111111/jobs/222222` + and therefore valid code would be: - # Your URL is structured https:///deploy//projects//jobs/ +Your URL is structured `https:///deploy//projects//jobs/` account_id = 000000 job_id = 222222 base_url = "cloud.getdbt.com" -::: + 5. Run the Notebook. It will fail, but you should see **a `job_id` widget** at the top of your notebook. @@ -161,9 +160,7 @@ DbtJobRunStatus.RUNNING DbtJobRunStatus.SUCCESS ``` -:::note You can cancel the job from dbt Cloud if necessary. -::: ## Configure the workflows to run the dbt Cloud jobs diff --git a/website/docs/guides/manual-install-qs.md b/website/docs/guides/manual-install-qs.md index e9c1af259ac..fcd1e5e9599 100644 --- a/website/docs/guides/manual-install-qs.md +++ b/website/docs/guides/manual-install-qs.md @@ -70,7 +70,7 @@ $ pwd
-6. Update the following values in the `dbt_project.yml` file: +6. dbt provides the following values in the `dbt_project.yml` file: @@ -92,7 +92,7 @@ models: ## Connect to BigQuery -When developing locally, dbt connects to your using a [profile](/docs/core/connect-data-platform/connection-profiles), which is a YAML file with all the connection details to your warehouse. +When developing locally, dbt connects to your using a [profile](/docs/core/connect-data-platform/connection-profiles), which is a YAML file with all the connection details to your warehouse. 1. Create a file in the `~/.dbt/` directory named `profiles.yml`. 2. Move your BigQuery keyfile into this directory. diff --git a/website/docs/guides/redshift-qs.md b/website/docs/guides/redshift-qs.md index 890be27e50a..c81a4d247a5 100644 --- a/website/docs/guides/redshift-qs.md +++ b/website/docs/guides/redshift-qs.md @@ -18,10 +18,8 @@ In this quickstart guide, you'll learn how to use dbt Cloud with Redshift. It wi - Document your models - Schedule a job to run - -:::tip Videos for you +:::tip Videos for you You can check out [dbt Fundamentals](https://courses.getdbt.com/courses/fundamentals) for free if you're interested in course learning with videos. - ::: ### Prerequisites diff --git a/website/docs/guides/sl-migration.md b/website/docs/guides/sl-migration.md index 8ede40a6a2d..afa181646e3 100644 --- a/website/docs/guides/sl-migration.md +++ b/website/docs/guides/sl-migration.md @@ -25,21 +25,26 @@ dbt Labs recommends completing these steps in a local dev environment (such as t 1. Create new Semantic Model configs as YAML files in your dbt project.* 1. Upgrade the metrics configs in your project to the new spec.* 1. Delete your old metrics file or remove the `.yml` file extension so they're ignored at parse time. Remove the `dbt-metrics` package from your project. Remove any macros that reference `dbt-metrics`, like `metrics.calculate()`. Make sure that any packages you’re using don't have references to the old metrics spec. -1. Install the CLI with `python -m pip install "dbt-metricflow[your_adapter_name]"`. For example: +1. Install the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) to run MetricFlow commands and define your semantic model configurations. + - If you're using dbt Core, install the [MetricFlow CLI](/docs/build/metricflow-commands) with `python -m pip install "dbt-metricflow[your_adapter_name]"`. For example: ```bash python -m pip install "dbt-metricflow[snowflake]" ``` - **Note** - The MetricFlow CLI is not available in the IDE at this time. Support is coming soon. + **Note** - MetricFlow commands aren't yet supported in the dbt Cloud IDE. -1. Run `dbt parse`. This parses your project and creates a `semantic_manifest.json` file in your target directory. MetricFlow needs this file to query metrics. If you make changes to your configs, you will need to parse your project again. -1. Run `mf list metrics` to view the metrics in your project. -1. Test querying a metric by running `mf query --metrics --group-by `. For example: +2. Run `dbt parse`. This parses your project and creates a `semantic_manifest.json` file in your target directory. MetricFlow needs this file to query metrics. If you make changes to your configs, you will need to parse your project again. +3. Run `mf list metrics` to view the metrics in your project. +4. Test querying a metric by running `mf query --metrics --group-by `. For example: ```bash mf query --metrics revenue --group-by metric_time ``` -1. 
Run `mf validate-configs` to run semantic and warehouse validations. This ensures your configs are valid and the underlying objects exist in your warehouse. -1. Push these changes to a new branch in your repo. +5. Run `mf validate-configs` to run semantic and warehouse validations. This ensures your configs are valid and the underlying objects exist in your warehouse. +6. Push these changes to a new branch in your repo. + +:::info `ref` not supported +The dbt Semantic Layer API doesn't support `ref` to call dbt objects. This is currently due to differences in architecture between the legacy Semantic Layer and the re-released Semantic Layer. Instead, use the complete qualified table name. If you're using dbt macros at query time to calculate your metrics, you should move those calculations into your Semantic Layer metric definitions as code. +::: **To make this process easier, dbt Labs provides a [custom migration tool](https://github.com/dbt-labs/dbt-converter) that automates these steps for you. You can find installation instructions in the [README](https://github.com/dbt-labs/dbt-converter/blob/master/README.md). Derived metrics aren’t supported in the migration tool, and will have to be migrated manually.* diff --git a/website/docs/guides/sl-partner-integration-guide.md b/website/docs/guides/sl-partner-integration-guide.md index 61d558f504d..7eb158a2c85 100644 --- a/website/docs/guides/sl-partner-integration-guide.md +++ b/website/docs/guides/sl-partner-integration-guide.md @@ -15,10 +15,7 @@ recently_updated: true To fit your tool within the world of the Semantic Layer, dbt Labs offers some best practice recommendations for how to expose metrics and allow users to interact with them seamlessly. -:::note This is an evolving guide that is meant to provide recommendations based on our experience. If you have any feedback, we'd love to hear it! -::: - ### Prerequisites diff --git a/website/docs/guides/snowflake-qs.md b/website/docs/guides/snowflake-qs.md index 5b4f9e3e2be..0401c37871f 100644 --- a/website/docs/guides/snowflake-qs.md +++ b/website/docs/guides/snowflake-qs.md @@ -26,7 +26,7 @@ You can check out [dbt Fundamentals](https://courses.getdbt.com/courses/fundamen You can also watch the [YouTube video on dbt and Snowflake](https://www.youtube.com/watch?v=kbCkwhySV_I&list=PL0QYlrC86xQm7CoOH6RS7hcgLnd3OQioG). ::: - + ### Prerequisites​ - You have a [dbt Cloud account](https://www.getdbt.com/signup/). diff --git a/website/docs/reference/analysis-properties.md b/website/docs/reference/analysis-properties.md index 880aeddbb0d..1601c817830 100644 --- a/website/docs/reference/analysis-properties.md +++ b/website/docs/reference/analysis-properties.md @@ -18,6 +18,7 @@ analyses: [description](/reference/resource-properties/description): [docs](/reference/resource-configs/docs): show: true | false + node_color: # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") config: [tags](/reference/resource-configs/tags): | [] columns: diff --git a/website/docs/reference/commands/debug.md b/website/docs/reference/commands/debug.md index 4ae5a1d2dd9..e1865ff1b67 100644 --- a/website/docs/reference/commands/debug.md +++ b/website/docs/reference/commands/debug.md @@ -7,7 +7,7 @@ id: "debug" `dbt debug` is a utility function to test the database connection and display information for debugging purposes, such as the validity of your project file and your installation of any requisite dependencies (like `git` when you run `dbt deps`). 
-*Note: Not to be confused with [debug-level logging](/reference/global-configs/about-global-configs#debug-level-logging) via the `--debug` option which increases verbosity. +*Note: Not to be confused with [debug-level logging](/reference/global-configs/logs#debug-level-logging) via the `--debug` option which increases verbosity. ### Example usage diff --git a/website/docs/reference/dbt-jinja-functions/as_text.md b/website/docs/reference/dbt-jinja-functions/as_text.md deleted file mode 100644 index 6b26cfa327d..00000000000 --- a/website/docs/reference/dbt-jinja-functions/as_text.md +++ /dev/null @@ -1,58 +0,0 @@ ---- -title: "About as_text filter" -sidebar_label: "as_text" -id: "as_text" -description: "Use this filter to convert Jinja-compiled output back to text." ---- - -The `as_text` Jinja filter will coerce Jinja-compiled output back to text. It -can be used in YAML rendering contexts where values _must_ be provided as -strings, rather than as the datatype that they look like. - -:::info Heads up -In dbt v0.17.1, native rendering is not enabled by default. As such, -the `as_text` filter has no functional effect. - -It is still possible to natively render specific values using the [`as_bool`](/reference/dbt-jinja-functions/as_bool), -[`as_number`](/reference/dbt-jinja-functions/as_number), and [`as_native`](/reference/dbt-jinja-functions/as_native) filters. - -::: - -### Usage - -In the example below, the `as_text` filter is used to assert that `''` is an -empty string. In a native rendering, `''` would be coerced to the Python -keyword `None`. This specification is necessary in `v0.17.0`, but it is not -useful or necessary in later versions of dbt. - - - -```yml -models: - - name: orders - columns: - - name: order_status - tests: - - accepted_values: - values: ['pending', 'shipped', "{{ '' | as_text }}"] - -``` - - - -As of `v0.17.1`, native rendering does not occur by default, and the `as_text` -specification is superfluous. 
- - - -```yml -models: - - name: orders - columns: - - name: order_status - tests: - - accepted_values: - values: ['pending', 'shipped', ''] -``` - - diff --git a/website/docs/reference/dbt-jinja-functions/builtins.md b/website/docs/reference/dbt-jinja-functions/builtins.md index edc5f34ffda..7d970b9d5e1 100644 --- a/website/docs/reference/dbt-jinja-functions/builtins.md +++ b/website/docs/reference/dbt-jinja-functions/builtins.md @@ -42,9 +42,9 @@ From dbt v1.5 and higher, use the following macro to extract user-provided argum -- call builtins.ref based on provided positional arguments {% set rel = None %} {% if packagename is not none %} - {% set rel = return(builtins.ref(packagename, modelname, version=version)) %} + {% set rel = builtins.ref(packagename, modelname, version=version) %} {% else %} - {% set rel = return(builtins.ref(modelname, version=version)) %} + {% set rel = builtins.ref(modelname, version=version) %} {% endif %} -- finally, override the database name with "dev" diff --git a/website/docs/reference/dbt-jinja-functions/cross-database-macros.md b/website/docs/reference/dbt-jinja-functions/cross-database-macros.md index 4df8275d4bd..334bcfe5760 100644 --- a/website/docs/reference/dbt-jinja-functions/cross-database-macros.md +++ b/website/docs/reference/dbt-jinja-functions/cross-database-macros.md @@ -30,6 +30,7 @@ Please make sure to take a look at the [SQL expressions section](#sql-expression - [type\_numeric](#type_numeric) - [type\_string](#type_string) - [type\_timestamp](#type_timestamp) + - [current\_timestamp](#current_timestamp) - [Set functions](#set-functions) - [except](#except) - [intersect](#intersect) @@ -76,6 +77,7 @@ Please make sure to take a look at the [SQL expressions section](#sql-expression - [type\_numeric](#type_numeric) - [type\_string](#type_string) - [type\_timestamp](#type_timestamp) + - [current\_timestamp](#current_timestamp) - [Set functions](#set-functions) - [except](#except) - [intersect](#intersect) @@ -316,6 +318,29 @@ This macro yields the database-specific data type for a `TIMESTAMP` (which may o TIMESTAMP ``` +### current_timestamp + +This macro returns the current date and time for the system. Depending on the adapter: + +- The result may be an aware or naive timestamp. +- The result may correspond to the start of the statement or the start of the transaction. + + +**Args** +- None + +**Usage** +- You can use the `current_timestamp()` macro within your dbt SQL files like this: + +```sql +{{ dbt.current_timestamp() }} +``` +**Sample output (PostgreSQL)** + +```sql +now() +``` + ## Set functions ### except diff --git a/website/docs/reference/dbt-jinja-functions/debug-method.md b/website/docs/reference/dbt-jinja-functions/debug-method.md index 0938970b50c..778ad095693 100644 --- a/website/docs/reference/dbt-jinja-functions/debug-method.md +++ b/website/docs/reference/dbt-jinja-functions/debug-method.md @@ -6,9 +6,9 @@ description: "The `{{ debug() }}` macro will open an iPython debugger." --- -:::caution New in v0.14.1 +:::warning Development environment only -The `debug` macro is new in dbt v0.14.1, and is only intended to be used in a development context with dbt. Do not deploy code to production which uses the `debug` macro. +The `debug` macro is only intended to be used in a development context with dbt. Do not deploy code to production that uses the `debug` macro. 
::: diff --git a/website/docs/reference/dbt-jinja-functions/env_var.md b/website/docs/reference/dbt-jinja-functions/env_var.md index f4cc05cec0f..a8f2a94fbd2 100644 --- a/website/docs/reference/dbt-jinja-functions/env_var.md +++ b/website/docs/reference/dbt-jinja-functions/env_var.md @@ -100,6 +100,7 @@ select 1 as id -:::info dbt Cloud Usage +### dbt Cloud usage + If you are using dbt Cloud, you must adhere to the naming conventions for environment variables. Environment variables in dbt Cloud must be prefixed with `DBT_` (including `DBT_ENV_CUSTOM_ENV_` or `DBT_ENV_SECRET_`). Environment variables keys are uppercased and case sensitive. When referencing `{{env_var('DBT_KEY')}}` in your project's code, the key must match exactly the variable defined in dbt Cloud's UI. -::: + diff --git a/website/docs/reference/dbt-jinja-functions/ref.md b/website/docs/reference/dbt-jinja-functions/ref.md index fda5992e234..bc1f3f1ba9e 100644 --- a/website/docs/reference/dbt-jinja-functions/ref.md +++ b/website/docs/reference/dbt-jinja-functions/ref.md @@ -3,6 +3,7 @@ title: "About ref function" sidebar_label: "ref" id: "ref" description: "Read this guide to understand the builtins Jinja function in dbt." +keyword: dbt mesh, project dependencies, ref, cross project ref, project dependencies --- The most important function in dbt is `ref()`; it's impossible to build even moderately complex models without it. `ref()` is how you reference one model within another. This is a very common behavior, as typically models are built to be "stacked" on top of one another. Here is how this looks in practice: @@ -68,15 +69,19 @@ select * from {{ ref('model_name', version=1) }} select * from {{ ref('model_name') }} ``` -### Two-argument variant +### Ref project-specific models -You can also use a two-argument variant of the `ref` function. With this variant, you can pass both a namespace (project or package) and model name to `ref` to avoid ambiguity. When using two arguments with projects (not packages), you also need to set [cross project dependencies](/docs/collaborate/govern/project-dependencies). +You can also reference models from different projects using the two-argument variant of the `ref` function. By specifying both a namespace (which could be a project or package) and a model name, you ensure clarity and avoid any ambiguity in the `ref`. This is also useful when dealing with models across various projects or packages. + +When using two arguments with projects (not packages), you also need to set [cross project dependencies](/docs/collaborate/govern/project-dependencies). + +The following syntax demonstrates how to reference a model from a specific project or package: ```sql select * from {{ ref('project_or_package', 'model_name') }} ``` -We recommend using two-argument `ref` any time you are referencing a model defined in a different package or project. While not required in all cases, it's more explicit for you, for dbt, and for future readers of your code. +We recommend using two-argument `ref` any time you are referencing a model defined in a different package or project. While not required in all cases, it's more explicit for you, for dbt, and future readers of your code. 
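To make the two-argument `ref` described above concrete, here is an illustrative sketch that assumes a hypothetical upstream project named `jaffle_finance` exposing a `monthly_revenue` model; the project name, model name, and `dependencies.yml` entry are examples only, so refer to the cross-project dependencies page linked above for the exact setup.

```yml
# dependencies.yml (illustrative): declare the upstream project so dbt can resolve cross-project refs
projects:
  - name: jaffle_finance
```

```sql
-- Downstream model (illustrative): the first argument names the upstream project, removing any ambiguity
select * from {{ ref('jaffle_finance', 'monthly_revenue') }}
```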
diff --git a/website/docs/reference/dbt-jinja-functions/target.md b/website/docs/reference/dbt-jinja-functions/target.md index e7d08db592f..968f64d0f8d 100644 --- a/website/docs/reference/dbt-jinja-functions/target.md +++ b/website/docs/reference/dbt-jinja-functions/target.md @@ -1,20 +1,18 @@ --- -title: "About target variable" +title: "About target variables" sidebar_label: "target" id: "target" -description: "Contains information about your connection to the warehouse." +description: "The `target` variable contains information about your connection to the warehouse." --- -`target` contains information about your connection to the warehouse. +The `target` variable contains information about your connection to the warehouse. -* **dbt Core:** These values are based on the target defined in your [`profiles.yml` file](/docs/core/connect-data-platform/profiles.yml) -* **dbt Cloud Scheduler:** - * `target.name` is defined per job as described [here](/docs/build/custom-target-names). - * For all other attributes, the values are defined by the deployment connection. To check these values, click **Deploy** from the upper left and select **Environments**. Then, select the relevant deployment environment, and click **Settings**. -* **dbt Cloud IDE:** The values are defined by your connection and credentials. To check any of these values, head to your account (via your profile image in the top right hand corner), and select the project under "Credentials". +- **dbt Core:** These values are based on the target defined in your [profiles.yml](/docs/core/connect-data-platform/profiles.yml) file. Please note that for certain adapters, additional configuration steps may be required. Refer to the [set up page](/docs/core/connect-data-platform/about-core-connections) for your data platform. +- **dbt Cloud** To learn more about setting up your adapter in dbt Cloud, refer to [About data platform connections](/docs/cloud/connect-data-platform/about-connections). + - **[dbt Cloud Scheduler](/docs/deploy/job-scheduler)**: `target.name` is defined per job as described in [Custom target names](/docs/build/custom-target-names). For other attributes, values are defined by the deployment connection. To check these values, click **Deploy** and select **Environments**. Then, select the relevant deployment environment, and click **Settings**. + - **[dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud)**: These values are defined by your connection and credentials. To edit these values, click the gear icon in the top right, select **Profile settings**, and click **Credentials**. Select and edit a project to set up the credentials and target name. - -Some configs are shared between all adapters, while others are adapter-specific. +Some configurations are shared between all adapters, while others are adapter-specific. ## Common | Variable | Example | Description | @@ -54,6 +52,7 @@ Some configs are shared between all adapters, while others are adapter-specific. | `target.dataset` | dbt_alice | The dataset the active profile | ## Examples + ### Use `target.name` to limit data in dev As long as you use sensible target names, you can perform conditional logic to limit data when working in dev. @@ -68,6 +67,7 @@ where created_at >= dateadd('day', -3, current_date) ``` ### Use `target.name` to change your source database + If you have specific Snowflake databases configured for your dev/qa/prod environments, you can set up your sources to compile to different databases depending on your environment. 
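A minimal sketch of the `target.name` pattern described above, assuming hypothetical `raw_dev`, `raw_qa`, and `raw_prod` source databases and target names of `dev`, `qa`, and `prod`; adjust the names to match your own environments.

```yml
version: 2

sources:
  - name: jaffle_shop
    # Compile the source database based on the current target name
    database: |
      {%- if target.name == "dev" -%} raw_dev
      {%- elif target.name == "qa" -%} raw_qa
      {%- elif target.name == "prod" -%} raw_prod
      {%- else -%} invalid_database
      {%- endif -%}
    schema: jaffle_shop
    tables:
      - name: orders
```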
diff --git a/website/docs/reference/dbt_project.yml.md b/website/docs/reference/dbt_project.yml.md index a5ad601f78b..ae911200b40 100644 --- a/website/docs/reference/dbt_project.yml.md +++ b/website/docs/reference/dbt_project.yml.md @@ -1,6 +1,8 @@ Every [dbt project](/docs/build/projects) needs a `dbt_project.yml` file — this is how dbt knows a directory is a dbt project. It also contains important information that tells dbt how to operate your project. +dbt uses [YAML](https://yaml.org/) in a few different places. If you're new to YAML, it would be worth learning how arrays, dictionaries, and strings are represented. + By default, dbt will look for `dbt_project.yml` in your current working directory and its parents, but you can set a different directory using the `--project-dir` flag. @@ -15,11 +17,6 @@ Starting from dbt v1.5 and higher, you can specify your dbt Cloud project ID in -:::info YAML syntax -dbt uses YAML in a few different places. If you're new to YAML, it would be worth taking the time to learn how arrays, dictionaries, and strings are represented. -::: - - Something to note, you can't set up a "property" in the `dbt_project.yml` file if it's not a config (an example is [macros](/reference/macro-properties)). This applies to all types of resources. Refer to [Configs and properties](/reference/configs-and-properties) for more detail. The following example is a list of all available configurations in the `dbt_project.yml` file: diff --git a/website/docs/reference/global-configs/print-output.md b/website/docs/reference/global-configs/print-output.md index 112b92b546f..78de635f2dd 100644 --- a/website/docs/reference/global-configs/print-output.md +++ b/website/docs/reference/global-configs/print-output.md @@ -8,35 +8,17 @@ sidebar: "Print output" -By default, dbt includes `print()` messages in standard out (stdout). You can use the `NO_PRINT` config to prevent these messages from showing up in stdout. - - - -```yaml -config: - no_print: true -``` - - +By default, dbt includes `print()` messages in standard out (stdout). You can use the `DBT_NO_PRINT` environment variable to prevent these messages from showing up in stdout. -By default, dbt includes `print()` messages in standard out (stdout). You can use the `PRINT` config to prevent these messages from showing up in stdout. - - - -```yaml -config: - print: false -``` - - +By default, dbt includes `print()` messages in standard out (stdout). You can use the `DBT_PRINT` environment variable to prevent these messages from showing up in stdout. :::warning Syntax deprecation -The original `NO_PRINT` syntax has been deprecated, starting with dbt v1.5. Backward compatibility is supported but will be removed in an as-of-yet-undetermined future release. +The original `DBT_NO_PRINT` environment variable has been deprecated, starting with dbt v1.5. Backward compatibility is supported but will be removed in an as-of-yet-undetermined future release. ::: @@ -46,8 +28,6 @@ Supply `--no-print` flag to `dbt run` to suppress `print()` messages from showin ```text dbt --no-print run -... - ``` ### Printer width diff --git a/website/docs/reference/global-configs/usage-stats.md b/website/docs/reference/global-configs/usage-stats.md index 1f9492f4a43..01465bcac2a 100644 --- a/website/docs/reference/global-configs/usage-stats.md +++ b/website/docs/reference/global-configs/usage-stats.md @@ -18,4 +18,3 @@ config: dbt Core users can also use the DO_NOT_TRACK environment variable to enable or disable sending anonymous data. 
For more information, see [Environment variables](/docs/build/environment-variables). `DO_NOT_TRACK=1` is the same as `DBT_SEND_ANONYMOUS_USAGE_STATS=False` -`DO_NOT_TRACK=0` is the same as `DBT_SEND_ANONYMOUS_USAGE_STATS=True` diff --git a/website/docs/reference/model-properties.md b/website/docs/reference/model-properties.md index 65f9307b5b3..46fb0ca3bad 100644 --- a/website/docs/reference/model-properties.md +++ b/website/docs/reference/model-properties.md @@ -16,6 +16,7 @@ models: [description](/reference/resource-properties/description): [docs](/reference/resource-configs/docs): show: true | false + node_color: # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") [latest_version](/reference/resource-properties/latest_version): [deprecation_date](/reference/resource-properties/deprecation_date): [access](/reference/resource-configs/access): private | protected | public diff --git a/website/docs/reference/node-selection/methods.md b/website/docs/reference/node-selection/methods.md index 61fd380e11b..549bc5d45e1 100644 --- a/website/docs/reference/node-selection/methods.md +++ b/website/docs/reference/node-selection/methods.md @@ -8,9 +8,6 @@ you can omit it (the default value will be one of `path`, `file` or `fqn`). -:::info New functionality -New in v1.5! -::: Many of the methods below support Unix-style wildcards: diff --git a/website/docs/reference/node-selection/syntax.md b/website/docs/reference/node-selection/syntax.md index 22946903b7d..61b53ea5ebd 100644 --- a/website/docs/reference/node-selection/syntax.md +++ b/website/docs/reference/node-selection/syntax.md @@ -158,7 +158,6 @@ If both the flag and env var are provided, the flag takes precedence. #### Notes: - The `--state` artifacts must be of schema versions that are compatible with the currently running dbt version. -- The path to state artifacts can be set via the `--state` flag or `DBT_ARTIFACT_STATE_PATH` environment variable. If both the flag and env var are provided, the flag takes precedence. - These are powerful, complex features. Read about [known caveats and limitations](/reference/node-selection/state-comparison-caveats) to state comparison. ### The "result" status @@ -174,7 +173,7 @@ The following dbt commands produce `run_results.json` artifacts whose results ca After issuing one of the above commands, you can reference the results by adding a selector to a subsequent command as follows: ```bash -# You can also set the DBT_ARTIFACT_STATE_PATH environment variable instead of the --state flag. +# You can also set the DBT_STATE environment variable instead of the --state flag. dbt run --select "result: --defer --state path/to/prod/artifacts" ``` diff --git a/website/docs/reference/parsing.md b/website/docs/reference/parsing.md index 1a68ba0d476..6eed4c96af0 100644 --- a/website/docs/reference/parsing.md +++ b/website/docs/reference/parsing.md @@ -41,7 +41,7 @@ The [`PARTIAL_PARSE` global config](/reference/global-configs/parsing) can be en Parse-time attributes (dependencies, configs, and resource properties) are resolved using the parse-time context. When partial parsing is enabled, and certain context variables change, those attributes will _not_ be re-resolved, and are likely to become stale. 
-In particular, you may see **incorrect results** if these attributes depend on "volatile" context variables, such as [`run_started_at`](/reference/dbt-jinja-functions/run_started_at), [`invocation_id`](/reference/dbt-jinja-functions/invocation_id), or [flags](/reference/dbt-jinja-functions/flags). These variables are likely (or even guaranteed!) to change in each invocation. We _highly discourage_ you from using these variables to set parse-time attributes (dependencies, configs, and resource properties). +In particular, you may see incorrect results if these attributes depend on "volatile" context variables, such as [`run_started_at`](/reference/dbt-jinja-functions/run_started_at), [`invocation_id`](/reference/dbt-jinja-functions/invocation_id), or [flags](/reference/dbt-jinja-functions/flags). These variables are likely (or even guaranteed!) to change in each invocation. dbt Labs _strongly discourages_ you from using these variables to set parse-time attributes (dependencies, configs, and resource properties). Starting in v1.0, dbt _will_ detect changes in environment variables. It will selectively re-parse only the files that depend on that [`env_var`](/reference/dbt-jinja-functions/env_var) value. (If the env var is used in `profiles.yml` or `dbt_project.yml`, a full re-parse is needed.) However, dbt will _not_ re-render **descriptions** that include env vars. If your descriptions include frequently changing env vars (this is highly uncommon), we recommend that you fully re-parse when generating documentation: `dbt --no-partial-parse docs generate`. @@ -51,7 +51,9 @@ If certain inputs change between runs, dbt will trigger a full re-parse. The res - `dbt_project.yml` content (or `env_var` values used within) - installed packages - dbt version -- certain widely-used macros, e.g. [builtins](/reference/dbt-jinja-functions/builtins) overrides or `generate_x_name` for `database`/`schema`/`alias` +- certain widely-used macros (for example, [builtins](/reference/dbt-jinja-functions/builtins), overrides, or `generate_x_name` for `database`/`schema`/`alias`) + +If you're triggering [CI](/docs/deploy/continuous-integration) job runs, the benefits of partial parsing are not applicable to new pull requests (PR) or new branches. However, they are applied on subsequent commits to the new PR or branch. If you ever get into a bad state, you can disable partial parsing and trigger a full re-parse by setting the `PARTIAL_PARSE` global config to false, or by deleting `target/partial_parse.msgpack` (e.g. by running `dbt clean`). diff --git a/website/docs/reference/project-configs/clean-targets.md b/website/docs/reference/project-configs/clean-targets.md index 9b464840723..8ca4065ed75 100644 --- a/website/docs/reference/project-configs/clean-targets.md +++ b/website/docs/reference/project-configs/clean-targets.md @@ -19,10 +19,10 @@ Optionally specify a custom list of directories to be removed by the `dbt clean` If this configuration is not included in your `dbt_project.yml` file, the `clean` command will remove files in your [target-path](/reference/project-configs/target-path). ## Examples -### Remove packages and compiled files as part of `dbt clean` -:::info -This is our preferred configuration, but is not the default. 
-::: + +### Remove packages and compiled files as part of `dbt clean` (preferred) {#remove-packages-and-compiled-files-as-part-of-dbt-clean} + + To remove packages as well as compiled files, include the value of your [packages-install-path](/reference/project-configs/packages-install-path) configuration in your `clean-targets` configuration. diff --git a/website/docs/reference/project-configs/docs-paths.md b/website/docs/reference/project-configs/docs-paths.md index 2aee7b31ee7..910cfbb0cce 100644 --- a/website/docs/reference/project-configs/docs-paths.md +++ b/website/docs/reference/project-configs/docs-paths.md @@ -20,12 +20,9 @@ Optionally specify a custom list of directories where [docs blocks](/docs/collab By default, dbt will search in all resource paths for docs blocks (i.e. the combined list of [model-paths](/reference/project-configs/model-paths), [seed-paths](/reference/project-configs/seed-paths), [analysis-paths](/reference/project-configs/analysis-paths), [macro-paths](/reference/project-configs/macro-paths) and [snapshot-paths](/reference/project-configs/snapshot-paths)). If this option is configured, dbt will _only_ look in the specified directory for docs blocks. -## Examples -:::info -We typically omit this configuration as we prefer dbt's default behavior. -::: +## Example -### Use a subdirectory named `docs` for docs blocks +Use a subdirectory named `docs` for docs blocks: @@ -34,3 +31,5 @@ docs-paths: ["docs"] ``` + +**Note:** We typically omit this configuration as we prefer dbt's default behavior. diff --git a/website/docs/reference/project-configs/require-dbt-version.md b/website/docs/reference/project-configs/require-dbt-version.md index 85a502bff60..6b17bb46120 100644 --- a/website/docs/reference/project-configs/require-dbt-version.md +++ b/website/docs/reference/project-configs/require-dbt-version.md @@ -19,7 +19,7 @@ When you set this configuration, dbt sends a helpful error message for any user If this configuration is not specified, no version check will occur. -:::info YAML Quoting +### YAML quoting This configuration needs to be interpolated by the YAML parser as a string. As such, you should quote the value of the configuration, taking care to avoid whitespace. For example: ```yml @@ -32,8 +32,6 @@ require-dbt-version: >=1.0.0 # No quotes? No good require-dbt-version: ">= 1.0.0" # Don't put whitespace after the equality signs ``` -::: - ## Examples @@ -73,18 +71,18 @@ require-dbt-version: ">=1.0.0,<2.0.0" ### Require a specific dbt version -:::caution Not recommended -With the release of major version 1.0 of dbt Core, pinning to a specific patch is discouraged. -::: + +:::info Not recommended +Pinning to a specific dbt version is discouraged because it limits project flexibility and can cause compatibility issues, especially with dbt packages. It's recommended to [pin to a major release](#pin-to-a-range), using a version range (for example, `">=1.0.0", "<2.0.0"`) for broader compatibility and to benefit from updates. While you can restrict your project to run only with an exact version of dbt Core, we do not recommend this for dbt Core v1.0.0 and higher. -In the following example, the project will only run with dbt v0.21.1. 
+In the following example, the project will only run with dbt v1.5: ```yml -require-dbt-version: 0.21.1 +require-dbt-version: 1.5 ``` diff --git a/website/docs/reference/resource-configs/bigquery-configs.md b/website/docs/reference/resource-configs/bigquery-configs.md index 8f323bc4236..94d06311c55 100644 --- a/website/docs/reference/resource-configs/bigquery-configs.md +++ b/website/docs/reference/resource-configs/bigquery-configs.md @@ -596,9 +596,9 @@ with events as ( -#### Copying ingestion-time partitions +#### Copying partitions -If you have configured your incremental model to use "ingestion"-based partitioning (`partition_by.time_ingestion_partitioning: True`), you can opt to use a legacy mechanism for inserting and overwriting partitions. While this mechanism doesn't offer the same visibility and ease of debugging as the SQL `merge` statement, it can yield significant savings in time and cost for large datasets. Behind the scenes, dbt will add or replace each partition via the [copy table API](https://cloud.google.com/bigquery/docs/managing-tables#copy-table) and partition decorators. +If you are replacing entire partitions in your incremental runs, you can opt to do so with the [copy table API](https://cloud.google.com/bigquery/docs/managing-tables#copy-table) and partition decorators rather than a `merge` statement. While this mechanism doesn't offer the same visibility and ease of debugging as the SQL `merge` statement, it can yield significant savings in time and cost for large datasets because the copy table API does not incur any costs for inserting the data - it's equivalent to the `bq cp` gcloud command line interface (CLI) command. You can enable this by switching on `copy_partitions: True` in the `partition_by` configuration. This approach works only in combination with "dynamic" partition replacement. diff --git a/website/docs/reference/resource-configs/contract.md b/website/docs/reference/resource-configs/contract.md index ccc10099a12..6c11b08dd62 100644 --- a/website/docs/reference/resource-configs/contract.md +++ b/website/docs/reference/resource-configs/contract.md @@ -6,16 +6,7 @@ default_value: {contract: false} id: "contract" --- -:::info New functionality -This functionality is new in v1.5. -::: - -## Related documentation -- [What is a model contract?](/docs/collaborate/govern/model-contracts) -- [Defining `columns`](/reference/resource-properties/columns) -- [Defining `constraints`](/reference/resource-properties/constraints) - -# Definition +Supported in dbt v1.5 and higher. When the `contract` configuration is enforced, dbt will ensure that your model's returned dataset exactly matches the attributes you have defined in yaml: - `name` and `data_type` for every column @@ -120,3 +111,8 @@ Imagine: - The result is a delta between the yaml-defined contract, and the actual table in the database - which means the contract is now incorrect! Why `append_new_columns`, rather than `sync_all_columns`? Because removing existing columns is a breaking change for contracted models! 
+ +## Related documentation +- [What is a model contract?](/docs/collaborate/govern/model-contracts) +- [Defining `columns`](/reference/resource-properties/columns) +- [Defining `constraints`](/reference/resource-properties/constraints) \ No newline at end of file diff --git a/website/docs/reference/resource-configs/delimiter.md b/website/docs/reference/resource-configs/delimiter.md index 58d6ba8344a..5cc5ddaf44b 100644 --- a/website/docs/reference/resource-configs/delimiter.md +++ b/website/docs/reference/resource-configs/delimiter.md @@ -4,19 +4,14 @@ datatype: default_value: "," --- +Supported in v1.7 and higher. + ## Definition You can use this optional seed configuration to customize how you separate values in a [seed](/docs/build/seeds) by providing the one-character string. * The delimiter defaults to a comma when not specified. * Explicitly set the `delimiter` configuration value if you want seed files to use a different delimiter, such as "|" or ";". - -:::info New in 1.7! - -Delimiter is new functionality available beginning with dbt Core v1.7. - -::: - ## Usage diff --git a/website/docs/reference/resource-configs/docs.md b/website/docs/reference/resource-configs/docs.md index d5f7b6499d8..bb0f3714dd4 100644 --- a/website/docs/reference/resource-configs/docs.md +++ b/website/docs/reference/resource-configs/docs.md @@ -30,6 +30,7 @@ models: [](/reference/resource-configs/resource-path): +docs: show: true | false + node_color: color_id # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") ``` @@ -44,7 +45,7 @@ models: - name: model_name docs: show: true | false - node_color: "black" + node_color: color_id # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") ``` @@ -67,7 +68,7 @@ seeds: [](/reference/resource-configs/resource-path): +docs: show: true | false - + node_color: color_id # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") ``` @@ -81,6 +82,7 @@ seeds: - name: seed_name docs: show: true | false + node_color: color_id # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") ``` @@ -97,6 +99,7 @@ snapshots: [](/reference/resource-configs/resource-path): +docs: show: true | false + node_color: color_id # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") ``` @@ -111,6 +114,7 @@ snapshots: - name: snapshot_name docs: show: true | false + node_color: color_id # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") ``` @@ -130,6 +134,7 @@ analyses: - name: analysis_name docs: show: true | false + node_color: color_id # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") ``` @@ -156,7 +161,7 @@ macros: ## Definition -The docs field can be used to provide documentation-specific configuration to models. It supports the doc attribute `show`, which controls whether or not models are shown in the auto-generated documentation website. It also supports `node_color` for some node types. +The docs field can be used to provide documentation-specific configuration to models. It supports the doc attribute `show`, which controls whether or not models are shown in the auto-generated documentation website. It also supports `node_color` for models, seeds, snapshots, and analyses. Other node types are not supported. 
**Note:** Hidden models will still appear in the dbt DAG visualization but will be identified as "hidden.” @@ -204,9 +209,9 @@ models: ## Custom node colors -The `docs` attribute now supports `node_color` to customize the display color of some node types in the DAG within dbt docs. You can define node colors in the files below and apply overrides where needed. +The `docs` attribute now supports `node_color` to customize the display color of some node types in the DAG within dbt docs. You can define node colors in the following files and apply overrides where needed. Note, you need to run or re-run the command `dbt docs generate`. -`node_color` hiearchy: +`node_color` hierarchy: `` overrides `schema.yml` overrides `dbt_project.yml` diff --git a/website/docs/reference/resource-configs/full_refresh.md b/website/docs/reference/resource-configs/full_refresh.md index f75fe3a583b..c7f1b799087 100644 --- a/website/docs/reference/resource-configs/full_refresh.md +++ b/website/docs/reference/resource-configs/full_refresh.md @@ -74,7 +74,7 @@ Optionally set a resource to always or never full-refresh. -This logic is encoded in the [`should_full_refresh()`](https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/include/global_project/macros/materializations/configs.sql#L6) macro. +This logic is encoded in the [`should_full_refresh()`](https://github.com/dbt-labs/dbt-core/blob/main/core/dbt/adapters/include/global_project/macros/materializations/configs.sql#L6) macro. ## Usage diff --git a/website/docs/reference/resource-configs/group.md b/website/docs/reference/resource-configs/group.md index a71935013c4..e8370d18638 100644 --- a/website/docs/reference/resource-configs/group.md +++ b/website/docs/reference/resource-configs/group.md @@ -3,10 +3,6 @@ resource_types: [models, seeds, snapshots, tests, analyses, metrics] id: "group" --- -:::info New functionality -This functionality is new in v1.5. -::: - _strategy` -* [Source code](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/materializations/snapshots/strategies.sql#L65) for the timestamp strategy -* [Source code](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/include/global_project/macros/materializations/snapshots/strategies.sql#L131) for the check strategy +* [Source code](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/adapters/include/global_project/macros/materializations/snapshots/strategies.sql#L52) for the timestamp strategy +* [Source code](https://github.com/dbt-labs/dbt-core/blob/HEAD/core/dbt/adapters/include/global_project/macros/materializations/snapshots/strategies.sql#L136) for the check strategy It's possible to implement your own snapshot strategy by adding a macro with the same naming pattern to your project. For example, you might choose to create a strategy which records hard deletes, named `timestamp_with_deletes`. 
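To illustrate the naming pattern mentioned above, here is a minimal, hypothetical sketch of a custom strategy macro. It simply delegates to the built-in timestamp strategy; the argument list mirrors the built-in strategy macros linked earlier, but verify the exact signature and the dictionary it returns against the source code for your dbt version before relying on it.

```sql
{% macro snapshot_timestamp_with_deletes_strategy(node, snapshotted_rel, current_rel, config, target_exists) %}
    {#- Illustrative only: start from the built-in timestamp strategy -#}
    {% set strategy = snapshot_timestamp_strategy(node, snapshotted_rel, current_rel, config, target_exists) %}
    {#- Customize the returned dictionary here, for example by adjusting the row_changed expression to record hard deletes -#}
    {% do return(strategy) %}
{% endmacro %}
```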
diff --git a/website/docs/reference/resource-configs/vertica-configs.md b/website/docs/reference/resource-configs/vertica-configs.md index 598bc3fecee..90badfe29ad 100644 --- a/website/docs/reference/resource-configs/vertica-configs.md +++ b/website/docs/reference/resource-configs/vertica-configs.md @@ -99,7 +99,7 @@ You can use `on_schema_change` parameter with values `ignore`, `fail` and `appen -#### Configuring the `apppend_new_columns` parameter +#### Configuring the `append_new_columns` parameter 0" tests: - - unique # primary_key constraint is not enforced + - unique # need this test because primary_key constraint is not enforced - name: customer_name data_type: text - name: first_transaction_date @@ -304,7 +300,7 @@ select
-BigQuery allows defining `not null` constraints. However, it does _not_ support or enforce the definition of unenforced constraints, such as `primary key`. +BigQuery allows defining and enforcing `not null` constraints, and defining (but _not_ enforcing) `primary key` and `foreign key` constraints (which can be used for query optimization). BigQuery does not support defining or enforcing other constraints. For more information, refer to [Platform constraint support](/docs/collaborate/govern/model-contracts#platform-constraint-support) Documentation: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language diff --git a/website/docs/reference/resource-properties/database.md b/website/docs/reference/resource-properties/database.md index c2f6ba76dd8..59159495435 100644 --- a/website/docs/reference/resource-properties/database.md +++ b/website/docs/reference/resource-properties/database.md @@ -26,12 +26,9 @@ The database that your source is stored in. Note that to use this parameter, your warehouse must allow cross-database queries. -:::info #### BigQuery terminology -If you're using BigQuery, use the _project_ name as the `database:` property. -::: +If you're using BigQuery, use the _project_ name as the `database:` property. ## Default By default, dbt will search in your target database (i.e. the database that you are creating tables and views). diff --git a/website/docs/reference/resource-properties/freshness.md b/website/docs/reference/resource-properties/freshness.md index f332f5a1b8f..0b017991d68 100644 --- a/website/docs/reference/resource-properties/freshness.md +++ b/website/docs/reference/resource-properties/freshness.md @@ -37,6 +37,38 @@ A freshness block is used to define the acceptable amount of time between the mo In the `freshness` block, one or both of `warn_after` and `error_after` can be provided. If neither is provided, then dbt will not calculate freshness snapshots for the tables in this source. + + +In most cases, the `loaded_at_field` is required. Some adapters support calculating source freshness from the warehouse metadata tables and can exclude the `loaded_at_field`. + +If a source has a `freshness:` block, dbt will attempt to calculate freshness for that source: +- If a `loaded_at_field` is provided, dbt will calculate freshness via a select query (behavior prior to v1.7). +- If a `loaded_at_field` is _not_ provided, dbt will calculate freshness via warehouse metadata tables when possible (new in v1.7 on supported adapters). + +Currently, calculating freshness from warehouse metadata tables is supported on: +- [Snowflake](/reference/resource-configs/snowflake-configs) + +Support is coming soon to the following adapters: +- [Redshift](/reference/resource-configs/redshift-configs) +- [BigQuery](/reference/resource-configs/bigquery-configs) +- [Spark](/reference/resource-configs/spark-configs) + +Freshness blocks are applied hierarchically: +- a `freshness` and `loaded_at_field` property added to a source will be applied to all tables defined in that source +- a `freshness` and `loaded_at_field` property added to a source _table_ will override any properties applied to the source. + +This is useful when all of the tables in a source have the same `loaded_at_field`, as is often the case. + +To exclude a source from freshness calculations, you have two options: - Don't add a `freshness:` block. - Explicitly set `freshness: null`. 
+ +## loaded_at_field +(Optional on adapters that support pulling freshness from warehouse metadata tables, required otherwise.) + + + + Additionally, the `loaded_at_field` is required to calculate freshness for a table. If a `loaded_at_field` is not provided, then dbt will not calculate freshness for the table. Freshness blocks are applied hierarchically: @@ -47,7 +79,7 @@ This is useful when all of the tables in a source have the same `loaded_at_field ## loaded_at_field (Required) - + A column name (or expression) that returns a timestamp indicating freshness. If using a date field, you may have to cast it to a timestamp: diff --git a/website/docs/reference/resource-properties/schema.md b/website/docs/reference/resource-properties/schema.md index 9e6a09b8569..157a9ffc0a2 100644 --- a/website/docs/reference/resource-properties/schema.md +++ b/website/docs/reference/resource-properties/schema.md @@ -27,12 +27,10 @@ The schema name as stored in the database. This parameter is useful if you want to use a source name that differs from the schema name. -:::info #### BigQuery terminology -If you're using BigQuery, use the _dataset_ name as the `schema:` property. -::: +If you're using BigQuery, use the _dataset_ name as the `schema:` property. ## Default By default, dbt will use the source's `name:` parameter as the schema name. diff --git a/website/docs/reference/seed-properties.md b/website/docs/reference/seed-properties.md index 9201df65f4c..ebe222dd11c 100644 --- a/website/docs/reference/seed-properties.md +++ b/website/docs/reference/seed-properties.md @@ -16,6 +16,7 @@ seeds: [description](/reference/resource-properties/description): [docs](/reference/resource-configs/docs): show: true | false + node_color: # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") [config](/reference/resource-properties/config): [](/reference/seed-configs): [tests](/reference/resource-properties/data-tests): diff --git a/website/docs/reference/snapshot-properties.md b/website/docs/reference/snapshot-properties.md index 8f01fd8e988..49769af8f6d 100644 --- a/website/docs/reference/snapshot-properties.md +++ b/website/docs/reference/snapshot-properties.md @@ -20,6 +20,7 @@ snapshots: [meta](/reference/resource-configs/meta): {} [docs](/reference/resource-configs/docs): show: true | false + node_color: # Use name (such as node_color: purple) or hex code with quotes (such as node_color: "#cd7f32") [config](/reference/resource-properties/config): [](/reference/snapshot-configs): [tests](/reference/resource-properties/data-tests): diff --git a/website/docs/sql-reference/aggregate-functions/sql-avg.md b/website/docs/sql-reference/aggregate-functions/sql-avg.md index d1dba119292..1512cee7763 100644 --- a/website/docs/sql-reference/aggregate-functions/sql-avg.md +++ b/website/docs/sql-reference/aggregate-functions/sql-avg.md @@ -17,6 +17,8 @@ The AVG function is a part of the group of mathematical or aggregate functions ( ### AVG function example +The following example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop): + ```sql select date_trunc('month', order_date) as order_month, @@ -26,10 +28,6 @@ where status not in ('returned', 'return_pending') group by 1 ``` -:::note What dataset is this? -This example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop). 
-::: - This query using the Jaffle Shop’s `orders` table will return the rounded order amount per each order month: | order_month | avg_order_amount | diff --git a/website/docs/sql-reference/aggregate-functions/sql-count.md b/website/docs/sql-reference/aggregate-functions/sql-count.md index d65c670df90..1438b7c11d5 100644 --- a/website/docs/sql-reference/aggregate-functions/sql-count.md +++ b/website/docs/sql-reference/aggregate-functions/sql-count.md @@ -25,6 +25,8 @@ Let’s take a look at a practical example using COUNT, DISTINCT, and GROUP BY b ### COUNT example +The following example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop): + ```sql select date_part('month', order_date) as order_month, @@ -34,9 +36,6 @@ from {{ ref('orders') }} group by 1 ``` -:::note What dataset is this? -This example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop). -::: This simple query is something you may do while doing initial exploration of your data; it will return the count of `order_ids` and count of distinct `customer_ids` per order month that appear in the Jaffle Shop’s `orders` table: diff --git a/website/docs/sql-reference/aggregate-functions/sql-max.md b/website/docs/sql-reference/aggregate-functions/sql-max.md index 0b5dc5521ea..fab72770af5 100644 --- a/website/docs/sql-reference/aggregate-functions/sql-max.md +++ b/website/docs/sql-reference/aggregate-functions/sql-max.md @@ -25,6 +25,8 @@ Let’s take a look at a practical example using MAX and GROUP BY below. ### MAX example +The following example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop): + ```sql select date_part('month', order_date) as order_month, @@ -33,10 +35,6 @@ from {{ ref('orders') }} group by 1 ``` -:::note What dataset is this? -This example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop). -::: - This simple query is something you may do while doing initial exploration of your data; it will return the maximum order `amount` per order month that appear in the Jaffle Shop’s `orders` table: | order_month | max_amount | diff --git a/website/docs/sql-reference/aggregate-functions/sql-min.md b/website/docs/sql-reference/aggregate-functions/sql-min.md index 6080bb20c0b..95de0af8df3 100644 --- a/website/docs/sql-reference/aggregate-functions/sql-min.md +++ b/website/docs/sql-reference/aggregate-functions/sql-min.md @@ -27,6 +27,8 @@ Let’s take a look at a practical example using MIN below. ### MIN example +The following example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop): + ```sql select customer_id, @@ -37,10 +39,6 @@ group by 1 limit 3 ``` -:::note What dataset is this? -This example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop). 
-::: - This simple query is returning the first and last order date for a customer in the Jaffle Shop’s `orders` table: | customer_id | first_order_date | last_order_date | diff --git a/website/docs/sql-reference/aggregate-functions/sql-round.md b/website/docs/sql-reference/aggregate-functions/sql-round.md index bc9669e22cb..a080f5a63e5 100644 --- a/website/docs/sql-reference/aggregate-functions/sql-round.md +++ b/website/docs/sql-reference/aggregate-functions/sql-round.md @@ -24,11 +24,8 @@ In this function, you’ll need to input the *numeric* field or data you want ro ### SQL ROUND function example -:::note What dataset is this? -This example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop). -::: -You can round some of the numeric fields of the Jaffle Shop’s `orders` model using the following code: +You can round some of the numeric fields of the [Jaffle Shop’s](https://github.com/dbt-labs/jaffle_shop) `orders` model using the following code: ```sql select diff --git a/website/docs/sql-reference/aggregate-functions/sql-sum.md b/website/docs/sql-reference/aggregate-functions/sql-sum.md index d6ca00c2daa..494a3863ad3 100644 --- a/website/docs/sql-reference/aggregate-functions/sql-sum.md +++ b/website/docs/sql-reference/aggregate-functions/sql-sum.md @@ -27,6 +27,8 @@ Let’s take a look at a practical example using the SUM function below. ### SUM example +The following example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop): + ```sql select customer_id, @@ -36,10 +38,6 @@ group by 1 limit 3 ``` -:::note What dataset is this? -This example is querying from a sample dataset created by dbt Labs called [jaffle_shop](https://github.com/dbt-labs/jaffle_shop). -::: - This simple query is returning the summed amount of all orders for a customer in the Jaffle Shop’s `orders` table: | customer_id | all_orders_amount | diff --git a/website/docs/terms/data-extraction.md b/website/docs/terms/data-extraction.md index bc37b68cf66..52148a35421 100644 --- a/website/docs/terms/data-extraction.md +++ b/website/docs/terms/data-extraction.md @@ -37,7 +37,7 @@ Obviously, the type of business you work for and the systems your team uses will The data that is typically extracted and loaded in your data warehouse is data that business users will need for baseline reporting, OKR measurement, or other analytics. :::tip Don’t fix what’s not broken -As we just said, there are usually common data sources that data teams will extract from, regardless of business. Instead of writing transformations for these tables and data sources, leverage [dbt packages](https://hub.getdbt.com/) to save yourself some carpal tunnel and use the work someone else has already done for you :) +As we just said, there are usually common data sources that data teams will extract from, regardless of business. Instead of writing transformations for these tables and data sources, leverage [dbt packages](https://hub.getdbt.com/) to save yourself some carpal tunnel and use the work someone else has already done for you. ::: ## Data extraction tools diff --git a/website/docs/terms/table.md b/website/docs/terms/table.md index cbe36ec1315..bfc4e680660 100644 --- a/website/docs/terms/table.md +++ b/website/docs/terms/table.md @@ -5,9 +5,6 @@ description: "Read this guide to understand how tables work in dbt." displayText: table hoverSnippet: In simplest terms, a table is the direct storage of data in rows and columns. 
Think excel sheet with raw values in each of the cells. --- -:::important This page could use some love -This term would benefit from additional depth and examples. Have knowledge to contribute? [Create an issue in the docs.getdbt.com repository](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose) to begin the process of becoming a glossary contributor! -::: In simplest terms, a table is the direct storage of data in rows and columns. Think excel sheet with raw values in each of the cells. diff --git a/website/sidebars.js b/website/sidebars.js index 9085baac54a..5d950f43c26 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -212,6 +212,7 @@ const sidebarSettings = { "docs/core/connect-data-platform/decodable-setup", "docs/core/connect-data-platform/upsolver-setup", "docs/core/connect-data-platform/starrocks-setup", + "docs/core/connect-data-platform/extrica-setup", ], }, ], @@ -323,6 +324,7 @@ const sidebarSettings = { link: { type: "doc", id: "docs/build/metrics-overview" }, items: [ "docs/build/metrics-overview", + "docs/build/conversion", "docs/build/cumulative", "docs/build/derived", "docs/build/ratio", @@ -1028,6 +1030,8 @@ const sidebarSettings = { id: "best-practices/how-we-build-our-metrics/semantic-layer-1-intro", }, items: [ + "best-practices/how-we-build-our-metrics/semantic-layer-1-intro", + "best-practices/how-we-build-our-metrics/semantic-layer-2-setup", "best-practices/how-we-build-our-metrics/semantic-layer-3-build-semantic-models", "best-practices/how-we-build-our-metrics/semantic-layer-4-build-metrics", "best-practices/how-we-build-our-metrics/semantic-layer-5-refactor-a-mart", @@ -1045,6 +1049,7 @@ const sidebarSettings = { items: [ "best-practices/how-we-mesh/mesh-2-structures", "best-practices/how-we-mesh/mesh-3-implementation", + "best-practices/how-we-mesh/mesh-4-faqs", ], }, { diff --git a/website/snippets/_cloud-environments-info.md b/website/snippets/_cloud-environments-info.md index 6e096b83750..2083d8f07ec 100644 --- a/website/snippets/_cloud-environments-info.md +++ b/website/snippets/_cloud-environments-info.md @@ -1,4 +1,3 @@ - ## Types of environments In dbt Cloud, there are two types of environments: @@ -34,24 +33,6 @@ Both development and deployment environments have a section called **General Set - If you select a current version with `(latest)` in the name, your environment will automatically install the latest stable version of the minor version selected. ::: -### Git repository caching - -At the start of every job run, dbt Cloud clones the project's Git repository so it has the latest versions of your project's code and runs `dbt deps` to install your dependencies. - -For improved reliability and performance on your job runs, you can enable dbt Cloud to keep a cache of the project's Git repository. So, if there's a third-party outage that causes the cloning operation to fail, dbt Cloud will instead use the cached copy of the repo so your jobs can continue running as scheduled. - -dbt Cloud caches your project's Git repo after each successful run and retains it for 8 days if there are no repo updates. It caches all packages regardless of installation method and does not fetch code outside of the job runs. - -To enable Git repository caching, select **Account settings** from the gear menu and enable the **Repository caching** option. - - - -:::note - -This feature is only available on the dbt Cloud Enterprise plan. 
- -::: - ### Custom branch behavior By default, all environments will use the default branch in your repository (usually the `main` branch) when accessing your dbt code. This is overridable within each dbt Cloud Environment using the **Default to a custom branch** option. This setting will have slightly different behavior depending on the environment type: @@ -65,7 +46,7 @@ For more info, check out this [FAQ page on this topic](/faqs/Environments/custom ### Extended attributes :::note -Extended attributes are retrieved and applied only at runtime when `profiles.yml` is requested for a specific Cloud run. Extended attributes are currently _not_ taken into consideration for Cloud-specific features such as PrivateLink or SSH Tunneling that do not rely on `profiles.yml` values. +Extended attributes are retrieved and applied only at runtime when `profiles.yml` is requested for a specific Cloud run. Extended attributes are currently _not_ taken into consideration for SSH Tunneling, which does not rely on `profiles.yml` values. ::: Extended Attributes is a feature that allows users to set a flexible [profiles.yml](/docs/core/connect-data-platform/profiles.yml) snippet in their dbt Cloud Environment settings. It provides users with more control over environments (both deployment and development) and extends how dbt Cloud connects to the data platform within a given environment. @@ -92,3 +73,39 @@ schema: dbt_alice threads: 4 ``` +### Git repository caching + +At the start of every job run, dbt Cloud clones the project's Git repository so it has the latest versions of your project's code and runs `dbt deps` to install your dependencies. + +For improved reliability and performance on your job runs, you can enable dbt Cloud to keep a cache of the project's Git repository. So, if there's a third-party outage that causes the cloning operation to fail, dbt Cloud will instead use the cached copy of the repo so your jobs can continue running as scheduled. + +dbt Cloud caches your project's Git repo after each successful run and retains it for 8 days if there are no repo updates. It caches all packages regardless of installation method and does not fetch code outside of the job runs. + +dbt Cloud will use the cached copy of your project's Git repo under these circumstances: + +- Outages from third-party services (for example, the [dbt package hub](https://hub.getdbt.com/)). +- Git authentication fails. +- There are syntax errors in the `packages.yml` file. You can set up and use [continuous integration (CI)](/docs/deploy/continuous-integration) to find these errors sooner. +- If a package doesn't work with the current dbt version. You can set up and use [continuous integration (CI)](/docs/deploy/continuous-integration) to identify this issue sooner. + +To enable Git repository caching, select **Account settings** from the gear menu and enable the **Repository caching** option. + + + +:::note + +This feature is only available on the dbt Cloud Enterprise plan. + +::: + +### Partial parsing + +At the start of every dbt invocation, dbt reads all the files in your project, extracts information, and constructs an internal manifest containing every object (model, source, macro, and so on). Among other things, it uses the `ref()`, `source()`, and `config()` macro calls within models to set properties, infer dependencies, and construct your project's DAG. When dbt finishes parsing your project, it stores the internal manifest in a file called `partial_parse.msgpack`.
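For comparison, the same mechanism can be toggled when running dbt Core locally — a minimal sketch, assuming the standard `partial_parse` setting in `profiles.yml` (the `--no-partial-parse` CLI flag works the same way):

```yaml
# profiles.yml
config:
  partial_parse: true # reuse target/partial_parse.msgpack from the previous invocation when possible
```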
+ +Parsing projects can be time-consuming, especially for large projects with hundreds of models and thousands of files. To reduce the time it takes dbt to parse your project, use the partial parsing feature in dbt Cloud for your environment. When enabled, dbt Cloud uses the `partial_parse.msgpack` file to determine which files have changed (if any) since the project was last parsed, and then it parses _only_ the changed files and the files related to those changes. + +Partial parsing in dbt Cloud requires dbt version 1.4 or newer. The feature does have some known limitations. Refer to [Known limitations](/reference/parsing#known-limitations) to learn more about them. + +To enable, select **Account settings** from the gear menu and enable the **Partial parsing** option. + + \ No newline at end of file diff --git a/website/snippets/_enterprise-permissions-table.md b/website/snippets/_enterprise-permissions-table.md index 3eb313e0f5b..3303b167307 100644 --- a/website/snippets/_enterprise-permissions-table.md +++ b/website/snippets/_enterprise-permissions-table.md @@ -78,12 +78,12 @@ The project roles enable you to work within the projects in various capacities. | Credentials | W | W | W | W | R | W | | | | R | R | | | Custom env. variables | W | W | W | W | W | W | R | | | R | W | | | dbt adapters | W | W | W | W | R | W | | | | R | R | | -| Develop (IDE or dbt Cloud CLI) | W | W | | W | | | | | | | | | +| Develop
(IDE or dbt Cloud CLI) | W | W | | W | | | | | | | | | | Environments | W | R | R | R | R | W | R | | | R | R | | | Jobs | W | R | R | W | R | W | R | | | R | R | | | Metadata | R | R | R | R | R | R | R | R | | R | R | | -| Permissions | W | | R | R | R | | | | | | W | | -| Profile | W | R | W | R | R | R | | | | R | R | | +| Permissions (Groups & Licenses) | W | | R | R | R | | | | | | W | | +| Profile (Credentials) | W | R | W | R | R | R | | | | R | R | | | Projects | W | W | W | W | W | R | R | | | R | W | | | Repositories | W | | R | R | W | | | | | R | R | | | Runs | W | R | R | W | R | W | R | | | R | R | | diff --git a/website/snippets/_new-sl-setup.md index a02481db33d..a93f233d09c 100644 --- a/website/snippets/_new-sl-setup.md +++ b/website/snippets/_new-sl-setup.md @@ -1,14 +1,12 @@ You can set up the dbt Semantic Layer in dbt Cloud at the environment and project level. Before you begin: -- You must have a dbt Cloud Team or Enterprise account. Suitable for both Multi-tenant and Single-tenant deployment. - - Single-tenant accounts should contact their account representative for necessary setup and enablement. - You must be part of the Owner group, and have the correct [license](/docs/cloud/manage-access/seats-and-users) and [permissions](/docs/cloud/manage-access/self-service-permissions) to configure the Semantic Layer: * Enterprise plan — Developer license with Account Admin permissions. Or Owner with a Developer license, assigned Project Creator, Database Admin, or Admin permissions. * Team plan — Owner with a Developer license. - You must have a successful run in your new environment. :::tip -If you've configured the legacy Semantic Layer, it has been deprecated, and dbt Labs strongly recommends that you [upgrade your dbt version](/docs/dbt-versions/upgrade-core-in-cloud) to dbt version 1.6 or higher to use the latest dbt Semantic Layer. Refer to the dedicated [migration guide](/guides/sl-migration) for details. +If you've configured the legacy Semantic Layer, it has been deprecated. dbt Labs strongly recommends that you [upgrade your dbt version](/docs/dbt-versions/upgrade-core-in-cloud) to dbt version 1.6 or higher to use the latest dbt Semantic Layer. Refer to the dedicated [migration guide](/guides/sl-migration) for details. ::: 1. In dbt Cloud, create a new [deployment environment](/docs/deploy/deploy-environments#create-a-deployment-environment) or use an existing environment on dbt 1.6 or higher. @@ -20,7 +18,10 @@ If you've configured the legacy Semantic Layer, it has been deprecated, and dbt -4. In the **Set Up Semantic Layer Configuration** page, enter the credentials you want the Semantic Layer to use specific to your data platform. We recommend credentials have the least privileges required because your Semantic Layer users will be querying it in downstream applications. At a minimum, the Semantic Layer needs to have read access to the schema(s) that contains the dbt models that you used to build your semantic models. +4. In the **Set Up Semantic Layer Configuration** page, enter the credentials you want the Semantic Layer to use specific to your data platform. + + - Use credentials with minimal privileges. This is because the Semantic Layer requires read access to the schema(s) containing the dbt models used in your semantic models for downstream applications. + - Note, [Environment variables](/docs/build/environment-variables) such as `{{env_var('DBT_WAREHOUSE')}}` aren't supported in the dbt Semantic Layer yet. 
You must use the actual credentials. @@ -28,13 +29,10 @@ If you've configured the legacy Semantic Layer, it has been deprecated, and dbt 6. After saving it, you'll be provided with the connection information that allows you to connect to downstream tools. If your tool supports JDBC, save the JDBC URL or individual components (like environment id and host). If it uses the GraphQL API, save the GraphQL API host information instead. - + 7. Save and copy your environment ID, service token, and host, which you'll need to use downstream tools. For more info on how to integrate with partner integrations, refer to [Available integrations](/docs/use-dbt-semantic-layer/avail-sl-integrations). 8. Return to the **Project Details** page, then select **Generate Service Token**. You will need Semantic Layer Only and Metadata Only [service token](/docs/dbt-cloud-apis/service-tokens) permissions. - - -Great job, you've configured the Semantic Layer 🎉! - +Great job, you've configured the Semantic Layer 🎉! diff --git a/website/snippets/_packages_or_dependencies.md b/website/snippets/_packages_or_dependencies.md index 5cc4c67e63c..61014bc2b1a 100644 --- a/website/snippets/_packages_or_dependencies.md +++ b/website/snippets/_packages_or_dependencies.md @@ -12,7 +12,7 @@ There are some important differences between Package dependencies and Project de -Project dependencies are designed for the [dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro) and [cross-project reference](/docs/collaborate/govern/project-dependencies#how-to-use-ref) workflow: +Project dependencies are designed for the [dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro) and [cross-project reference](/docs/collaborate/govern/project-dependencies#how-to-write-cross-project-ref) workflow: - Use `dependencies.yml` when you need to set up cross-project references between different dbt projects, especially in a dbt Mesh setup. - Use `dependencies.yml` when you want to include both projects and non-private dbt packages in your project's dependencies. diff --git a/website/snippets/_privatelink-hostname-restriction.md b/website/snippets/_privatelink-hostname-restriction.md new file mode 100644 index 00000000000..a4bcd318a15 --- /dev/null +++ b/website/snippets/_privatelink-hostname-restriction.md @@ -0,0 +1,5 @@ +:::caution Environment variables + +Using [Environment variables](/docs/build/environment-variables) when configuring PrivateLink endpoints isn't supported in dbt Cloud. Instead, use [Extended Attributes](/docs/deploy/deploy-environments#extended-attributes) to dynamically change these values in your dbt Cloud environment. + +::: diff --git a/website/snippets/_sl-define-metrics.md b/website/snippets/_sl-define-metrics.md index af3ee9f297f..fe169b4a5b4 100644 --- a/website/snippets/_sl-define-metrics.md +++ b/website/snippets/_sl-define-metrics.md @@ -1,6 +1,6 @@ Now that you've created your first semantic model, it's time to define your first metric! You can define metrics with the dbt Cloud IDE or command line. -MetricFlow supports different metric types like [simple](/docs/build/simple), [ratio](/docs/build/ratio), [cumulative](/docs/build/cumulative), and [derived](/docs/build/derived). It's recommended that you read the [metrics overview docs](/docs/build/metrics-overview) before getting started. +MetricFlow supports different metric types like [conversion](/docs/build/conversion), [simple](/docs/build/simple), [ratio](/docs/build/ratio), [cumulative](/docs/build/cumulative), and [derived](/docs/build/derived). 
It's recommended that you read the [metrics overview docs](/docs/build/metrics-overview) before getting started. 1. You can define metrics in the same YAML files as your semantic models or create a new file. If you want to create your metrics in a new file, create another directory called `/models/metrics`. The file structure for metrics can become more complex from here if you need to further organize your metrics, for example, by data source or business line. diff --git a/website/snippets/_upgrade-move.md b/website/snippets/_upgrade-move.md deleted file mode 100644 index 7572077fd1b..00000000000 --- a/website/snippets/_upgrade-move.md +++ /dev/null @@ -1,5 +0,0 @@ -:::important Upgrade Guides Are Moving - -The location of the dbt Core upgrade guides has changed, and they will soon be removed from `Guides`. The new location is in the `Docs` tab under `Available dbt versions`. You have been redirected to the new URL, so please update any saved links and bookmarks. - -::: \ No newline at end of file diff --git a/website/snippets/_v2-sl-prerequisites.md b/website/snippets/_v2-sl-prerequisites.md index 99d8a945db6..18f228ad3fe 100644 --- a/website/snippets/_v2-sl-prerequisites.md +++ b/website/snippets/_v2-sl-prerequisites.md @@ -7,4 +7,4 @@ - Set up the [Semantic Layer API](/docs/dbt-cloud-apis/sl-api-overview) in the integrated tool to import metric definitions. - dbt Core or Developer accounts can define metrics but won't be able to dynamically query them.
- Understand [MetricFlow's](/docs/build/about-metricflow) key concepts, which powers the latest dbt Semantic Layer. -- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections, [PrivateLink](/docs/cloud/secure/about-privatelink), and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) doesn't supported the dbt Semantic Layer yet. +- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections, [PrivateLink](/docs/cloud/secure/about-privatelink), and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) doesn't support the dbt Semantic Layer yet. diff --git a/website/snippets/available-beta-banner.md b/website/snippets/available-beta-banner.md deleted file mode 100644 index 15d365a84b1..00000000000 --- a/website/snippets/available-beta-banner.md +++ /dev/null @@ -1,3 +0,0 @@ -:::info Beta feature -This feature is currently in beta and subject to change. If you are interested in getting access to the beta, please [contact us](mailto:support@getdbt.com). -::: diff --git a/website/snippets/available-prerelease-banner.md b/website/snippets/available-prerelease-banner.md deleted file mode 100644 index 3531a2f646f..00000000000 --- a/website/snippets/available-prerelease-banner.md +++ /dev/null @@ -1,7 +0,0 @@ -:::info Release candidate -dbt Core v1.2 is now available as a **release candidate**. - -For more information on prereleases, see ["About Core versions: Trying prereleases"](core-versions#trying-prereleases). - -Join the [#dbt-prereleases](https://getdbt.slack.com/archives/C016X6ABVUK) channel in the Community Slack so you can be the first to read about prereleases as soon as they're available! -::: diff --git a/website/snippets/quickstarts/schedule-a-job.md b/website/snippets/quickstarts/schedule-a-job.md index ab8f4350dbf..70848388f35 100644 --- a/website/snippets/quickstarts/schedule-a-job.md +++ b/website/snippets/quickstarts/schedule-a-job.md @@ -35,9 +35,9 @@ As the `jaffle_shop` business gains more customers, and those customers create m 8. Click the run and watch its progress under "Run history." 9. Once the run is complete, click **View Documentation** to see the docs for your project. -:::tip + Congratulations 🎉! You've just deployed your first dbt project! -::: + #### FAQs diff --git a/website/snippets/sl-considerations-banner.md b/website/snippets/sl-considerations-banner.md deleted file mode 100644 index 33cfb5edac5..00000000000 --- a/website/snippets/sl-considerations-banner.md +++ /dev/null @@ -1,8 +0,0 @@ -:::caution Considerations - -Some important considerations to know about using the dbt Semantic Layer during the Public Preview: - -- Support for Snowflake data platform only (_additional data platforms coming soon_) -- Support for the deployment environment only (_development experience coming soon_) - -::: diff --git a/website/snippets/test-snippet.md b/website/snippets/test-snippet.md deleted file mode 100644 index c1de326aa7a..00000000000 --- a/website/snippets/test-snippet.md +++ /dev/null @@ -1,8 +0,0 @@ ---- ---- - -### Header 2 - -Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam fermentum porttitor dui, id scelerisque enim scelerisque at. Proin imperdiet sem sed magna ornare, sit amet rutrum ligula vehicula. Aenean eget magna placerat, dictum velit sed, dapibus quam. Maecenas lectus tellus, dictum semper gravida vel, feugiat vitae nibh. Vestibulum nec lorem nibh. 
Fusce nisi felis, tincidunt ac scelerisque ut, aliquam in eros. Praesent euismod dolor ac lacinia laoreet. Phasellus orci orci, scelerisque vitae mollis id, consectetur ut libero. Aenean diam leo, tempor ut vulputate in, laoreet id ipsum. Quisque gravida et ex id eleifend. Etiam ultricies erat diam. Morbi hendrerit, ligula non aliquam tempus, erat elit suscipit quam, eu cursus quam nisi sit amet dui. Cras iaculis risus vel enim tempor molestie. - -Curabitur a porttitor odio. Curabitur sit amet tristique ante. Ut eleifend, erat eget imperdiet accumsan, quam diam sodales dolor, vulputate consequat lacus felis non sapien. Nam et nunc sed diam congue rutrum nec non massa. Nam eget fermentum sem. Nam ac imperdiet massa. Phasellus a elementum dui. diff --git a/website/snippets/tutorial-create-new-dbt-cloud-account.md b/website/snippets/tutorial-create-new-dbt-cloud-account.md deleted file mode 100644 index bdde874d0c9..00000000000 --- a/website/snippets/tutorial-create-new-dbt-cloud-account.md +++ /dev/null @@ -1,10 +0,0 @@ -Let's start this section by creating a dbt Cloud account if you haven't already. - -1. Navigate to [dbt Cloud](https://cloud.getdbt.com). -2. If you don't have a dbt Cloud account, create a new one, and verify your account via email. -3. If you already have a dbt Cloud account, you can create a new project from your existing account: - 1. Click the gear icon in the top-right, then click **Projects**. - 2. Click **+ New Project**. -4. You've arrived at the "Setup a New Project" page. -5. Type "Analytics" in the dbt Project Name field. You will be able to rename this project later. -6. Click **Continue**. \ No newline at end of file diff --git a/website/snippets/tutorial-initiate-project.md b/website/snippets/tutorial-initiate-project.md deleted file mode 100644 index 008b6bdf487..00000000000 --- a/website/snippets/tutorial-initiate-project.md +++ /dev/null @@ -1,44 +0,0 @@ -Now that you have a repository configured, you can initialize your project and start development in dbt Cloud: - -1. Click **Develop** from the upper left. It might take a few minutes for your project to spin up for the first time as it establishes your git connection, clones your repo, and tests the connection to the warehouse. -2. Above the file tree to the left, click **Initialize your project**. This builds out your folder structure with example models. -3. Make your initial commit by clicking **Commit**. Use the commit message `initial commit`. This creates the first commit to your managed repo and allows you to open a branch where you can add new dbt code. -4. Now you should be able to **directly query data from your warehouse** and **execute dbt run**. Paste your following warehouse-specific code in the IDE: - - - -
- -```sql -select * from `dbt-tutorial.jaffle_shop.customers` -``` - -
- -
- -```sql -select * from default.jaffle_shop_customers -``` - -
- -
- -```sql -select * from jaffle_shop.customers -``` - -
- -
- -```sql -select * from raw.jaffle_shop.customers -``` - -
- -
- -- In the command line bar at the bottom, type `dbt run` and click **Enter**. We will explore what happens in the next section of the tutorial. diff --git a/website/src/components/detailsToggle/index.js b/website/src/components/detailsToggle/index.js index ba53192e54b..514ca52ba13 100644 --- a/website/src/components/detailsToggle/index.js +++ b/website/src/components/detailsToggle/index.js @@ -3,33 +3,50 @@ import styles from './styles.module.css'; function detailsToggle({ children, alt_header = null }) { const [isOn, setOn] = useState(false); - const [hoverActive, setHoverActive] = useState(true); + const [isScrolling, setIsScrolling] = useState(false); // New state to track scrolling const [hoverTimeout, setHoverTimeout] = useState(null); const handleToggleClick = () => { - setHoverActive(true); // Disable hover when clicked setOn(current => !current); // Toggle the current state -}; - -const handleMouseEnter = () => { - if (isOn) return; // Ignore hover if already open - setHoverActive(true); // Enable hover - const timeout = setTimeout(() => { - if (hoverActive) setOn(true); - }, 500); - setHoverTimeout(timeout); -}; - -const handleMouseLeave = () => { - if (!isOn) { + }; + + const handleMouseEnter = () => { + if (isOn || isScrolling) return; // Ignore hover if already open or if scrolling + const timeout = setTimeout(() => { + if (!isScrolling) setOn(true); + }, 700); // + setHoverTimeout(timeout); + }; + + const handleMouseLeave = () => { + if (!isOn) { clearTimeout(hoverTimeout); setOn(false); - } -}; + } + }; + + const handleScroll = () => { + setIsScrolling(true); + clearTimeout(hoverTimeout); + //setOn(false); + + + // Reset scrolling state after a delay + setTimeout(() => { + setIsScrolling(false); + }, 800); + }; + + useEffect(() => { + window.addEventListener('scroll', handleScroll); + return () => { + window.removeEventListener('scroll', handleScroll); + }; + }, []); -useEffect(() => { - return () => clearTimeout(hoverTimeout); -}, [hoverTimeout]); + useEffect(() => { + return () => clearTimeout(hoverTimeout); + }, [hoverTimeout]); return (
@@ -40,7 +57,7 @@ useEffect(() => { onMouseLeave={handleMouseLeave} >   - {alt_header} + {alt_header} {/* Visual disclaimer */} Hover to view diff --git a/website/src/components/detailsToggle/styles.module.css b/website/src/components/detailsToggle/styles.module.css index 446d3197128..b3f4a4886dc 100644 --- a/website/src/components/detailsToggle/styles.module.css +++ b/website/src/components/detailsToggle/styles.module.css @@ -1,9 +1,11 @@ -:local(.link) { +:local(.link) :local(.headerText) { color: var(--ifm-link-color); - transition: background-color 0.3s; /* Smooth transition for background color */ + text-decoration: none; + transition: text-decoration 0.3s; /* Smooth transition */ } -:local(.link:hover), :local(.link:focus) { +:local(.link:hover) :local(.headerText), +:local(.link:focus) :local(.headerText) { text-decoration: underline; cursor: pointer; } @@ -12,6 +14,7 @@ font-size: 0.8em; color: #666; margin-left: 10px; /* Adjust as needed */ + text-decoration: none; } :local(.toggle) { @@ -24,6 +27,7 @@ width: 1.25rem; vertical-align: middle; transition: transform 0.3s; /* Smooth transition for toggle icon */ + } :local(.toggleUpsideDown) { diff --git a/website/src/components/faqs/index.js b/website/src/components/faqs/index.js index 52c4573d883..0741a29cd89 100644 --- a/website/src/components/faqs/index.js +++ b/website/src/components/faqs/index.js @@ -3,10 +3,10 @@ import styles from './styles.module.css'; import { usePluginData } from '@docusaurus/useGlobalData'; function FAQ({ path, alt_header = null }) { - const [isOn, setOn] = useState(false); - const [filePath, setFilePath] = useState(path) - const [fileContent, setFileContent] = useState({}) + const [filePath, setFilePath] = useState(path); + const [fileContent, setFileContent] = useState({}); + const [hoverTimeout, setHoverTimeout] = useState(null); // Get all faq file paths from plugin const { faqFiles } = usePluginData('docusaurus-build-global-data-plugin'); @@ -37,24 +37,45 @@ function FAQ({ path, alt_header = null }) { } }, [filePath]) - const toggleOn = function () { - setOn(!isOn); + const handleMouseEnter = () => { + setHoverTimeout(setTimeout(() => { + setOn(true); + }, 500)); + }; + + const handleMouseLeave = () => { + if (!isOn) { + clearTimeout(hoverTimeout); + setOn(false); } +}; + + useEffect(() => { + return () => { + if (hoverTimeout) { + clearTimeout(hoverTimeout); + } + }; + }, [hoverTimeout]); + + const toggleOn = () => { + if (hoverTimeout) { + clearTimeout(hoverTimeout); + } + setOn(!isOn); + }; return ( -
+
- -   - {alt_header || fileContent?.meta && fileContent.meta.title} - -
- {fileContent?.contents && fileContent.contents} + + {alt_header || (fileContent?.meta && fileContent.meta.title)} + Hover to view + +
+ {fileContent?.contents}
-
+
); } diff --git a/website/src/components/faqs/styles.module.css b/website/src/components/faqs/styles.module.css index e19156a3a7b..c179aa85cdc 100644 --- a/website/src/components/faqs/styles.module.css +++ b/website/src/components/faqs/styles.module.css @@ -1,9 +1,12 @@ -:local(.link) { +:local(.link) :local(.headerText) { color: var(--ifm-link-color); + text-decoration: none; + transition: text-decoration 0.3s; /* Smooth transition */ } -:local(.link:hover) { +:local(.link:hover) :local(.headerText), +:local(.link:focus) :local(.headerText) { text-decoration: underline; cursor: pointer; } @@ -24,6 +27,13 @@ filter: invert(1); } +:local(.disclaimer) { + font-size: 0.8em; + color: #666; + margin-left: 10px; /* Adjust as needed */ + text-decoration: none; +} + :local(.body) { margin-left: 2em; margin-bottom: 10px; diff --git a/website/src/components/lightbox/index.js b/website/src/components/lightbox/index.js index 1c748bbb04f..a846c51b150 100644 --- a/website/src/components/lightbox/index.js +++ b/website/src/components/lightbox/index.js @@ -1,34 +1,65 @@ -import React from 'react'; +import React, { useState, useEffect } from 'react'; import styles from './styles.module.css'; import imageCacheWrapper from '../../../functions/image-cache-wrapper'; -function Lightbox({ - src, - collapsed = false, - alignment = "center", - alt = undefined, - title = undefined, - width = undefined, -}) { - - // Set alignment class if alignment prop used - let imageAlignment = '' - if(alignment === "left") { - imageAlignment = styles.leftAlignLightbox - } else if(alignment === "right") { - imageAlignment = styles.rightAlignLightbox - } +function Lightbox({ src, collapsed = false, alignment = "center", alt = undefined, title = undefined, width = undefined }) { + const [isHovered, setIsHovered] = useState(false); + const [expandImage, setExpandImage] = useState(false); + const [isScrolling, setIsScrolling] = useState(false); + + useEffect(() => { + let timeoutId; + if (isHovered && !isScrolling) { + timeoutId = setTimeout(() => { + setExpandImage(true); + }, 300); + } + return () => clearTimeout(timeoutId); + }, [isHovered, isScrolling]); + + const handleMouseEnter = () => { + setTimeout(() => { + if (!isScrolling) { + setIsHovered(true); + } + }, 300); + }; + + const handleMouseLeave = () => { + setIsHovered(false); + setExpandImage(false); + }; + + const handleScroll = () => { + setIsScrolling(true); + setExpandImage(false); + + setTimeout(() => { + setIsScrolling(false); + }, 300); // Delay to reset scrolling state + }; + + useEffect(() => { + window.addEventListener('scroll', handleScroll); + return () => { + window.removeEventListener('scroll', handleScroll); + }; + }, []); return ( <> - @@ -37,13 +68,14 @@ function Lightbox({ alt={alt ? alt : title ? title : ''} title={title ? title : ''} src={imageCacheWrapper(src)} + style={expandImage ? { transform: 'scale(1.2)', transition: 'transform 0.5s ease', zIndex: '9999' } : {}} /> {title && ( { title } )} - +
); } diff --git a/website/src/components/lightbox/styles.module.css b/website/src/components/lightbox/styles.module.css index af0bb086cf5..1f50a2f0427 100644 --- a/website/src/components/lightbox/styles.module.css +++ b/website/src/components/lightbox/styles.module.css @@ -10,7 +10,7 @@ margin: 10px auto; padding-right: 10px; display: block; - max-width: 400px; + max-width: 80%; } :local(.collapsed) { @@ -24,3 +24,9 @@ .rightAlignLightbox { margin: 10px 0 10px auto; } + +:local(.hovered) { + filter: drop-shadow(4px 4px 6px #aaaaaae1); + transition: transform 0.3s ease; + z-index: 9999; +} diff --git a/website/static/img/blog/2024-01-09-defer-in-development/defer-toggle.png b/website/static/img/blog/2024-01-09-defer-in-development/defer-toggle.png new file mode 100644 index 00000000000..7161dc68b93 Binary files /dev/null and b/website/static/img/blog/2024-01-09-defer-in-development/defer-toggle.png differ diff --git a/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-defer.png b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-defer.png new file mode 100644 index 00000000000..7ec96a7b598 Binary files /dev/null and b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-defer.png differ diff --git a/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-full.png b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-full.png new file mode 100644 index 00000000000..4381a13abed Binary files /dev/null and b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-full.png differ diff --git a/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-mixed.png b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-mixed.png new file mode 100644 index 00000000000..1020c3b65f0 Binary files /dev/null and b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-mixed.png differ diff --git a/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-model-c.png b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-model-c.png new file mode 100644 index 00000000000..3f48255ac12 Binary files /dev/null and b/website/static/img/blog/2024-01-09-defer-in-development/prod-and-dev-model-c.png differ diff --git a/website/static/img/blog/2024-01-09-defer-in-development/prod-environment-plain.png b/website/static/img/blog/2024-01-09-defer-in-development/prod-environment-plain.png new file mode 100644 index 00000000000..5c2860411ec Binary files /dev/null and b/website/static/img/blog/2024-01-09-defer-in-development/prod-environment-plain.png differ diff --git a/website/static/img/blog/2024-01-09-defer-in-development/willem.png b/website/static/img/blog/2024-01-09-defer-in-development/willem.png new file mode 100644 index 00000000000..bd38e9b0bd4 Binary files /dev/null and b/website/static/img/blog/2024-01-09-defer-in-development/willem.png differ diff --git a/website/static/img/blog/authors/ejohnston.png b/website/static/img/blog/authors/ejohnston.png new file mode 100644 index 00000000000..09fc4ed7ba3 Binary files /dev/null and b/website/static/img/blog/authors/ejohnston.png differ diff --git a/website/static/img/blog/serverless-free-tier-data-stack-with-dlt-and-dbt-core/architecture_diagram.png b/website/static/img/blog/serverless-free-tier-data-stack-with-dlt-and-dbt-core/architecture_diagram.png new file mode 100644 index 00000000000..ad10d32c2e7 Binary files /dev/null and 
b/website/static/img/blog/serverless-free-tier-data-stack-with-dlt-and-dbt-core/architecture_diagram.png differ diff --git a/website/static/img/blog/serverless-free-tier-data-stack-with-dlt-and-dbt-core/map_screenshot.png b/website/static/img/blog/serverless-free-tier-data-stack-with-dlt-and-dbt-core/map_screenshot.png new file mode 100644 index 00000000000..da8309c2510 Binary files /dev/null and b/website/static/img/blog/serverless-free-tier-data-stack-with-dlt-and-dbt-core/map_screenshot.png differ diff --git a/website/static/img/docs/dbt-cloud/semantic-layer/conversion-metrics-fill-null.png b/website/static/img/docs/dbt-cloud/semantic-layer/conversion-metrics-fill-null.png new file mode 100644 index 00000000000..0fd5e206ba7 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/semantic-layer/conversion-metrics-fill-null.png differ diff --git a/website/static/img/docs/deploy/example-account-settings.png b/website/static/img/docs/deploy/example-account-settings.png new file mode 100644 index 00000000000..12b8d9bc49f Binary files /dev/null and b/website/static/img/docs/deploy/example-account-settings.png differ diff --git a/website/static/img/docs/deploy/example-repo-caching.png b/website/static/img/docs/deploy/example-repo-caching.png deleted file mode 100644 index 805d845dccb..00000000000 Binary files a/website/static/img/docs/deploy/example-repo-caching.png and /dev/null differ diff --git a/website/vercel.json b/website/vercel.json index 35799e24061..1e4cc2fb021 100644 --- a/website/vercel.json +++ b/website/vercel.json @@ -2,6 +2,11 @@ "cleanUrls": true, "trailingSlash": false, "redirects": [ + { + "source": "/reference/profiles.yml", + "destination": "/docs/core/connect-data-platform/profiles.yml", + "permanent": true + }, { "source": "/docs/cloud/dbt-cloud-ide/dbt-cloud-tips", "destination": "/docs/build/dbt-tips", @@ -3633,8 +3638,8 @@ "permanent": true }, { - "source": "/docs/writing-code-in-dbt/jinja-context/as_text", - "destination": "/reference/dbt-jinja-functions/as_text", + "source": "/reference/dbt-jinja-functions/as_text", + "destination": "/reference/dbt-jinja-functions", "permanent": true }, { @@ -3847,11 +3852,6 @@ "destination": "/dbt-cloud/api", "permanent": true }, - { - "source": "/reference/data-test-configs", - "destination": "/reference/test-configs", - "permanent": true - }, { "source": "/reference/declaring-properties", "destination": "/reference/configs-and-properties",