Merge branch 'current' into staging
matthewshaver authored Apr 12, 2024
2 parents 38dff43 + d41f914 commit 39a6163
Showing 11 changed files with 121 additions and 59 deletions.
24 changes: 24 additions & 0 deletions website/docs/docs/build/python-models.md
@@ -630,6 +630,30 @@ def model(dbt, session):
)
```
<VersionBlock firstVersion="1.8">
**External access integrations and secrets**: To query external APIs within dbt Python models, use Snowflake’s [external access](https://docs.snowflake.com/en/developer-guide/external-network-access/external-network-access-overview) together with [secrets](https://docs.snowflake.com/en/developer-guide/external-network-access/secret-api-reference). Here are some additional configurations you can use:
```python
import pandas
import snowflake.snowpark as snowpark

def model(dbt, session: snowpark.Session):
    dbt.config(
        materialized="table",
        # Maps a variable name (usable in this model) to the Snowflake secret object
        secrets={"secret_variable_name": "test_secret"},
        # Grants this model access to the named external access integration
        external_access_integrations=["test_external_access_integration"],
    )
    # The _snowflake module is only available inside Snowflake's Python runtime
    import _snowflake
    return session.create_dataframe(
        pandas.DataFrame(
            [{"secret_value": _snowflake.get_generic_secret_string('secret_variable_name')}]
        )
    )
```
</VersionBlock>
**About "sprocs":** dbt submits Python models to run as _stored procedures_, which some people call _sprocs_ for short. By default, dbt will create a named sproc containing your model's compiled Python code, and then _call_ it to execute. Snowpark has an Open Preview feature for _temporary_ or _anonymous_ stored procedures ([docs](https://docs.snowflake.com/en/sql-reference/sql/call-with.html)), which are faster and leave a cleaner query history. You can switch this feature on for your models by configuring `use_anonymous_sproc: True`. We plan to switch this on for all dbt + Snowpark Python models starting with the release of dbt Core version 1.4.
<File name='dbt_project.yml'>
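```yaml
# The original file contents are truncated in this view. A minimal sketch of
# the configuration described in the paragraph above; the project name is
# hypothetical:
models:
  my_dbt_project:
    +use_anonymous_sproc: True
```

</File>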
@@ -12,7 +12,7 @@ dbt Cloud is [hosted](/docs/cloud/about-cloud/architecture) in multiple regions

| Region | Location | Access URL | IP addresses | Developer plan | Team plan | Enterprise plan |
|--------|----------|------------|--------------|----------------|-----------|-----------------|
| North America [^1] | AWS us-east-1 (N. Virginia) | **Multi-tenant:** cloud.getdbt.com <br /> **Cell based:** ACCOUNT_PREFIX.us1.dbt.com | 52.45.144.63 <br /> 54.81.134.249 <br />52.22.161.231 <br />52.3.77.232 <br />3.214.191.130 <br />34.233.79.135 ||||
| EMEA [^1] | AWS eu-central-1 (Frankfurt) | emea.dbt.com | 3.123.45.39 <br /> 3.126.140.248 <br /> 3.72.153.148 ||||
| APAC [^1] | AWS ap-southeast-2 (Sydney)| au.dbt.com | 52.65.89.235 <br /> 3.106.40.33 <br /> 13.239.155.206 <br />||||
| Virtual Private dbt or Single tenant | Customized | Customized | Ask [Support](/community/resources/getting-help#dbt-cloud-support) for your IPs ||||
38 changes: 19 additions & 19 deletions website/docs/docs/cloud/manage-access/set-up-databricks-oauth.md
@@ -24,31 +24,31 @@ Current limitations:

### Configure Databricks OAuth (Databricks admin)

- To get started, you will need to [add dbt as an OAuth application](https://docs.databricks.com/en/integrations/configure-oauth-dbt.html) with Databricks, in 2 steps:
+ To get started, you will need to [add dbt as an OAuth application](https://docs.databricks.com/en/integrations/configure-oauth-dbt.html) with Databricks. There are two ways to configure this application (the CLI or the Databricks UI). Here's how to set it up in the Databricks UI:

- 1. From your terminal, [authenticate to the Databricks Account API](https://docs.databricks.com/en/integrations/configure-oauth-dbt.html#authenticate-to-the-account-api) with the Databricks CLI. You authenticate using:
- - OAuth for users ([prerequisites](https://docs.databricks.com/en/dev-tools/auth.html#oauth-u2m-auth))
- - OAuth for service principals ([prerequisites](https://docs.databricks.com/en/dev-tools/auth.html#oauth-m2m-auth))
- - Username and password (must be account admin)
- 2. In the same terminal, **add dbt Cloud as an OAuth application** using `curl` and the [OAuth Custom App Integration API](https://docs.databricks.com/api/account/customappintegration/create)
+ 1. Log in to the [account console](https://accounts.cloud.databricks.com/) and click the **Settings** icon in the sidebar.

- For the second step, you can use this example `curl` to authenticate with your username and password, replacing values as defined in the following table:
+ 2. On the **App connections** tab, click **Add connection**.

- ```shell
- curl -u USERNAME:PASSWORD https://accounts.cloud.databricks.com/api/2.0/accounts/ACCOUNT_ID/oauth2/custom-app-integrations -d '{"redirect_urls": ["https://YOUR_ACCESS_URL", "https://YOUR_ACCESS_URL/complete/databricks"], "confidential": true, "name": "NAME", "scopes": ["sql", "offline_access"]}'
- ```
+ 3. Enter the following details:
+    - A name for your connection.
+    - The redirect URLs for your OAuth connection, which you can find in the table later in this section.
+    - For **Access scopes**, the APIs the application should have access to:
+      - For BI applications, the SQL scope is required to allow the connected app to access Databricks SQL APIs (required for SQL models).
+      - For applications that need to access Databricks APIs for purposes other than querying, the ALL APIs scope is required (required if running Python models).
+    - The access token time-to-live (TTL) in minutes. Default: 60.
+    - The refresh token time-to-live (TTL) in minutes. Default: 10080.
+ 4. Select **Generate a client secret**. Copy and securely store the client secret. The client secret will not be available later.

- These parameters and descriptions will help you authenticate with your username and password:
+ You can use the following table to set up the redirect URLs for your application, replacing ACCOUNT_PREFIX with the cell 1 prefix for your region and INSTANCE_NAME with the custom name of your instance:

- | Parameter | Description |
+ | Region | Redirect URLs |
  | ------ | ----- |
- | **USERNAME** | Your Databricks username (account admin level) |
- | **PASSWORD** | Your Databricks password (account admin level) |
- | **ACCOUNT_ID** | Your Databricks [account ID](https://docs.databricks.com/en/administration-guide/account-settings/index.html#locate-your-account-id) |
- | **YOUR_ACCESS_URL** | The [appropriate Access URL](/docs/cloud/about-cloud/access-regions-ip-addresses) for your dbt Cloud account region and plan |
- | **NAME** | The integration name (i.e 'databricks-dbt-cloud')
- After running the `curl`, you'll get an API response that includes the `client_id` and `client_secret` required in the following section. At this time, this is the only way to retrieve the secret. If you lose the secret, then the integration needs to be [deleted](https://docs.databricks.com/api/account/customappintegration/delete) and re-created.
+ | **US multi-tenant** | https://cloud.getdbt.com/callback <br /> https://cloud.getdbt.com/complete/databricks |
+ | **US cell 1** | https://ACCOUNT_PREFIX.us1.dbt.com/callback <br /> https://ACCOUNT_PREFIX.us1.dbt.com/complete/databricks |
+ | **EMEA** | https://emea.dbt.com/callback <br /> https://emea.dbt.com/complete/databricks |
+ | **APAC** | https://au.dbt.com/callback <br /> https://au.dbt.com/complete/databricks |
+ | **Single tenant** | https://INSTANCE_NAME.getdbt.com/callback <br /> https://INSTANCE_NAME.getdbt.com/complete/databricks |


### Configure the Connection in dbt Cloud (dbt Cloud project admin)
13 changes: 6 additions & 7 deletions website/docs/docs/collaborate/git/version-control-basics.md
@@ -54,24 +54,23 @@ Refer to [merge conflicts](/docs/collaborate/git/merge-conflicts) to learn how t

## The .gitignore file

dbt Cloud implements a global [`.gitignore`](https://github.com/dbt-labs/dbt-starter-project/blob/main/.gitignore) file that automatically excludes the following sub-folders from your git repository to ensure smooth operation:

```
dbt_packages/
logs/
target/
```

Each entry uses a trailing slash, so these lines in the `.gitignore` file act as 'folder wildcards' that prevent any files or folders within them from being tracked by git. You can also specify additional exclusions as needed for your project.

However, this global `.gitignore` _does not_ apply directly to dbt Core and dbt Cloud CLI users. If you're working with dbt Core or the dbt Cloud CLI, you need to manually add the three entries above to your project's `.gitignore` file.

While some git providers generate a basic `.gitignore` file when the repository is created, these files often lack the necessary exclusions for dbt Cloud, so make sure the three entries above are present in yours.
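For a dbt Core or dbt Cloud CLI project, the resulting `.gitignore` would contain at least the three required entries plus any project-specific exclusions; the extra entries below are illustrative:

```
dbt_packages/
logs/
target/

# illustrative project-specific exclusions
.env
.DS_Store
```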

:::note

- **dbt Cloud projects created after Dec 1, 2022** &mdash; If you use the **Initialize dbt Project** button in the dbt Cloud IDE to set up a new and empty dbt project, dbt Cloud will automatically add a `.gitignore` file with the required entries. If a `.gitignore` file already exists, the necessary folders will be appended to the existing file.

- **Migrating project from Core to dbt Cloud** &mdash; Make sure you check that the `.gitignore` file contains the necessary entries. dbt Core doesn't interact with git, so dbt Cloud doesn't automatically add or verify entries in the `.gitignore` file. Additionally, if the repository already contains dbt code and doesn't require initialization, dbt Cloud won't add any missing entries to the `.gitignore` file.
:::
14 changes: 11 additions & 3 deletions website/docs/docs/deploy/deploy-jobs.md
@@ -86,13 +86,21 @@ dbt Cloud uses [Coordinated Universal Time](https://en.wikipedia.org/wiki/Coordi

To fully customize the scheduling of your job, choose the **Cron schedule** option and use cron syntax. With this syntax, you can specify the minute, hour, day of the month, month, and day of the week, allowing you to set up complex schedules like running a job on the first Monday of each month.

**Cron frequency**

To enhance performance, job scheduling frequencies vary by dbt Cloud plan:

- Developer plans: dbt Cloud enforces a minimum interval of 10 minutes between scheduled job runs; scheduling jobs more frequently than every 10 minutes isn't supported.
- Team and Enterprise plans: No restrictions on job execution frequency.

**Examples**

Use tools such as [crontab.guru](https://crontab.guru/) to generate the correct cron syntax. This tool allows you to input cron snippets and returns their plain English translations. The dbt Cloud job scheduler supports using `L` to schedule jobs on the last day of the month.

Examples of cron job schedules:

- `0 * * * *`: Every hour, at minute 0.
- `*/5 * * * *`: Every 5 minutes. (Not available on Developer plans)
- `5 4 * * *`: At exactly 4:05 AM UTC.
- `30 */4 * * *`: At minute 30 past every 4th hour (such as 4:30 AM, 8:30 AM, 12:30 PM, and so on, all UTC).
- `0 0 */2 * *`: At 12:00 AM (midnight) UTC every other day.
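- `0 12 L * *`: At 12:00 PM (noon) UTC on the last day of the month (an example of the `L` syntax).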
4 changes: 2 additions & 2 deletions website/docs/docs/use-dbt-semantic-layer/exports.md
@@ -192,7 +192,7 @@ When you run a build job, any saved queries downstream of the dbt models in that

</VersionBlock>

2. After dbt finishes building the models, the MetricFlow Server processes the exports, compiles the necessary SQL, and executes this SQL against your data platform. It directly executes a "create table" statement, so the data stays within your data platform (see the sketch after this list).
3. Review the exports' execution details in the jobs logs and confirm the export was run successfully. This helps troubleshoot and to ensure accuracy. Since saved queries are integrated into the dbt DAG, all outputs related to exports are available in the job logs.
4. Your data is now available in the data platform for querying.
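As a rough sketch of step 2: the statement the MetricFlow Server issues is an ordinary create-table-as-select, where the target table comes from the export's configuration and the inner query is the SQL MetricFlow compiles for the saved query. All names below are hypothetical:

```sql
-- Hypothetical names; the inner query stands in for the compiled MetricFlow SQL.
create table analytics.export_order_metrics as
select
    metric_time,
    sum(order_total) as order_total
from analytics.fct_orders
group by metric_time
```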

@@ -240,7 +240,7 @@ You can use exports to create a custom integration with tools such as PowerBI, a

<detailsToggle alt_header="How can I select saved_queries by their resource type?">

To select `saved_queries` by resource type, run `dbt build --resource-type saved_query`.

</detailsToggle>

25 changes: 12 additions & 13 deletions website/docs/reference/dbt-jinja-functions/execute.md
@@ -17,17 +17,17 @@ Any Jinja that relies on a result being returned from the database will error du
<File name='models/order_payment_methods.sql'>

```sql
1 {% set payment_method_query %}
2 select distinct
3 payment_method
4 from {{ ref('raw_payments') }}
5 order by 1
6 {% endset %}
7
8 {% set results = run_query(payment_method_query) %}
9
10 {# Return the first column #}
11 {% set payment_methods = results.columns[0].values() %}

```

@@ -40,7 +40,7 @@

```
Compilation Error in model order_payment_methods (models/order_payment_methods.s
'None' has no attribute 'table'
```
This is because line #11 in the earlier code example (`{% set payment_methods = results.columns[0].values() %}`) assumes that a <Term id="table" /> has been returned, when, during the parse phase, this query hasn't been run.

To work around this, wrap any problematic Jinja in an `{% if execute %}` statement:

Expand All @@ -55,7 +55,6 @@ order by 1
{% endset %}

{% set results = run_query(payment_method_query) %}

{% if execute %}
{# Return the first column #}
{% set payment_methods = results.columns[0].values() %}
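Putting the pieces together, the complete guarded version reads roughly as follows (a reconstruction: the closing lines are truncated in this view, and the `{% else %}` fallback, which keeps `payment_methods` defined at parse time, is an assumption):

```sql
{% set payment_method_query %}
select distinct
payment_method
from {{ ref('raw_payments') }}
order by 1
{% endset %}

{% set results = run_query(payment_method_query) %}
{% if execute %}
{# Return the first column #}
{% set payment_methods = results.columns[0].values() %}
{% else %}
{% set payment_methods = [] %}
{% endif %}
```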
24 changes: 22 additions & 2 deletions website/docs/reference/dbt-jinja-functions/return.md
@@ -9,7 +9,12 @@ __Args__:

* `data`: The data to return to the caller

The `return` function can be used in macros to return data to the caller. The type of the data (dict, list, int, etc.) will be preserved through the `return` call. You can use the `return` function in your macros in two ways: as an expression or as a statement.

- Expression &mdash; Use an expression when the goal is to output a string from the macro.
- Statement with a `do` tag &mdash; Use a statement with a `do` tag to execute the `return` function without generating an output string. This is particularly useful when you want to perform actions without inserting their results directly into the template.

In the following example, `{{ return([1,2,3]) }}` acts as an _expression_ that directly outputs a string, making it suitable for inserting returned values into SQL code.

<File name='macros/get_data.sql'>

@@ -23,6 +28,21 @@ The `return` function can be used in macros to return data to the caller. The ty
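```sql
{# The example body is truncated in this view; reconstructed from the prose above. #}
{% macro get_data() %}

{{ return([1,2,3]) }}

{% endmacro %}
```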

</File>

Alternatively, you can use a statement with a [do](https://jinja.palletsprojects.com/en/3.0.x/extensions/#expression-statement) tag (also known as an expression-statement) to execute the `return` function without generating an output string.

In the following example, `{% do return([1,2,3]) %}` acts as a _statement_ that executes the return action but does not output a string:

<File name='macros/get_data.sql'>

```sql
{% macro get_data() %}

{% do return([1,2,3]) %}

{% endmacro %}
```

</File>


<File name='models/my_model.sql'>
@@ -33,7 +53,7 @@

```sql
select
{% for i in get_data() %}
{{ i }}
{%- if not loop.last %},{% endif -%}
{% endfor %}
```
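Since `get_data()` returns `[1,2,3]`, the model compiles to SQL along these lines (whitespace is approximate):

```sql
select
    -- get_data() returns a list!
    1,
    2,
    3
```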

