Making Semantic Layer QS and getting started guides consistent (#3858)
## What are you changing in this pull request and why?

As requested by @Jstein77, making certain sections of guides consistent.

* added reusables for sections that are the same. 
* Copied content from the getting started guide to the reusables.
* Put reusables in 1.6 section only

cc @mirnawong1 

## Checklist
<!--
Uncomment if you're publishing docs for a prerelease version of dbt
(delete if not applicable):
- [ ] Add versioning components, as described in [Versioning
Docs](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-entire-pages)
- [ ] Add a note to the prerelease version [Migration
Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/guides/migration/versions)
-->
- [ ] Review the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
and [About
versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
so my content adheres to these guidelines.
- [ ] Add a checklist item for anything that needs to happen before this
PR is merged, such as "needs technical review" or "change base branch."

Adding new pages (delete if not applicable):
- [ ] Add page to `website/sidebars.js`
- [ ] Provide a unique filename for the new page

Removing or renaming existing pages (delete if not applicable):
- [ ] Remove page from `website/sidebars.js`
- [ ] Add an entry to `website/static/_redirects`
- [ ] [Ran link
testing](https://github.com/dbt-labs/docs.getdbt.com#running-the-cypress-tests-locally)
to update the links that point to the deleted page
runleonarun authored Aug 4, 2023
1 parent 8c38c34 commit 2ff387a
Showing 7 changed files with 219 additions and 280 deletions.
205 changes: 11 additions & 194 deletions website/docs/docs/build/sl-getting-started.md
@@ -8,6 +8,12 @@ meta:
api_name: dbt Semantic Layer API
---

import InstallMetricFlow from '/snippets/_sl-install-metricflow.md';
import CreateModel from '/snippets/_sl-create-semanticmodel.md';
import DefineMetrics from '/snippets/_sl-define-metrics.md';
import ConfigMetric from '/snippets/_sl-configure-metricflow.md';
import TestQuery from '/snippets/_sl-test-and-query-metrics.md';

This getting started page presents a sample workflow to help you create your first metrics. It uses the [Jaffle shop example project](https://github.com/dbt-labs/jaffle-sl-template) as the project data source and is available for you to use. If you prefer, you can create semantic models and metrics for your own dbt project.

To fully experience the power of a universal dbt Semantic Layer, take the following steps:
@@ -38,212 +44,23 @@ New to dbt or metrics? Try our [Jaffle shop example project](https://github.com/

## Install MetricFlow

Before you begin, install the [MetricFlow CLI](/docs/build/metricflow-cli) as an extension of a dbt adapter from PyPI. The MetricFlow CLI is compatible with Python versions 3.8, 3.9, 3.10, and 3.11.

Use `pip` to install MetricFlow along with your [dbt adapter](/docs/supported-data-platforms):
<InstallMetricFlow />

- Create or activate your virtual environment with `python -m venv venv` or `source your-venv/bin/activate`.
- Run `pip install "dbt-metricflow[your_adapter_name]"`
* You must specify `[your_adapter_name]`. For example, run `pip install "dbt-metricflow[snowflake]"` if you use a Snowflake adapter.

## Create a semantic model

The following steps will walk you through setting up semantic models in your dbt project, which consist of [entities](/docs/build/entities), [dimensions](/docs/build/dimensions), and [measures](/docs/build/measures).

We highly recommend you read the overview of what a [semantic model](https://docs.getdbt.com/docs/build/semantic-models) is before getting started. If you're working in the [Jaffle shop example](https://github.com/dbt-labs/jaffle-sl-template), delete the `orders.yml` config or remove its `.yml` extension so it's ignored during parsing. **We'll be rebuilding it step by step in this example.**

If you're following the guide in your own project, pick a model that you want to build a semantic manifest from and fill in the config values accordingly.

1. Create a new yml config file for the orders model, such as `orders.yml`.

It's best practice to create semantic models in the `/models/semantic_models` directory in your project. Semantic models are nested under the `semantic_models` key. First, fill in the name and appropriate metadata, map it to a model in your dbt project, and specify model defaults. For now, `default_agg_time_dimension` is the only supported default.

```yaml
semantic_models:
  # The name of the semantic model.
  - name: orders
    defaults:
      agg_time_dimension: ordered_at
    description: |
      Order fact table. This table is at the order grain with one row per order.
    # The name of the dbt model and schema
    model: ref('orders')
```
2. Define your entities. These are the keys in your table that MetricFlow will use to join other semantic models. These are usually columns like `customer_id`, `order_id`, and so on.

```yaml
    # Entities. These usually correspond to keys in the table.
    entities:
      - name: order_id
        type: primary
      - name: location
        type: foreign
        expr: location_id
      - name: customer
        type: foreign
        expr: customer_id
```
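Under the hood, these entity definitions tell MetricFlow which columns it can use to join semantic models together. Conceptually, the join it enables looks like the following illustrative SQL sketch (`customers` here is a hypothetical second semantic model, not part of the config above):

```sql
-- Illustrative only: the customer foreign entity on orders lines up with
-- the primary entity of a hypothetical customers semantic model.
select
  orders.order_id,
  customers.customer_name  -- a dimension that would live on the customers model
from orders
left join customers
  on orders.customer_id = customers.customer_id
```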

3. Define your dimensions and measures. Dimensions are properties of the records in your table that are non-aggregatable. They provide categorical or time-based context to enrich metrics. Measures are the building block for creating metrics. They are numerical columns that MetricFlow aggregates to create metrics.

```yaml
    # Measures. These are the aggregations on the columns in the table.
    measures:
      - name: order_total
        description: The total revenue for each order.
        agg: sum
      - name: order_count
        expr: 1
        agg: sum
      - name: tax_paid
        description: The total tax paid on each order.
        agg: sum
      - name: customers_with_orders
        description: Distinct count of customers placing orders.
        agg: count_distinct
        expr: customer_id
      - name: locations_with_orders
        description: Distinct count of locations with orders.
        expr: location_id
        agg: count_distinct
      - name: order_cost
        description: The cost for each order item. Cost is calculated as a sum of the supply cost for each order item.
        agg: sum
    # Dimensions. Either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
      - name: order_total_dim
        type: categorical
        expr: order_total
      - name: is_food_order
        type: categorical
      - name: is_drink_order
        type: categorical
```

Putting it all together, a complete semantic model configuration based on the orders model would look like the following example:

```yaml
semantic_models:
  # The name of the semantic model.
  - name: orders
    defaults:
      agg_time_dimension: ordered_at
    description: |
      Order fact table. This table is at the order grain with one row per order.
    # The name of the dbt model and schema
    model: ref('orders')
    # Entities. These usually correspond to keys in the table.
    entities:
      - name: order_id
        type: primary
      - name: location
        type: foreign
        expr: location_id
      - name: customer
        type: foreign
        expr: customer_id
    # Measures. These are the aggregations on the columns in the table.
    measures:
      - name: order_total
        description: The total revenue for each order.
        agg: sum
      - name: order_count
        expr: 1
        agg: sum
      - name: tax_paid
        description: The total tax paid on each order.
        agg: sum
      - name: customers_with_orders
        description: Distinct count of customers placing orders.
        agg: count_distinct
        expr: customer_id
      - name: locations_with_orders
        description: Distinct count of locations with orders.
        expr: location_id
        agg: count_distinct
      - name: order_cost
        description: The cost for each order item. Cost is calculated as a sum of the supply cost for each order item.
        agg: sum
    # Dimensions. Either categorical or time. These add additional context to metrics. The typical querying pattern is Metric by Dimension.
    dimensions:
      - name: ordered_at
        type: time
        type_params:
          time_granularity: day
      - name: order_total_dim
        type: categorical
        expr: order_total
      - name: is_food_order
        type: categorical
      - name: is_drink_order
        type: categorical
```

:::tip
If you're familiar with writing SQL, you can think of dimensions as the columns you would group by and measures as the columns you would aggregate.

```sql
select
  metric_time_day,  -- time
  country,          -- categorical dimension
  sum(revenue_usd)  -- measure
from
  snowflake.fact_transactions  -- sql table
group by metric_time_day, country  -- dimensions
```
:::
<CreateModel />

## Define metrics

Now that you've created your first semantic model, it's time to define your first metric! MetricFlow supports different metric types like [simple](/docs/build/simple), [ratio](/docs/build/ratio), [cumulative](/docs/build/cumulative), and [derived](/docs/build/derived). It's recommended that you read the [metrics overview docs](/docs/build/metrics-overview) before getting started.
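For instance, once you've defined the `order_total` and `order_count` measures, a ratio metric could combine them like this (an illustrative sketch — the `average_order_value` name and label are hypothetical, not part of the example project):

```yaml
metrics:
  # Hypothetical ratio metric: average revenue per order, built from
  # the order_total and order_count measures defined earlier.
  - name: average_order_value
    description: Average revenue per order.
    type: ratio
    label: Average Order Value
    type_params:
      numerator: order_total
      denominator: order_count
```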

1. You can define metrics in the same YAML files as your semantic models or create a new file. If you want to create your metrics in a new file, create another directory called `/models/metrics`. The file structure for metrics can become more complex from here if you need to further organize your metrics, for example, by data source or business line.

2. The example metric we'll create is a simple metric that refers directly to the `order_total` measure, which will be implemented as a `sum()` function in SQL. Again, if you're working in the Jaffle shop sandbox, we recommend deleting the original `orders.yml` file, or removing the .yml extension so it's ignored during parsing. We'll be rebuilding the `order_total` metric from scratch. If you're working in your own project, create a simple metric like the one below using one of the measures you created in the previous step.

```yaml
metrics:
  - name: order_total
    description: Sum of total order amount. Includes tax + revenue.
    type: simple
    label: Order Total
    type_params:
      measure: order_total
```

3. Save your code, and in the next section, you'll validate your configs before committing them to your repository.

To continue building out your metrics based on your organization's needs, refer to [Build your metrics](/docs/build/build-metrics-intro) for detailed info on how to define different metric types and semantic models.
<DefineMetrics />

## Configure the MetricFlow time spine model

MetricFlow requires a time spine for certain metric types and join resolution patterns, like cumulative metrics. You will have to create this model in your dbt project. [This article](/docs/build/metricflow-time-spine) explains how to add the `metricflow_time_spine` model to your project.
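As an illustrative sketch (see the linked article for the canonical implementation), a day-grain time spine model, conventionally saved as `models/metricflow_time_spine.sql`, can be generated with the `dbt_utils.date_spine` macro; the date range below is an arbitrary placeholder:

```sql
-- models/metricflow_time_spine.sql -- hypothetical sketch; adjust the dates to your data
with days as (
    {{ dbt_utils.date_spine(
        datepart="day",
        start_date="cast('2000-01-01' as date)",
        end_date="cast('2030-01-01' as date)"
    ) }}
)

-- MetricFlow expects a date_day column at day grain
select cast(date_day as date) as date_day
from days
```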
<ConfigMetric />

## Test and query metrics

This section will explain how you can test and query metrics locally. Before you begin, refer to [MetricFlow CLI](/docs/build/metricflow-cli) for instructions on installing it and a reference for the CLI commands.

:::tip
- dbt Cloud Team or Enterprise &mdash; For public beta, querying metrics in the dbt Cloud IDE isn't yet supported (Coming soon). You'll still be able to run semantic validation on your metrics in the IDE to ensure they are defined correctly. You can also use the MetricFlow CLI to test and query metrics locally. Alternatively, you can test using SQL client tools like DataGrip, DBeaver, or RazorSQL.

- dbt Core or Developer plan &mdash; Users can only test and query metrics manually using the CLI, but won't be able to use the dbt Semantic Layer to dynamically query metrics.
:::

**Query and commit your metrics using the CLI:**

MetricFlow needs a `semantic_manifest.json` in order to build a semantic graph. To generate a `semantic_manifest.json` artifact, run `dbt parse`. This will create the file in your `/target` directory. If you're working from the Jaffle shop example, run `dbt seed && dbt run` before proceeding to ensure the data exists in your warehouse.

1. Make sure you have the MetricFlow CLI installed and up to date.
2. Run `mf --help` to confirm you have MetricFlow installed and view the available commands.
3. Run `mf query --metrics <metric_name> --group-by <dimension_name>` to query the metrics and dimensions. For example, `mf query --metrics order_total --group-by metric_time`
4. Verify that the metric values are what you expect. To further understand how the metric is being generated, you can view the generated SQL by adding `--explain` to your query.
5. Run `mf validate-configs` to run validation on your semantic models and metrics.
6. Commit and merge the code changes that contain the metric definitions.
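For intuition, the SQL printed by a query like `mf query --metrics order_total --group-by metric_time --explain` would look roughly like the following — a simplified, hypothetical rendering; the actual output varies by warehouse and MetricFlow version:

```sql
-- Simplified sketch of the generated SQL for order_total by metric_time
select
  date_trunc('day', ordered_at) as metric_time__day,
  sum(order_total) as order_total
from orders  -- the dbt model the semantic model maps to
group by 1
order by 1
```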
<TestQuery />

## Run a production job

98 changes: 12 additions & 86 deletions website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md
@@ -11,6 +11,13 @@ meta:
<VersionBlock firstVersion="1.6">

import NewSLChanges from '/snippets/_new-sl-changes.md';
import InstallMetricFlow from '/snippets/_sl-install-metricflow.md';
import CreateModel from '/snippets/_sl-create-semanticmodel.md';
import DefineMetrics from '/snippets/_sl-define-metrics.md';
import ConfigMetric from '/snippets/_sl-configure-metricflow.md';
import TestQuery from '/snippets/_sl-test-and-query-metrics.md';



<NewSLChanges />

@@ -39,100 +46,19 @@ New to dbt or metrics? Try our [Jaffle shop example project](https://github.com/

## Install MetricFlow

Before you begin, install the [MetricFlow CLI](/docs/build/metricflow-cli) as an extension of a dbt adapter from PyPI. The MetricFlow CLI is compatible with Python versions 3.8, 3.9, 3.10, and 3.11.

Use `pip` to install MetricFlow along with your [dbt adapter](/docs/supported-data-platforms):

- Create or activate your virtual environment with `python -m venv venv`.
- Run `pip install "dbt-metricflow[your_adapter_name]"`
* You must specify `[your_adapter_name]`. For example, run `pip install "dbt-metricflow[snowflake]"` if you use a Snowflake adapter.

Currently, the supported adapters are Snowflake and Postgres (BigQuery, Databricks, and Redshift coming soon).
<InstallMetricFlow />

## Create a semantic model

This step will guide you through setting up your semantic models in your dbt project, which consist of [entities](/docs/build/entities), [dimensions](/docs/build/dimensions), and [measures](/docs/build/measures).

1. Name your semantic model, fill in appropriate metadata, and map it to a model in your dbt project.

```yaml
semantic_models:
  - name: transactions
    description: |
      This table captures every transaction starting July 02, 2014. Each row represents one transaction.
    model: ref('fact_transactions')
```
2. Define your entities. These are the keys in your table that MetricFlow will use to join other semantic models. These are usually columns like `customer_id`, `transaction_id`, and so on.

```yaml
    entities:
      - name: transaction
        type: primary
        expr: id_transaction
      - name: customer
        type: foreign
        expr: id_customer
```

3. Define your dimensions and measures. Dimensions are properties of the records in your table that are non-aggregatable. They provide categorical or time-based context to enrich metrics. Measures are the building block for creating metrics. They are numerical columns that MetricFlow aggregates to create metrics.

```yaml
    measures:
      - name: transaction_amount_usd
        description: The total USD value of the transaction.
        agg: sum
    dimensions:
      - name: is_large
        type: categorical
        expr: case when transaction_amount_usd >= 30 then true else false end
```

:::tip

If you're familiar with writing SQL, you can think of dimensions as the columns you would group by and measures as the columns you would aggregate.

```sql
select
  metric_time_day,  -- time
  country,          -- categorical dimension
  sum(revenue_usd)  -- measure
from
  snowflake.fact_transactions  -- sql table
group by metric_time_day, country  -- dimensions
```
:::
<CreateModel />

## Define metrics

Now that you've created your first semantic model, it's time to define your first metric. MetricFlow supports different metric types like [simple](/docs/build/simple), [ratio](/docs/build/ratio), [cumulative](/docs/build/cumulative), and [derived](/docs/build/derived).

1. You can define metrics in the same YAML files as your semantic models, or create a new file.

2. The example metric we'll create is a simple metric that refers directly to the `transaction_amount_usd` measure, which will be implemented as a `sum()` function in SQL.

```yaml
---
metrics:
  - name: transaction_amount_usd
    type: simple
    type_params:
      measure: transaction_amount_usd
```

3. Click **Save** and then **Preview** the code in the dbt Cloud IDE.

## Test metrics

The following steps explain how to test and manually query your metrics. Currently, you can only manually test your metrics using the CLI (dbt Cloud IDE support coming soon).
<DefineMetrics />

1. Make sure you have the [MetricFlow CLI](/docs/build/metricflow-cli) installed and up to date.
2. In the CLI, run `mf validate-configs` to validate the changes before committing them.
3. Run `mf query --metrics <metric_name> --group-by <dimension_name>` to manually query the metrics and dimensions.
4. Verify that the metric values are what you expect. You can view the generated SQL if you type `--explain` in the CLI.
5. Commit and merge the code changes that contain the metric definitions.
## Test and query metrics

To continue building out your metrics based on your organization's needs, refer to [Build your metrics](/docs/build/build-metrics-intro) for detailed info on how to define different metric types and semantic models.
<TestQuery />

## Run a production job

1 change: 1 addition & 0 deletions website/snippets/_sl-configure-metricflow.md
@@ -0,0 +1 @@
MetricFlow requires a time spine for certain metric types and join resolution patterns, like cumulative metrics. You will have to create this model in your dbt project. [This article](/docs/build/metricflow-time-spine) explains how to add the `metricflow_time_spine` model to your project.
