Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filter examples to metrics #915

Merged
merged 3 commits into from
Jan 14, 2022
Merged

Add filter examples to metrics #915

merged 3 commits into from
Jan 14, 2022

Conversation

joellabes
Copy link
Contributor

Description & motivation

This metrics documentation is largely a copy-paste from the parent GH spec, and said "see below" for filter examples but there was no below. Now there is.

@drewbanin a couple of questions where the implementation diverged from the spec:

  • Is operator always required, or is it assumed to be "=" if not specified?
  • for is_paying is true, is "is" required as an operator, and would it need to be " is "? (The latter in particular would be icky)
  • 1.0.0-rc1 rejected true without quotes because True is not of type 'string'. Likewise for ltv >= 100. Is that overly aggressive type expectations? (cc @jtcohen6 - I can open an issue on the core repo if so)

Pre-release docs

Is this change related to an unreleased version of dbt?

  • Yes: please
    • update the base branch to next
    • add Changelog components: <Changelog>[New/Changed] in v0.x.0</Changelog>
    • add links to the "New and changed documentation" section of the latest Migration Guide
  • No: please ensure the base branch is current
  • Unsure: we'll let you know!

@jtcohen6
Copy link
Collaborator

jtcohen6 commented Nov 17, 2021

@joellabes Thanks for playing around with metrics, and for the great feedback!

This metrics documentation is largely a copy-paste from the parent GH spec, and said "see below" for filter examples but there was no below.

You caught me :)

Quick answers from my perspective:

Is operator always required, or is it assumed to be "=" if not specified?

There's no default defined within dbt, and this is a required property, so it will raise an error if not provided. A downstream tool or macro which leverages this metric definition could assume a default operator when not provided, but I could also imagine a filter like this, given that (on many databases) where is_bool_col is valid shorthand for where is_bool_col is true:

      filters:
        - field: is_bool_col
          operator: ""
          value: ""

for is_paying is true, is "is" required as an operator, and would it need to be " is "? (The latter in particular would be icky)

I think the spacing would be the responsibility of the downstream tool / macro. Hopefully they'd be kind enough to template the query with lenient whitespace:

where
{% for filter in filters %}
    {{ filter.field }} {{ filter.operator }} {{ filter.value }}
    {{ "and" if not loop.last }}
{% endfor %}

1.0.0-rc1 rejected true without quotes because True is not of type 'string'. Likewise for ltv >= 100. Is that overly aggressive type expectations?

Yes, that's an open TODO right now. I think we encountered validation issues with making this accept absolutely Anything, but I think we could change it to accept any reasonable data type:

    value: Union[str, int, float, bool]  # others i'm not thinking of right now

It's definitely workable to just use strings ('true', '100') in the meantime, but I think this would be an uncontroversial improvement. Mind opening an issue for it? (Note to self: this would constitute an artifact schema change, and require us to update manifest v4 at schemas.getdbt.com.)

@joellabes
Copy link
Contributor Author

There's no default defined within dbt, and this is a required property, so it will raise an error if not provided

Lovely! I'll update these docs accordingly.

It's definitely workable to just use strings ('true', '100') in the meantime, but I think this would be an uncontroversial improvement

Yeah it'd be good to open it up ASAP, because otherwise we'll need to document it as "put everything in quotes, and your strings in "'double quotes'" so that one set of quotes makes it through to Jinja", right?

Mind opening an issue for it?

dbt-labs/dbt-core#4294

Copy link
Collaborator

@runleonarun runleonarun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few tiny comments/questions.

@@ -86,6 +86,19 @@ metrics:
| filters | A list of filters to apply before calculating the metric | See below | no |
| meta | Arbitrary key/value store | {team: Finance} | no |

### Filters
Filters should be defined as a list of dictionaries that define predicates for the metric. Filters are ANDed together. If more complex filtering is required, users can (and should) push that logic down into the underlying model.
Copy link
Collaborator

@runleonarun runleonarun Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @joellabes just some fly by questions here!

Filters are ANDed together

Does this mean that when you provide multiple filters they will use AND logic, meaning all criteria must be met? It might help newer users and users who aren't native English speakers to expand this sentence just a bit! (Although I do appreciate how efficient this sentence is. ❤️ )

...pushing that logic down into the underlying model.

I'm also curious about this phrase. Do we talk more anywhere on how to do this? Can we link people to this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that when you provide multiple filters they will use AND logic

Yep! happy to have it changed to anything else - none of this is my writing so I'm even less precious about it than normal 😉

I'm also curious about this phrase
Basically, instead of trying to implement more complex filters in the metric definition, we're expecting people to do their complex filtering inside of the model (in this example, ref('dim_customers')). I don't know if there's prior art we can link people out to :(

@netlify
Copy link

netlify bot commented Dec 4, 2021

✔️ Deploy Preview for docs-getdbt-com ready!

🔨 Explore the source changes: 5c5d3ce

🔍 Inspect the deploy log: https://app.netlify.com/sites/docs-getdbt-com/deploys/61e0c1e4db5ce1000745d640

😎 Browse the preview: https://deploy-preview-915--docs-getdbt-com.netlify.app

@joellabes
Copy link
Contributor Author

@runleonarun could you have a peek at this too please?

@joellabes joellabes changed the base branch from next to current January 14, 2022 00:15
@github-actions github-actions bot added the size: medium This change will take up to a week to address label Jan 14, 2022
Copy link
Collaborator

@runleonarun runleonarun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@joellabes joellabes merged commit 224c100 into current Jan 14, 2022
@joellabes joellabes deleted the joellabes-patch-1 branch January 14, 2022 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size: medium This change will take up to a week to address
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants