Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quotes around Identifiers/Timestamps in compiled SQL for Snowflake #15

Closed
shubh24 opened this issue Feb 25, 2022 · 2 comments
Closed

Quotes around Identifiers/Timestamps in compiled SQL for Snowflake #15

shubh24 opened this issue Feb 25, 2022 · 2 comments

Comments

@shubh24
Copy link

shubh24 commented Feb 25, 2022

We’ve started using metrics_store for experiments and are using Snowflake as the connected data warehouse.

Our metric definition looks like the following, here we are counting the number of events ingested from Segment or Amplitude per day/week/month.

metrics:
  - name: event_count
    label: Event Counts
    model: ref('activity_stream')
    description: "Count of events logged"

    type: count
    sql: activity

    timestamp: timestamp
    time_grains: [day, week, month]

    dimensions:
      - source

    filters:
      - field: timestamp
        operator: '>='
        value: '2021-01-01'

The model used for calculation of the metrics is as follows(borrowed from the example shown in the Readme)

{{ config(materialized = 'incremental') }}

select *
from {{ metrics.metric(
    metric_name='event_count',
    grain='week',
    dimensions=['source'],
    secondary_calculations=[
        metrics.period_over_period(comparison_strategy="ratio", interval=1, alias="pop_1wk")
    ]
) }}

However, the compiled SQL has the following issues:

  • There are a few column names that need to be wrapped in double quotes for Snowflake : source, activity and timestamp — if not, they show up as invalid identifiers with the following error : invalid identifier 'TIMESTAMP'
  • The timestamp filter mentioned in the metric is showing up as a filter where 1=1 and timestamp >= 2021-01-01 in the compiled SQL. This is missing quotes around timestamp and the date 2021-01-01

Once we copied over the SQL and updated the quotes around identifiers/dates, the metric creation completed successfully!

@shubh24 shubh24 changed the title Quotations around Identifiers/Timestamps in compiled SQL for Snowflake Quotes around Identifiers/Timestamps in compiled SQL for Snowflake Feb 25, 2022
@joellabes
Copy link
Contributor

@shubh24 at the moment, you'll need to double-quote dates. From the docs:

Note that value must be defined as a string in YAML, because it will be compiled into queries as part of a string. If your filter's value needs to be surrounded in quotes inside the query, use "'nested'" quotes

metrics:
  - name: event_count
    [...] 
    filters:
      - field: timestamp
        operator: '>='
        value: "'2021-01-01'"

will see you right!

I'm going to close this because it's able to be handled by an extra set of quotes, but wanted to let you know that we also have dbt-labs/dbt-core#4294 to track similar ergonomic issues.

@shubh24
Copy link
Author

shubh24 commented Mar 8, 2022

Thanks for the reply here @joellabes - doublequoting the datetime string definitely solves the issue here 👍

As for the issue we were facing around quotes around column identifiers, we resolved it by doing the following:

quoting:
  identifier: true

We removed quotes around identifiers referenced in our dbt project, and this allowed dbt to identify the columns correctly 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants