Skip to content
This repository was archived by the owner on May 17, 2024. It is now read-only.
This repository was archived by the owner on May 17, 2024. It is now read-only.

Add option to define level of precision for a numeric column #649

Closed
@daronjp

Description

@daronjp

When migrating a table or making a minor update to a dbt model, we like to confirm the the dev and prod tables are identical. Oftentimes, especially when a table contains floats, two numbers that appear to be exact matches to the eye are evaluated by the DB (Snowflake) as not matching. This can be frustrating, as it requires additional work to prove they do indeed match.

Describe the solution you'd like
I would like the ability to specify a level of precision for a given column. Using the example above, I'd like to specify that data-diff should consider two numbers that are within 0.001 (or any other value the user would like) of each other to be a match.

Describe alternatives you've considered
My current workaround outside of data-diff looks like this:

-- all records in table_a not in table_b
WITH diffs AS (
    SELECT
        pk, numeric_col
    FROM table_a

    EXCEPT

    SELECT
        pk, numeric_col
    FROM table_b
)

-- ideally returns 0 rows, showing that all records in table_a are in table_b
SELECT
    b.pk, b.numeric_col, d.numeric_col
FROM table_b AS b
JOIN diffs AS d ON d.pk = b.pk
WHERE
    ABS(b.numeric_col - d.numeric_col) > 0.001

Additional context
I would like to be able to specify this either in the dbt models.yml meta values, or when running a joindiff/hashdiff.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions