This repository has been archived by the owner on May 17, 2024. It is now read-only.

Key compare fails between VARCHAR2(40) (Oracle) and STRING (BigQuery) datatypes #896

Closed
mhe2024 opened this issue May 15, 2024 · 1 comment
Labels
bug Something isn't working triage

Comments


mhe2024 commented May 15, 2024

I am trying to run a cross-database compare between Oracle and BigQuery.
The key column is of type VARCHAR2(40) in Oracle and STRING in BigQuery.
Most keys are UUIDs, but some test records use integers as keys.
The log shows the following entries:

DEFAULT 2024-05-15T11:39:57.627042Z 2024-05-15 11:39:57,628 - Got the following input request: {'a': {'system': 'data-platform', 'schema': 'access_internal_v3', 'table': 'dt_alg_diagnose_grp', 'pks': ['alg_diagnose_grp_id']}, 'b': {'system': 'dwh', 'schema': 'DM_MIG', 'table': 'DT_ALG_DIAGNOSE_GRP', 'pks': ['ALG_DIAGNOSE_GRP_ID']}}
DEFAULT 2024-05-15T11:39:58.847835Z 2024-05-15 11:39:58,849 - Mixed UUID/Non-UUID values detected in column access_internal_v3.dt_alg_diagnose_grp.alg_diagnose_grp_id, disabling UUID support.
DEFAULT 2024-05-15T11:39:58.848034Z 2024-05-15 11:39:58,849 - [BigQuery] Schema = {'alg_diagnose_grp_id': String_VaryingAlphanum(_notes=[], collation=None)}
DEFAULT 2024-05-15T11:39:58.938690Z 2024-05-15 11:39:58,940 - [Oracle] Schema = {'ALG_DIAGNOSE_GRP_ID': String_UUID(_notes=[], collation=None, lowercase=True, uppercase=False)}
DEFAULT 2024-05-15T11:39:58.940064Z 2024-05-15 11:39:58,940 - Exception on / [POST]
[...]
DEFAULT 2024-05-15T11:40:00.015535Z data_diff.errors.DataDiffMismatchingKeyTypesError: Key columns alg_diagnose_grp_id and ALG_DIAGNOSE_GRP_ID can't be compared due to different types.

Is it possible to override the schema that data-diff infers for the Oracle key column?
We are using the latest version, 0.11.1.
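
For readers hitting the same error: the log lines above suggest that each side's key column is classified from its own values. The BigQuery side saw both UUID and non-UUID keys and fell back to an alphanumeric type, while the Oracle side was still classified as UUID. Below is a minimal sketch of that kind of per-side classification. It is not data-diff's actual code; `is_uuid`, `classify_key_sample`, and the sample values are hypothetical and only illustrate how two samples of the same column can end up with different types.

```python
import uuid


def is_uuid(value: str) -> bool:
    """Return True if the string parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def classify_key_sample(sample: list[str]) -> str:
    """Hypothetical helper: label a sample of key values.

    A sample is labeled "uuid" only if every value parses as a UUID;
    any non-UUID value makes the whole sample "alphanumeric".
    """
    flags = [is_uuid(v) for v in sample]
    if flags and all(flags):
        return "uuid"
    return "alphanumeric"


# Invented illustrative values, not taken from the reported tables.
bigquery_sample = ["123e4567-e89b-12d3-a456-426614174000", "42"]
oracle_sample = [
    "123e4567-e89b-12d3-a456-426614174000",
    "9f1c2d3e-0000-4000-8000-000000000001",
]

bigquery_type = classify_key_sample(bigquery_sample)
oracle_type = classify_key_sample(oracle_sample)

# When the two labels differ, a key comparison that requires
# matching types on both sides cannot proceed.
types_match = bigquery_type == oracle_type
```

In this sketch the integer key appears only in the first sample, so the two sides get different labels, which mirrors the `String_VaryingAlphanum` versus `String_UUID` schemas in the log.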

mhe2024 added the bug label May 15, 2024

glebmezh (Contributor) commented

Hi @mhe2024,

Thank you for trying out data-diff and for taking the time to open this issue. We made a hard decision to sunset the data-diff package and won't provide further development or support. Diffing functionality will continue to be available in Datafold Cloud. We have completely rewritten the diffing engine in the cloud over the past few months and have solved the fundamental issues with the original algorithm used in the data-diff package. Feel free to take it for a trial or contact us at [email protected] if you have any questions.

-Gleb
