Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for deprecated data types text and ntext #46

Open
aaronsteers opened this issue Jan 29, 2021 · 0 comments
Open

Support for deprecated data types text and ntext #46

aaronsteers opened this issue Jan 29, 2021 · 0 comments

Comments

@aaronsteers
Copy link

aaronsteers commented Jan 29, 2021

Background:

The tap currently does not support the data types text and ntext. Although these are deprecated data types as noted here, there are still many environments with instances of these column types in the wild.

Our organization has some such columns (type of text) and the resulting generated catalog is omitting the column's type declaration. Since the downstream target (snowflake in our case) is expecting a data type, this causes a hard failure trying to parsing a dict with unknown key.

Proposal:

The proposed solution would handle text as varchar(max) and ntext as nvarchar(max).

Code location:

"varchar" {"type" ["string"]

Alternative considered:

An alternative or additional solution would be to fallback to string data types as an else clause in the column type mapping, which would ensure that downstream targets always receive a valid catalog file with some kind of type declaration for each column, rather than having an omitted type field. Since basically all types can be represented as strings in some form, this seems like a good fallback rather than generating a catalog with missing types.

A benefit to this else condition alternative is that there are sometimes other reasons why columns cannot declare their metadata. In rare scenarios we've seen MS SQL Server return a null/missing data type for views that can/t or don't properly inspect their own metdata, for instance in this hypothetical sample scenario:

select null as col_a from table_one
union all
select 123 as col_a from table_two
union all
select '234' as col_b from table_three

The first instance in the union normally declares type but it is null so it may default to int or unknown. The third instance in the union being a string will probably succeed at implicit conversion to integer type, but again, this might not be able to be resolved - or might be resolved inconsistently - at view creation time.

Related article regarding text and ntext data types:

https://docs.microsoft.com/en-us/sql/t-sql/data-types/ntext-text-and-image-transact-sql?view=sql-server-ver15

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant