add support for delta lake table_changes table valued function as DBT source #12512
base: master
Codecov Report
❌ Your patch check has failed because the patch coverage (87.50%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.
... and 72 files with indirect coverage changes.
I'm not super familiar with table-valued functions in Delta Lake. Could you (1) link to the docs on them and (2) add some unit tests for this code?
Overall, modifying the from_sqlglot_table method feels pretty hacky, and intuitively it doesn't seem like the right place to make these changes.
Thanks for the review, @hsheth2. I thought about where to make this change for a while and did not find a good place. Our scenario: we store data in Delta Lake tables registered in the Hive metastore and run ETL in dbt SQL code. To support this use case, we can either change sqlglot so that a TVF-style table source is represented correctly in the node, or handle it when we read the table source from the sqlglot node, which is what I am doing now. I am quite new to the DataHub code and could definitely use some help; if you have any suggestions for a better way to support this use case, I am all ears. I'll add unit tests soon.
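To make the use case concrete, here is a minimal, standalone sketch of the idea: when a table source is written as a table_changes(...) call, recover the underlying table name from its first argument. It assumes the first argument is a quoted string naming the table, and it uses a plain regular expression rather than sqlglot. The helper name underlying_table is hypothetical and this is not the code in this PR.

```python
import re
from typing import Optional

# Hypothetical helper for illustration only; not DataHub's or sqlglot's API.
# Assumption: the first argument of table_changes(...) is a quoted table name.
_TABLE_CHANGES_RE = re.compile(
    r"\btable_changes\s*\(\s*(['\"])(?P<table>[^'\"]+)\1",
    re.IGNORECASE,
)


def underlying_table(source_sql: str) -> Optional[str]:
    """Return the table named in the first table_changes(...) call, or None."""
    match = _TABLE_CHANGES_RE.search(source_sql)
    return match.group("table") if match else None


if __name__ == "__main__":
    # A TVF-style source yields the table it reads from.
    print(underlying_table("SELECT * FROM table_changes('db.orders', 2)"))
    # A plain table source is not a table_changes call, so nothing is extracted.
    print(underlying_table("SELECT * FROM db.orders"))
```

Unit tests for the real change could follow the same shape: one case with a table_changes source, one with an ordinary table, asserting that lineage points at the underlying table in both.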
Checklist