Support for Delta Live Tables #213

Open
gerson23 opened this issue Jul 27, 2023 · 1 comment
Labels
enhancement New feature or request help wanted Extra attention is needed openlineage Requires support from OpenLineage to address

Comments

@gerson23

Describe the feature
In our project, we are starting to use Delta Live Tables and we need to publish the lineage information to Purview. However, the solution accelerator doesn't seem to support DLT pipelines yet.

Detailed Example
Ideally, we would see the DLT pipeline lineage in Purview, connecting the assets DLT reads from through to the assets the pipeline creates as output.

flowchart LR;
  raw["raw asset"];
  bronze_table["bronze table"];
  silver_table["silver table"];
  gold_table["gold table"];
  asset["aggregated asset"];
  raw --> bronze_table;
  subgraph DLT Pipeline
  bronze_table --> silver_table;
  silver_table --> gold_table;
  end
  gold_table --> asset;

Another, simpler option would be to expose lineage the same way it is shown for notebooks, hiding the internals of the pipeline.

flowchart LR
  raw["raw asset"]
  dlt["DLT Pipeline"]
  asset["aggregated asset"]
  raw --> dlt
  dlt --> asset

Issues that this feature solves
N/A

Suggested Implementation
I'm not sure how to implement this. One possible starting point is the DLT event log, which already contains the pipeline's lineage information: https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/observability#--query-lineage-information-from-the-event-log
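
To make the event-log idea a bit more concrete, here is a rough Python sketch of how lineage edges might be derived from event log rows. It assumes, based on the linked documentation, that lineage lives in rows with `event_type = 'flow_definition'` and that `details` is JSON containing `flow_definition.output_dataset` and `flow_definition.input_datasets`. The helper names (`lineage_edges`, `pipeline_boundary`) and the sample rows are hypothetical, and nothing here touches Purview or the solution accelerator. The first function corresponds to the detailed diagram, the second to the simpler "pipeline as a black box" diagram.

```python
import json
from typing import Iterable


def lineage_edges(events: Iterable[dict]) -> list[tuple[str, str]]:
    """Return (input_dataset, output_dataset) pairs from flow_definition events.

    Assumes each event is a dict with an 'event_type' key and a 'details'
    value that is either a JSON string or an already-parsed dict.
    """
    edges = []
    for event in events:
        if event.get("event_type") != "flow_definition":
            continue
        details = event["details"]
        if isinstance(details, str):
            details = json.loads(details)
        flow = details["flow_definition"]
        output = flow["output_dataset"]
        # input_datasets is assumed to be a list of names, or null for sources.
        for source in flow.get("input_datasets") or []:
            edges.append((source, output))
    return edges


def pipeline_boundary(edges: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Collapse internal edges into (entry datasets, final outputs)."""
    produced = {dst for _, dst in edges}
    consumed = {src for src, _ in edges}
    # Entry datasets are consumed but never produced by another flow;
    # final outputs are produced but never consumed inside the pipeline.
    return sorted(consumed - produced), sorted(produced - consumed)


def _flow(output: str, inputs) -> dict:
    """Build a hypothetical event log row for the sample below."""
    details = {"flow_definition": {"output_dataset": output, "input_datasets": inputs}}
    return {"event_type": "flow_definition", "details": json.dumps(details)}


# Hypothetical rows mirroring the bronze -> silver -> gold example.
sample_events = [
    _flow("bronze_table", None),
    _flow("silver_table", ["bronze_table"]),
    _flow("gold_table", ["silver_table"]),
    {"event_type": "update_progress", "details": "{}"},  # ignored
]

if __name__ == "__main__":
    edges = lineage_edges(sample_events)
    print(edges)
    print(pipeline_boundary(edges))
```

In this sketch the bronze table has no recorded inputs, so the edge from the raw asset into the pipeline is not recovered; sources read from outside the pipeline would probably need to be resolved some other way.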

Additional context
N/A

@gerson23 gerson23 added the enhancement New feature or request label Jul 27, 2023
@wjohnson wjohnson added the help wanted Extra attention is needed label Dec 30, 2023
@wjohnson
Collaborator

@gerson23 thank you so much for this great suggestion! I would love to be able to cover Delta Live Tables, but we need OpenLineage to support it first. In OpenLineage/OpenLineage#372 there is a desire to cover Spark streaming jobs, but I know the OpenLineage community needs more support in this area.

@wjohnson wjohnson added the openlineage Requires support from OpenLineage to address label Dec 30, 2023