feat(ingest/powerbi): fix subTypes and add workspace_type_filter #11523

Merged
27 commits
e3398b0
WIP
sid-acryl Sep 30, 2024
a9cace0
WIP
sid-acryl Sep 30, 2024
63145aa
WIP
sid-acryl Sep 30, 2024
ced6e54
added app to dashboard/report
sid-acryl Sep 30, 2024
e8885fa
WIP
sid-acryl Oct 3, 2024
7be6d1a
Merge branch 'master' into ing-732-powerbi-connector-improvements
sid-acryl Oct 3, 2024
45f056c
app as container
sid-acryl Oct 3, 2024
811dd09
workspace type filter
sid-acryl Oct 3, 2024
d6bd5e5
workspace-type-filter
sid-acryl Oct 4, 2024
b83a1e0
remove app as conatiner
sid-acryl Oct 4, 2024
9dced8b
Merge branch 'master' into ing-732-powerbi-connector-improvements
sid-acryl Oct 4, 2024
08843b9
fix test-cases
sid-acryl Oct 4, 2024
c365fad
fix test-cases
sid-acryl Oct 4, 2024
19ca8be
test case for workspace_type_filter
sid-acryl Oct 4, 2024
8f49219
added paginated report test case
sid-acryl Oct 7, 2024
6bdd0fd
updated doc
sid-acryl Oct 7, 2024
eda214a
Merge branch 'master' into ing-732-powerbi-connector-improvements
sid-acryl Oct 7, 2024
347a0a7
updated doc
sid-acryl Oct 7, 2024
3fa0730
Merge branch 'ing-732-powerbi-connector-improvements' of https://gith…
sid-acryl Oct 7, 2024
658447d
Update metadata-ingestion/src/datahub/ingestion/source/powerbi/config.py
sid-acryl Oct 8, 2024
7b46a88
Merge branch 'master' into ing-732-powerbi-connector-improvements
sid-acryl Oct 8, 2024
f2c7d84
addressed review comments
sid-acryl Oct 8, 2024
6b196fa
Merge branch 'ing-732-powerbi-connector-improvements' of https://gith…
sid-acryl Oct 8, 2024
5b729ee
Merge branch 'master' into ing-732-powerbi-connector-improvements
sid-acryl Oct 9, 2024
5822166
address review comment
sid-acryl Oct 9, 2024
4b73ff6
Merge branch 'master' into ing-732-powerbi-connector-improvements
sid-acryl Oct 11, 2024
15ac4b1
update in golden files
sid-acryl Oct 11, 2024

4 changes: 3 additions & 1 deletion metadata-ingestion/docs/sources/powerbi/powerbi_pre.md
@@ -18,9 +18,11 @@
 | `Report.webUrl` | `Chart.externalUrl` |
 | `Workspace` | `Container` |
 | `Report` | `Dashboard` |
+| `PaginatedReport` | `Dashboard` |
 | `Page` | `Chart` |
 
-If Tile is created from report then Chart.externalUrl is set to Report.webUrl.
+- If `Tile` is created from report then `Chart.externalUrl` is set to Report.webUrl.
+- The `Page` is unavailable for PowerBI PaginatedReport.
Collaborator

this is still confusing - let's add PaginatedReport to the concept mapping table which will clean things up

also, how is Report.webUrl mapped to a chart if the report is mapped to a dashboard?

Contributor Author

The tile contains visualizations that can be created using a dataset, a text box, or by pinning a report. If a report is pinned within the tile, the externalUrl will point to that report.

Collaborator

I see
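
The rule discussed in this thread can be sketched in Python. This is an illustration only, not the connector's code: `CONCEPT_MAPPING`, `Tile`, and `chart_external_url` are hypothetical names, only the mapping rows visible in this diff are included, and returning `None` for a tile that was not pinned from a report is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# PowerBI concept -> DataHub concept, limited to the rows visible in this diff.
CONCEPT_MAPPING: Dict[str, str] = {
    "Workspace": "Container",
    "Report": "Dashboard",
    "PaginatedReport": "Dashboard",
    "Page": "Chart",
}


@dataclass
class Tile:
    """Hypothetical stand-in for a PowerBI tile (not the connector's class)."""

    title: str
    # Populated only when the tile was created by pinning a report.
    report_web_url: Optional[str] = None


def chart_external_url(tile: Tile) -> Optional[str]:
    """Return the Chart.externalUrl for a tile.

    A tile pinned from a report points at that report's webUrl.
    Assumption: any other tile has no external URL in this sketch.
    """
    return tile.report_web_url


pinned = Tile(title="Sales", report_web_url="https://example.invalid/report/1")
print(chart_external_url(pinned))
print(chart_external_url(Tile(title="Text box")))
```

A report still maps to a `Dashboard`; only the chart built from a pinned tile borrows the report's URL.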


## Lineage

@@ -14,7 +14,6 @@ class DatasetSubTypes(StrEnum):
     ELASTIC_DATASTREAM = "Datastream"
     SALESFORCE_CUSTOM_OBJECT = "Custom Object"
     SALESFORCE_STANDARD_OBJECT = "Object"
-    POWERBI_DATASET_TABLE = "PowerBI Dataset Table"
     QLIK_DATASET = "Qlik Dataset"
     BIGQUERY_TABLE_SNAPSHOT = "Bigquery Table Snapshot"
     SHARDED_TABLE = "Sharded Table"
@@ -48,8 +47,8 @@ class BIContainerSubTypes(StrEnum):
     LOOKML_PROJECT = "LookML Project"
     LOOKML_MODEL = "LookML Model"
     TABLEAU_WORKBOOK = "Workbook"
-    POWERBI_WORKSPACE = "Workspace"
-    POWERBI_DATASET = "PowerBI Dataset"
+    POWERBI_DATASET = "Semantic Model"
+    POWERBI_DATASET_TABLE = "Table"
     QLIK_SPACE = "Qlik Space"
     QLIK_APP = "Qlik App"
     SIGMA_WORKSPACE = "Sigma Workspace"
27 changes: 24 additions & 3 deletions metadata-ingestion/src/datahub/ingestion/source/powerbi/config.py
@@ -1,7 +1,7 @@
 import logging
 from dataclasses import dataclass, field as dataclass_field
 from enum import Enum
-from typing import Dict, List, Optional, Union
+from typing import Dict, List, Literal, Optional, Union
 
 import pydantic
 from pydantic import validator
@@ -47,6 +47,7 @@ class Constant:
     WORKSPACE_ID = "workspaceId"
     DASHBOARD_ID = "powerbi.linkedin.com/dashboards/{}"
     DATASET_EXECUTE_QUERIES = "DATASET_EXECUTE_QUERIES_POST"
+    GET_WORKSPACE_APP = "GET_WORKSPACE_APP"
     DATASET_ID = "datasetId"
     REPORT_ID = "reportId"
     SCAN_ID = "ScanId"
@@ -118,6 +119,15 @@ class Constant:
     CHART_COUNT = "chartCount"
     WORKSPACE_NAME = "workspaceName"
     DATASET_WEB_URL = "datasetWebUrl"
+    TYPE = "type"
+    REPORT_TYPE = "reportType"
+    LAST_UPDATE = "lastUpdate"
+    APP_ID = "appId"
+    REPORTS = "reports"
+    ORIGINAL_REPORT_OBJECT_ID = "originalReportObjectId"
+    APP_SUB_TYPE = "App"
+    STATE = "state"
+    ACTIVE = "Active"
 
 
 @dataclass
@@ -273,7 +283,8 @@ class PowerBiDashboardSourceConfig(
     # PowerBi workspace identifier
     workspace_id_pattern: AllowDenyPattern = pydantic.Field(
         default=AllowDenyPattern.allow_all(),
-        description="Regex patterns to filter PowerBI workspaces in ingestion",
+        description="Regex patterns to filter PowerBI workspaces in ingestion."
+        " Note: This field works in conjunction with 'workspace_type_filter' and both must be considered when filtering workspaces.",
     )
 
     # Dataset type mapping PowerBI support many type of data-sources. Here user need to define what type of PowerBI
@@ -340,7 +351,7 @@ class PowerBiDashboardSourceConfig(
     )
     modified_since: Optional[str] = pydantic.Field(
         default=None,
-        description="Get only recently modified workspaces based on modified_since datetime '2023-02-10T00:00:00.0000000Z', excludePersonalWorkspaces and excludeInActiveWorkspaces limit to last 30 days",
+        description="Get only recently modified workspaces based on modified_since datetime '2023-02-10T00:00:00.0000000Z', excludeInActiveWorkspaces limit to last 30 days",
     )
     extract_dashboards: bool = pydantic.Field(
         default=True,
@@ -445,6 +456,16 @@ class PowerBiDashboardSourceConfig(
         description="Patch dashboard metadata",
     )
 
+    workspace_type_filter: List[
+        Literal[
+            "Workspace", "PersonalGroup", "Personal", "AdminWorkspace", "AdminInsights"
+        ]
+    ] = pydantic.Field(
+        default=["Workspace"],
+        description="Ingest the metadata of the workspace where the workspace type corresponds to the specified workspace_type_filter."
+        " Note: This field works in conjunction with 'workspace_id_pattern'. Both must be matched for a workspace to be processed.",
+    )
+
     @root_validator(skip_on_failure=True)
     def validate_extract_column_level_lineage(cls, values: Dict) -> Dict:
         flags = [
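
The field descriptions above say that `workspace_type_filter` and `workspace_id_pattern` must both match for a workspace to be processed. A minimal standard-library sketch of that combined check follows. It is not the connector's implementation: `validate_type_filter` and `should_ingest` are hypothetical helpers, and the allow/deny regex handling is a simplified stand-in for DataHub's `AllowDenyPattern`.

```python
import re
from typing import List, Literal, Sequence, get_args

# The workspace types accepted by workspace_type_filter in this PR.
WorkspaceType = Literal[
    "Workspace", "PersonalGroup", "Personal", "AdminWorkspace", "AdminInsights"
]


def validate_type_filter(values: Sequence[str]) -> List[str]:
    """Reject any value outside the Literal (hypothetical helper)."""
    allowed = get_args(WorkspaceType)
    unknown = [v for v in values if v not in allowed]
    if unknown:
        raise ValueError(f"Unsupported workspace type(s): {unknown}")
    return list(values)


def should_ingest(
    workspace_id: str,
    workspace_type: str,
    type_filter: Sequence[str] = ("Workspace",),
    allow: Sequence[str] = (".*",),
    deny: Sequence[str] = (),
) -> bool:
    """Return True only when BOTH filters accept the workspace.

    Assumption: the ID check mimics an allow/deny regex filter, where the ID
    must match at least one allow pattern and no deny pattern.
    """
    id_allowed = any(re.match(p, workspace_id) for p in allow) and not any(
        re.match(p, workspace_id) for p in deny
    )
    type_allowed = workspace_type in type_filter
    return id_allowed and type_allowed


type_filter = validate_type_filter(["Workspace", "PersonalGroup"])
print(should_ingest("ws-001", "PersonalGroup", type_filter))
print(should_ingest("ws-001", "AdminInsights", type_filter))
print(should_ingest("ws-001", "Workspace", type_filter, deny=["^ws-0"]))
```

With the default of `["Workspace"]`, personal workspaces are skipped unless their type is listed explicitly, even when `workspace_id_pattern` allows their IDs.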