
[D1] - Extract and Utilize System Data #723

Open
ydaponte opened this issue Sep 13, 2024 · 1 comment

ydaponte commented Sep 13, 2024

As a Developer, I want to gather and utilize data from System Views in Microsoft Fabric to monitor system performance and query activity so that we can effectively score and tag stale content.

Resources

Queries:

Acceptance Criteria:

  • Implement data extraction from system views, DMVs, and Delta Lake logs (a minimal sketch follows this list).
  • Integrate this data into the consolidated denorm table.
  • Ensure that data from queries and insights is utilized for content scoring.
  • Potential tech stack: DMVs and Delta logs exist under the Warehouse in Fabric, and the Delta logs are stored in OneLake. For monitoring, Query Insights, Purview, and Log Analytics will be useful.
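
A minimal extraction sketch under the stack above, assuming the warehouse is reached through its SQL endpoint with pyodbc. The `queryinsights.exec_requests_history` view is the documented Query Insights history in Fabric, while the connection placeholders and the exact column list are assumptions to verify against your warehouse; Delta-log extraction from OneLake is not covered here.

```python
# Sketch: pull the last 30 days of statements from the warehouse's
# Query Insights views into a DataFrame that can feed the denorm table.
# SERVER and WAREHOUSE are hypothetical placeholders.
import pyodbc
import pandas as pd

SERVER = "<your-fabric-sql-endpoint>"   # hypothetical placeholder
WAREHOUSE = "<your-warehouse-name>"     # hypothetical placeholder

conn = pyodbc.connect(
    f"Driver={{ODBC Driver 18 for SQL Server}};"
    f"Server={SERVER};Database={WAREHOUSE};"
    "Authentication=ActiveDirectoryInteractive;"
)

# Completed requests from the retention window (30 days for Query Insights).
# Column names are assumptions to verify against the view's schema.
history = pd.read_sql(
    """
    SELECT distributed_statement_id, login_name, start_time, end_time,
           total_elapsed_time_ms, status, command
    FROM queryinsights.exec_requests_history
    WHERE start_time >= DATEADD(day, -30, SYSUTCDATETIME())
    """,
    conn,
)

# Loading `history` into the denorm table (and reading the Delta logs
# from OneLake) is out of scope for this sketch.
print(history.head())
```
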
@ydaponte ydaponte changed the title 1. Extract and Utilize System Data 1 - [US] - Extract and Utilize System Data Sep 13, 2024
@ydaponte ydaponte assigned ydaponte and andreiafig-msft and unassigned ydaponte Sep 13, 2024
@ydaponte ydaponte changed the title 1 - [US] - Extract and Utilize System Data [D1] - Extract and Utilize System Data Sep 13, 2024

andreiafig-msft commented Sep 16, 2024

STUDYING OUR DATA SOURCE - SYSTEM VIEWS IN THE DATA WAREHOUSE

What are we considering as stale content in terms of Data Warehouse queries and performance? The system views will provide information about the following (a probe sketch for these question groups follows the lists):

Query Performance Analysis

  • What is the historical performance of our queries?
  • Are there any long-running queries that need attention?
  • Can we identify the queries causing performance bottlenecks?

Query Optimization and Tuning

  • Which queries are frequently run, and can their performance be improved?
  • Can we identify queries that have failed or been canceled?
  • Can we track changes in query performance over time?
  • Are there any queries that consistently perform poorly?

User Activity Monitoring

  • Who submitted a particular query?
  • Who are the most active users or the users with the most long-running queries?
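
A hedged probe sketch for the three question groups above. The `queryinsights.long_running_queries` and `queryinsights.frequently_run_queries` views are part of Query Insights in Fabric, but the column names used below are assumptions to check against the documentation linked later in this comment.

```python
# One probe query per question group, run against the Query Insights views.
import pyodbc
import pandas as pd

CONNECTION_STRING = "<as in the earlier sketch>"  # hypothetical placeholder
conn = pyodbc.connect(CONNECTION_STRING)

probes = {
    # Query performance analysis: historically slowest statements.
    "long_running": """
        SELECT TOP 20 last_run_command, median_total_elapsed_time_ms, number_of_runs
        FROM queryinsights.long_running_queries
        ORDER BY median_total_elapsed_time_ms DESC
    """,
    # Query optimization and tuning: most frequently run statements and failures.
    "frequently_run": """
        SELECT TOP 20 last_run_command, number_of_runs, number_of_failed_runs
        FROM queryinsights.frequently_run_queries
        ORDER BY number_of_runs DESC
    """,
    # User activity monitoring: who submits the most statements.
    "by_user": """
        SELECT login_name, COUNT(*) AS statement_count
        FROM queryinsights.exec_requests_history
        GROUP BY login_name
        ORDER BY statement_count DESC
    """,
}

results = {name: pd.read_sql(sql, conn) for name, sql in probes.items()}
for name, df in results.items():
    print(name, df.head(), sep="\n")
```
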

With the above information, customers will benefit from creating reports for analysis and statistics, but not from archival.

In the Fabric Warehouse, we have Query Insights, which is created automatically. Query insights - Microsoft Fabric | Microsoft Learn

Comparing Query Insights with the system views, there are some similarities (or one is complementary to the other). From what I could understand, Query Insights uses the DMVs and returns more complete information. The same is true for Query Activity.

Query insights - Microsoft Fabric | Microsoft Learn
Monitor connections, sessions, and requests using DMVs - Microsoft Fabric | Microsoft Learn
Monitor your running and completed queries using Query activity - Microsoft Fabric | Microsoft Learn

The fact that you are able to find the same information in these different tools is highlighted in the Query Activity documentation.
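
For comparison, a small sketch of the DMV side. Unlike Query Insights, the three DMVs documented for Fabric Warehouse (sys.dm_exec_connections, sys.dm_exec_sessions, sys.dm_exec_requests) expose what is connected and running right now rather than a 30-day history; the connection string placeholder carries over from the earlier sketches.

```python
# Live monitoring via DMVs: current sessions and their in-flight requests.
import pyodbc
import pandas as pd

CONNECTION_STRING = "<as in the earlier sketches>"  # hypothetical placeholder
conn = pyodbc.connect(CONNECTION_STRING)

live = pd.read_sql(
    """
    SELECT s.session_id, s.login_name, r.status, r.command,
           r.start_time, r.total_elapsed_time
    FROM sys.dm_exec_sessions AS s
    JOIN sys.dm_exec_requests AS r
      ON s.session_id = r.session_id
    """,
    conn,
)
print(live)
```
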

In my experience, the customer will always want access to this data for monitoring and tracking purposes. There is no benefit in archiving this data, especially because the data is only kept for the past 30 days (this applies to Query Insights and Query Activity).

For the DMVs, I am not sure about the limits, but I would expect them to be the same as in Synapse. Please refer to the following documentation:
System views supported in Synapse SQL - Azure Synapse Analytics | Microsoft Learn
Capacity limits for dedicated SQL pool - Azure Synapse Analytics | Microsoft Learn

SUMMARY:
• We need to evaluate what we would consider stale content for the Data Warehouse system views and query execution.
• From my perspective, there is no need to identify any content here as stale or in need of archival, because this is handled automatically (the telemetry expires after 30 days and has a record limit, for example 10,000 records).

NEXT STEPS: Try to correlate the metadata information with the query text to verify whether a table has been used (for example, as sketched below). This will give us information about idle resources.
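
A rough sketch of that correlation, assuming naive substring matching between table names and the captured command text. That simplification misses views, renames, and dynamic SQL, but it is enough to surface idle-table candidates for review.

```python
# Flag tables whose names never appear in the captured query text.
import pyodbc
import pandas as pd

CONNECTION_STRING = "<as in the earlier sketches>"  # hypothetical placeholder
conn = pyodbc.connect(CONNECTION_STRING)

# All base tables in the warehouse, from the standard metadata views.
tables = pd.read_sql(
    "SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
    "WHERE TABLE_TYPE = 'BASE TABLE'",
    conn,
)

# Lower-cased command text from the Query Insights history (last 30 days).
commands = (
    pd.read_sql("SELECT command FROM queryinsights.exec_requests_history", conn)
    ["command"].dropna().str.lower().tolist()
)

# A table is an idle candidate if its name occurs in no captured statement.
idle_candidates = [
    f"{t.TABLE_SCHEMA}.{t.TABLE_NAME}"
    for t in tables.itertuples()
    if not any(t.TABLE_NAME.lower() in cmd for cmd in commands)
]
print("Possibly idle tables:", idle_candidates)
```
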
