diff --git a/content/en/metrics/guide/custom_metrics_governance.md b/content/en/metrics/guide/custom_metrics_governance.md index e1b8bfeb049db..9aa285393868b 100644 --- a/content/en/metrics/guide/custom_metrics_governance.md +++ b/content/en/metrics/guide/custom_metrics_governance.md @@ -19,7 +19,7 @@ further_reading: Cloud-based applications can generate massive amounts of data and large observability costs, ultimately placing pressure on organizations to reduce this budget line item. To reduce observability costs, many teams resort to collecting fewer metrics; however, for centralized SRE and observability teams, effective custom metrics governance should increase monitoring efficiency rather than cut visibility entirely. -This guide provides best practices for managing your custom metrics volumes through the three key components of effective metrics governance: **Visibility and Attribution**, **Actionable Custom Metrics Governance**, and **Monitoring and Prevention**. Learn how to use available Datadog tools to maintain cost-effective observability for these key components. You'll learn how to: +This guide provides best practices for managing your custom metrics volumes through the three key components of effective metrics governance: **Visibility and Attribution**, **Actionable Custom Metrics Governance**, and **Monitoring and Prevention**. Learn how to use available Datadog tools to maintain cost-effective observability for these key components: - [Find and understand your metrics usage and costs](#visibility-and-attribution) - [Identify your largest cost drivers](#account-level-visibility) - [Attribute your largest cost drivers to the teams or services responsible for them](#team-level-visibility-and-attribution) @@ -83,7 +83,7 @@ To identify which team or service is responsible for your top custom metric name 1. From the [Plan & Usage page][2], scroll down to the *Usage Summary* section. 1. 
Click the **Custom Metrics** tab to view your organization's billable usage, usage trends, and top custom metrics. -1. Under the table for *Top Custom Metrics for *, click the icon to **See in Metrics Summary** for the top custom metric. This takes you to the *Metrics Summary* page with the opened metric details side panel. +1. Under the table for *Top Custom Metrics for *, click the icon to **See in Metrics Summary** for the top custom metric. This takes you to the *Metrics Summary* page with the opened metric details side panel. 1. In the side panel, scroll down to the *Tags* section to view associated tags such as teams and service. #### View your team's custom metrics @@ -98,7 +98,7 @@ All users in your organization can see OOTB realtime estimated custom metrics us With Metrics Volume Management, you can identify your organization's largest metrics as well as the metric names spiking in volume (likely culprits of any unexpected overage). -{{< img src="metrics/guide/volume_management_page.png" alt="Metrics Volume Management page" style="width:90%;" >}} +{{< img src="metrics/guide/custom_metrics_governance/volume_management_page_2025-01-27.png" alt="Metrics Volume Management page" style="width:90%;" >}} For more information, see the [Metrics Volume Management][8] documentation. @@ -106,7 +106,7 @@ For more information, see the [Metrics Volume Management][8] documentation. Effective custom metrics governance should increase monitoring efficiency. After you understand what your usage is and attribute usage to its source, take action to reduce your metrics. -In this section, you'll learn about the actions you can take to maximize the ROI and value you get from your observability spend without sacrificing the visibility your team actively relies on. +In this section, learn about the actions you can take to maximize the ROI and value you get from your observability spend without sacrificing the visibility your team actively relies on. 
### Metrics without Limits™ @@ -118,7 +118,7 @@ Reduce your indexed custom metrics volumes on any metric name by setting a tag c With Metrics without Limits™, Datadog automatically provides the following: - Up to date recommended tag configurations (based on our intelligent query insights) to help you maximize the ROI and value you get from your observability spend. -- Intelligent query insights that continuously compute and analyze all users' interactions (both in-app and through the API) on any metrics submitted to us so that your recommended tag configurations are always relevant. +- Intelligent query insights that continuously compute and analyze all users' interactions (both in-app and through the API) on any metrics submitted to Datadog so that your recommended tag configurations are always relevant. - Ability to roll back changes at any time to get full visibility into all your originally submitted data. As part of Datadog's metrics governance best practices, start by using Metrics without Limits on your [Top Custom Metrics](#identify-metrics-that-have-the-biggest-impact-on-monthly-bill). @@ -138,18 +138,17 @@ tags:audit "Queryable tag configuration" ### Reduce costs from unqueried metrics -To ensure you're not removing valuable visibility while reducing costs, you need to differentiate between the actively queried metrics that your team relies on from the metrics that aren't queried anywhere within the Datadog platform or through the API. Datadog's intelligent query insights continuously computes and analyzes all users' interactions (in-app or via API) on any metric to help identify less valuable, unused metrics. +To ensure you're not removing valuable visibility while reducing costs, you need to differentiate between the actively queried metrics that your team relies on from the metrics that aren't queried anywhere within the Datadog platform or through the API. 
Datadog's intelligent query insights continuously computes and analyzes all users' interactions (in Datadog or through the API) on any metric to help identify less valuable, unused metrics. -Identify your organization's entire list of unqueried metrics over the past 30 days: -1. On the [Metrics Summary page][6], find the **Query Activity (past 30 days)** facet on the left side. -2. Select **Not Actively Queried**. -3. Find the **Configuration** facet on the left side, and select **All Tags**. The combination of these two facets provides you a list of unqueried custom metrics that haven't yet been configured that you can receive immediate cost savings from. -4. Review the resulting table of metrics names. Are there any patterns or are they submitted from a specific service? Find tags associated with these unqueried metrics. -5. (Optional) To export this list, click **Export as CSV** above the metric table. +Identify your organization's entire list of unqueried metrics over the past 30, 60, or 90 days: +1. On the [Metrics Summary page][6], find the **Query Activity** facet on the left side. Select the time frame of interest (30, 60, or 90 days). +2. Find the **Configuration** facet on the left side, and select **All Tags**. The combination of these two facets provides you a list of unqueried custom metrics that haven't yet been configured that you can receive immediate cost savings from. +3. Review the resulting table of metrics names. Are there any patterns or are they submitted from a specific service? Find tags associated with these unqueried metrics. +4. (Optional) To export this list, click **Export as CSV** above the metric table. -After you identify the metrics that your developers don't need, you can safely reduce the custom metrics volumes and reduce the costs of these unused metrics with Metrics without Limits™. 
+ After you identify the metrics that your developers don't need, you can safely reduce the custom metrics volumes and reduce the costs of these unused metrics with Metrics without Limits™. -{{< img src="metrics/guide/custom_metrics_governance/manage_tags_fm_metrics_summary.png" alt="The Configure Metrics drop menu with the Manage tags selection highlighted" style="width:90%;" >}} +{{< img src="metrics/guide/custom_metrics_governance/manage_tags_fm_metrics_summary_2025-01-27.png" alt="The Configure Metrics drop menu with the Manage tags selection highlighted" style="width:90%;" >}} 5. At the top of the [Metrics Summary page][6], click the **Configure Metrics** dropdown menu. 6. Select **Manage tags** to open the [Metrics without Limits™ Tag configuration modal][13] to configure multiple metrics in bulk. @@ -164,7 +163,7 @@ Even though a metric is not queried for the past 30 days, your teams might still Datadog's Metrics without Limits™ is a suite of features that also provide you with OOTB insights to assess the value of your actively queried metrics with [Metrics Related Assets][15]. A metrics related asset refers to any Datadog asset, such as a dashboard, notebook, monitor, or SLO that queries a particular metric. Use related asset popularity and quantity to evaluate metric utility within your organization, enabling data-driven decisions. Gain a better understanding of how your team can use existing metrics to get more value from your observability spend. -{{< img src="metrics/volume/related_assets.png" alt="Metric detail side panel showing the Related Assets section. The example metric is applied to one dashboard" style="width:100%;" >}} +{{< img src="metrics/related_assets_2025-01-27.png" alt="Metric detail side panel showing the Related Assets section. The example metric is applied to three dashboards" style="width:100%;" >}} To view a metric's related assets: 1. Click on the metric name to open its details side panel. 
diff --git a/content/en/metrics/summary.md b/content/en/metrics/summary.md index 9f7b8a74bd4ae..23d1bce0dd7a4 100644 --- a/content/en/metrics/summary.md +++ b/content/en/metrics/summary.md @@ -19,13 +19,12 @@ The [Metrics Summary page][1] displays a list of your metrics reported to Datado Search your metrics by metric name or tag using the **Metric** or **Tag** search fields: -{{< img src="metrics/summary/tag_advancedfiltering3.mp4" alt="The metrics summary page with NOT team:* entered in the Tag search bar" video=true style="width:75%;">}} +{{< img src="metrics/summary/tag_advanced_filtering.png" alt="The metrics summary page with NOT team:* entered in the Tag search bar" style="width:75%;">}} -Tag filtering supports boolean and wildcard syntax so that you can quickly identify: +Tag filtering supports Boolean and wildcard syntax so that you can identify: * Metrics that are tagged with a particular tag key, for example, `team`: `team:*` * Metrics that are missing a particular tag key, for example, `team`: `NOT team:*` - ## Facet panel The search bars provide the most comprehensive set of actions to filter the list of metrics. But facets can also filter your metrics by: @@ -33,13 +32,13 @@ The search bars provide the most comprehensive set of actions to filter the list - **Configuration**: Metrics with tag configurations - **Percentiles**: Distribution metrics enabled by percentiles/advanced query capabilities - **Historical Metrics**: Metrics that have historical metrics ingestion enabled -- **Query Activity** (Beta): Metrics not queried in the app or by the API in the past 30 days +- **Query Activity**: Metrics not queried in Datadog or by the API in the past 30, 60, or 90 days - **Metric Type**: Differentiate between distribution and non-distribution metrics (counts, gauges, rates) - **Metric Origin**: The product from which the metric originated (for example, metrics generated from Logs or APM Spans). 
To learn more about the different metric origin types, see [Metric origin definitions][12]. -**Note**: A metric included on a Dashboard that has not been loaded by a user in the last 30 days would not be considered actively queried. +**Note**: A metric included on a Dashboard that has not been loaded by a user in the last 30 days is not considered actively queried. -{{< img src="metrics/summary/facets4.png" alt="Metrics Facet Panel" style="width:75%;">}} +{{< img src="metrics/summary/facet_panel.png" alt="Metrics Facet Panel" style="width:75%;">}} ## Configuration of multiple metrics @@ -53,7 +52,7 @@ Clicking on **Configure Metrics** gives you multiple options that you can use to * **Enable or disable percentiles**: Manage percentile aggregations across multiple distribution metrics. See the [Distributions page][31] for more information. -{{< img src="metrics/summary/percentile_aggregations_toggle.png" alt="Toggle to manage percentile aggregations" style="width:100%;">}} +{{< img src="metrics/summary/percentile_aggregations_toggle_2025-01-28.png" alt="Toggle to manage percentile aggregations" style="width:100%;">}} * **Enable or disable historical metrics ingestion**: Manage the ingestion of historical metric data. See the [Historical Metrics Ingestion page][30] for more information. @@ -139,15 +138,15 @@ For any particular tag key, you can: [Learn more about tagging][5]. -## Metrics Related Assets +### Metrics Related Assets {{< img src="metrics/summary/related_assets_dashboards.png" alt="Related Assets for a specified metrics name" style="width:80%;">}} To determine the value of any metric name to your organization, use Metrics Related Assets. Metrics related assets refers to any dashboard, notebook, monitor, or SLO that queries a particular metric. -1. Scroll to the bottom of the metric's details side panel to the "Related Assets" section. -2. 
Click the dropdown button to view the type of related asset you are interested in (dashboards, monitors, notebooks, SLOs). You can additionally leverage the search bar to validate specific assets. - +1. Scroll to the bottom of the metric's details side panel to the **Related Assets** section. +2. Click the dropdown button to view the type of related asset you are interested in (dashboards, monitors, notebooks, SLOs). You can additionally use the search bar to validate specific assets. + ## Custom Metrics Tags Cardinality Explorer {{< callout url="https://forms.gle/H3dG9tTdR6bqzHAX9" >}} @@ -168,7 +167,6 @@ You can configure tags using the bulk metric tag configuration button or the **M 1. Click on your custom distribution metric name in the **Metrics Summary** table to open the metrics details side panel. 2. Click the **Manage Tags** button to open the tag configuration modal. - 3. Select **Include tags...** or **Exclude tags...** to customize the tags you do or don't want to query for. For more information on tag configuration, see the [Metrics without Limits][10] documentation. 4. Preview the effects of your proposed tag configuration with the cardinality estimator before selecting **Save**. 
diff --git a/content/en/metrics/volume.md b/content/en/metrics/volume.md index 76f54b1a37abb..c87f710368aad 100644 --- a/content/en/metrics/volume.md +++ b/content/en/metrics/volume.md @@ -16,7 +16,7 @@ further_reading: ## Overview -{{< img src="metrics/volume/metrics_volume_overview.png" alt="Metrics Volume page set to a timeframe of the past hour (by default) showing the search, filter, facet and column sorting features available" style="width:100%;" >}} +{{< img src="metrics/volume/metrics_volume_overview_2025-01-27.png" alt="Metrics Volume page set to a timeframe of the past hour (by default) showing the search, filter, facet, and column sorting features available" style="width:100%;" >}} Cloud-based applications generate massive amounts of data, which can be overwhelming for your organization as it scales. Observability costs become a significant budget item but core observability teams lack visibility into what is truly valuable to each individual engineering team. Individual teams are less incentivized to be proactive in helping manage this growth because they have limited insights into the costs of the metrics and tags they're submitting. @@ -32,19 +32,18 @@ With the Metrics Volume Management page you can access the following in real-tim - Which metrics are actually valuable (or not) to my organization? ## Real-time visibility and monitoring on your organization's Custom Metrics usage -Datadog provides you real-time _estimated_ usage metrics OOTB so you can understand and alert on your usage in real-time. You can quickly see a breakdown of: +Datadog provides you real-time _estimated_ usage metrics OOTB so you can understand and alert on your usage in real-time. 
You can see a breakdown of: - Your account's indexed custom metrics volume in real-time (and how much of that indexed volume hasn't been optimized with [Metrics without Limits™][3] yet) - Your account's ingested custom metrics (emitted from metrics that have been configured with [Metrics without Limits™][3]) in real-time {{< img src="metrics/volume/volume_graph.png" alt="Estimated real-time indexed and ingested Custom Metrics volume. Upon clicking export, you can easily create a monitor or export the graph to a notebook to share." style="width:100%;" >}} - ## Search, filter, and sort Use the search, filter, and sort features to understand: - Which team owns what metric names? - Which metric names your team should focus on optimizing? -- Which metrics have the highest cardinality, and which metric names are spiking(aka have the highest increase in volume)? +- Which metrics have the highest cardinality, and which metric names are spiking (have the highest increase in volume)? The Metric and Tag search bars provide a set of actions to filter the list of metrics. Enter keywords to search metric names. Type in any tag key value pair in the *Filter by Tag Value* box to filter the list by a specific team, application, or service. 
@@ -52,7 +51,7 @@ Facets can also filter your metrics by: - **Configuration**: Metrics with tag configurations - **Percentiles**: Distribution metrics enabled by percentiles/advanced query capabilities - **Historical Metrics**: Metrics that have historical metrics ingestion enabled -- **Query Activity** (Beta): Metrics not actively queried in the app or by the API in the past 30 days +- **Query Activity**: Metrics not actively queried in Datadog or by the API in the past 30, 60, or 90 days - **Metric Type**: Differentiate between distribution and non-distribution metrics (counts, gauges, rates) - **Distribution Metric Origin**: The product from which the metric originated (for example, metrics generated from Logs or APM Spans) @@ -60,11 +59,11 @@ The Volume page displays a list of your metrics reported to Datadog sorted by es | Column | Description | |--------|-------------| |**Top 500 Metric Names by Estimated Real-time Cardinality** | Identify the top 500 metric names by cardinality (aka custom metrics volume).| -|**Top 500 Metric Names by Change in Volume** |Discover the top 500 metric names that have the greatest variance in their cardinality. These metrics may have anomalously (potentially unintentionally) spiked in the timeframe of your choosing. If you receive an alert on your account's estimated real-time custom metrics usage, you can use this view to investigate the metric spike. | +|**Top 500 Metric Names by Change in Volume** |Discover the top 500 metric names that have the greatest variance in their cardinality. These metrics may have anomalously (potentially unintentionally) spiked in the time frame of your choosing. If you receive an alert on your account's estimated real-time custom metrics usage, you can use this view to investigate the metric spike. 
| ## Compare a metric's cardinality (volume) over time -{{< img src="metrics/volume/compare_metric_cardinality.png" alt="Metrics Volume filtered down to metric names with “shopist”, sorted by estimated custom metrics. On hover over the change in volume, displays the cardinality graph of the metric over the past day" style="width:100%;" >}} +{{< img src="metrics/volume/compare_metric_cardinality_2025-01-27.png" alt="Metrics Volume filtered down to metric names with “shopist”, sorted by estimated custom metrics" style="width:100%;" >}} When identifying the top 500 metric names by change in volume, you can hover over the number to compare a metric name's number of indexed custom metrics (its cardinality) over time. As a reminder, a single metric name can emit multiple indexed custom metrics. To learn more, see [Custom Metrics Billing][6]. @@ -75,13 +74,13 @@ To compare your spiking metric's cardinality over time: ## Identify less valuable, unqueried metrics -{{< img src="metrics/volume/id_unqueried_metrics.png" alt="Facet fields for Query Activity with the 'Not actively queried' facet selected" style="width:100%;" >}} +{{< img src="metrics/volume/id_unqueried_metrics_2025-01-23.png" alt="Facet fields for Query Activity with the 'Not queried in 90 days' facet selected" style="width:100%;" >}} -To start reducing custom metrics costs, start with your largest metric names that aren't actively queried. Datadog's intelligent query insights analyze your queries and surfaces your unqueried metrics over the past 30 days. This analysis is constantly running in the background ensuring that your unqueried metrics are always up-to-date. +To start reducing custom metrics costs, start with your largest metric names that aren't actively queried. Datadog's intelligent query insights analyze your queries and surfaces your unqueried metrics over the past 30, 60, or 90 days. This analysis is constantly running in the background ensuring that your unqueried metrics are always up-to-date. 
-To find the metrics not actively queried in the past 30 days, click on **Not Actively Queried** in the *Query Activity Facet* box. Selecting **Not Actively Queried** generates a list of unused metric names across dashboards, notebooks, monitors, SLOs, Metrics Explorer, and the API. +To find the metrics not actively queried, click the time frame of interest in the *Query Activity Facet* box. The list is filtered to show only unused metric names across dashboards, notebooks, monitors, SLOs, Metrics Explorer, and the API. -## How to quickly reduce metric volume and cost +## How to reduce metric volume and cost After you identify unqueried metrics, you can eliminate the volume and cost of these metric names by using [Metrics without Limits™][3] without a single line of code. By using Metrics without Limits, you ensure that you pay only for the metrics that you use by eliminating timeseries that are never or rarely leveraged. Use Metrics without Limits™ on your unqueried metric names to reduce custom metrics volume. @@ -99,9 +98,9 @@ In this example, the tag configuration modal shows a metric with a current volum {{< img src="metrics/volume/reduce_metric_vol_cost_tags.png" alt="Tag configuration modal showing an example metric with a current volume of 13690031 index metrics and an estimated new volume of 1, with an empty allowlist of tags" style="width:80%;" >}} ## Analyze metrics' utility and relative value in Datadog -Metrics without Limits™ allows you to quickly find metrics that are underused in Datadog with the Metrics Related Assets feature. A metrics related asset refers to any dashboard, notebook, monitor, or SLO that queries a particular metric. Datadog's intelligent query insights surface the popularity and quantity of these related assets so you can evaluate metric utility within your organization. Use this information to make data-driven decisions. 
Identify how your team can use existing metrics to get more value from your observability spend and reduce metric volume and cost. +Metrics without Limits™ allows you to find metrics that are underused in Datadog with the Metrics Related Assets feature. A metrics related asset refers to any dashboard, notebook, monitor, or SLO that queries a particular metric. Datadog's intelligent query insights surface the popularity and quantity of these related assets so you can evaluate metric utility within your organization. Use this information to make data-driven decisions. Identify how your team can use existing metrics to get more value from your observability spend and reduce metric volume and cost. -{{< img src="metrics/volume/related_assets.png" alt="Metric detail side panel showing the Related Assets section. The example metric is applied to one dashboard" style="width:100%;" >}} +{{< img src="metrics/related_assets_2025-01-27.png" alt="Metric detail side panel showing the Related Assets section. The example metric is applied to three dashboards" style="width:100%;" >}} To view a metric's related assets: 1. Click on the metric name to open its details side panel. 
diff --git a/static/images/metrics/guide/custom_metrics_governance/manage_tags_fm_metrics_summary_2025-01-27.png b/static/images/metrics/guide/custom_metrics_governance/manage_tags_fm_metrics_summary_2025-01-27.png
new file mode 100644
index 0000000000000..1cafc96ce48cb
Binary files /dev/null and b/static/images/metrics/guide/custom_metrics_governance/manage_tags_fm_metrics_summary_2025-01-27.png differ
diff --git a/static/images/metrics/guide/custom_metrics_governance/volume_management_page_2025-01-27.png b/static/images/metrics/guide/custom_metrics_governance/volume_management_page_2025-01-27.png
new file mode 100644
index 0000000000000..60feac6af2a81
Binary files /dev/null and b/static/images/metrics/guide/custom_metrics_governance/volume_management_page_2025-01-27.png differ
diff --git a/static/images/metrics/related_assets_2025-01-27.png b/static/images/metrics/related_assets_2025-01-27.png
new file mode 100644
index 0000000000000..bf8a01fe7c0b3
Binary files /dev/null and b/static/images/metrics/related_assets_2025-01-27.png differ
diff --git a/static/images/metrics/summary/facet_panel.png b/static/images/metrics/summary/facet_panel.png
new file mode 100644
index 0000000000000..911cd22d55977
Binary files /dev/null and b/static/images/metrics/summary/facet_panel.png differ
diff --git a/static/images/metrics/summary/percentile_aggregations_toggle_2025-01-28.png b/static/images/metrics/summary/percentile_aggregations_toggle_2025-01-28.png
new file mode 100644
index 0000000000000..27bafbbb7e2c6
Binary files /dev/null and b/static/images/metrics/summary/percentile_aggregations_toggle_2025-01-28.png differ
diff --git a/static/images/metrics/summary/tag_advanced_filtering.png b/static/images/metrics/summary/tag_advanced_filtering.png
new file mode 100644
index 0000000000000..7d75c1d19df04
Binary files /dev/null and b/static/images/metrics/summary/tag_advanced_filtering.png differ
diff --git a/static/images/metrics/volume/compare_metric_cardinality_2025-01-27.png b/static/images/metrics/volume/compare_metric_cardinality_2025-01-27.png
new file mode 100644
index 0000000000000..33df66f839a59
Binary files /dev/null and b/static/images/metrics/volume/compare_metric_cardinality_2025-01-27.png differ
diff --git a/static/images/metrics/volume/id_unqueried_metrics_2025-01-23.png b/static/images/metrics/volume/id_unqueried_metrics_2025-01-23.png
new file mode 100644
index 0000000000000..febb51b347afd
Binary files /dev/null and b/static/images/metrics/volume/id_unqueried_metrics_2025-01-23.png differ
diff --git a/static/images/metrics/volume/metrics_volume_overview_2025-01-27.png b/static/images/metrics/volume/metrics_volume_overview_2025-01-27.png
new file mode 100644
index 0000000000000..daae50322580c
Binary files /dev/null and b/static/images/metrics/volume/metrics_volume_overview_2025-01-27.png differ