Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rogue label le="" showing up in one generic rule #1398

Open
mtthwcmpbll opened this issue Jan 29, 2025 · 1 comment
Open

Rogue label le="" showing up in one generic rule #1398

mtthwcmpbll opened this issue Jan 29, 2025 · 1 comment

Comments

@mtthwcmpbll
Copy link

I've been exploring Pyrra over the last few days and had some trouble getting the UI to work for me, so I've also been looking at the generic rules generation. This works well for us because we're already using Grafana for all of our observability stuff today.

I've deployed the following SLO:

apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  name: test-api-playground-latency
  namespace: monitoring
  labels:
    prometheus: k8s
    role: alert-rules
spec:
  target: '95'
  window: 4w
  description: Service responded to 95% of requests within half a second
  indicator:
    latency:
      success:
        metric: http_server_requests_seconds_bucket{application="test-api", namespace="playground", le="0.5"}
      total:
        metric: http_server_requests_seconds_bucket{application="test-api", namespace="playground", le="+Inf"}

The metric I'm using for this SLO is a Spring Boot histogram for request times. The generated PrometheusRule looks correct to me and almost all of the recording rules record data as expected.

However, the pyrra_availability generic metric recording rule gets generated with an extra le="" label, despite there already being an le label in the original query. Here's the generated availability recording rule:

- expr: sum(http_server_requests_seconds:increase4w{application="test-api",le="0.5",namespace="playground",slo="test-api-playground-latency"}
      or vector(0)) / sum(http_server_requests_seconds:increase4w{application="test-api",le="",le="+Inf",namespace="playground",slo="test-api-playground-latency"})
  labels:
    slo: test-api-playground-latency
  record: pyrra_availability

If I run the query in expression manually in Grafana I get no data because of that le="" in the denominator. If I remove that label and run the following query, it returns data fine.

sum(http_server_requests_seconds:increase4w{application="test-api",le="0.5",namespace="playground",slo="test-api-playground-latency"}
      or vector(0)) / sum(http_server_requests_seconds:increase4w{application="test-api",le="+Inf",namespace="playground",slo="test-api-playground-latency"})

I see there was some work related to native histograms that seems to have introduced the empty le filter, so maybe related.

@mtthwcmpbll
Copy link
Author

mtthwcmpbll commented Jan 29, 2025

This bit of code looks like it's responsible for building up the pyrra_availability generic rule for a latency SLO like mine, and I see where it's adding the le="" matcher. Unfortunately I don't understand why it's expecting there to be an empty le label. In the above PR for native histograms, there's a comment in that code that isn't in this one:

// Add the matcher {le=""} to select the recording rule that summed up all requests

I'm not sure I'm following the code enough to understand why it's expecting the recording rule to have an empty le label. Here are the generated increase recording rules for my SLO:

- expr: sum(increase(http_server_requests_seconds_bucket{application="test-api",le="+Inf",namespace="playground"}[4w]))
  labels:
    application: test-api
    le: +Inf
    namespace: playground
    slo: test-api-playground-latency
  record: http_server_requests_seconds:increase4w
- expr: sum(increase(http_server_requests_seconds_bucket{application="test-api",le="0.5",namespace="playground"}[4w]))
  labels:
    application: test-api
    le: +Inf
    namespace: playground
    slo: test-api-playground-latency
  record: http_server_requests_seconds:increase4w

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant