You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I've been exploring Pyrra over the last few days and had some trouble getting the UI to work for me, so I've also been looking at the generic rules generation. This works well for us because we're already using Grafana for all of our observability stuff today.
I've deployed the following SLO:
apiVersion: pyrra.dev/v1alpha1kind: ServiceLevelObjectivemetadata:
name: test-api-playground-latencynamespace: monitoringlabels:
prometheus: k8srole: alert-rulesspec:
target: '95'window: 4wdescription: Service responded to 95% of requests within half a secondindicator:
latency:
success:
metric: http_server_requests_seconds_bucket{application="test-api", namespace="playground", le="0.5"}total:
metric: http_server_requests_seconds_bucket{application="test-api", namespace="playground", le="+Inf"}
The metric I'm using for this SLO is a Spring Boot histogram for request times. The generated PrometheusRule looks correct to me and almost all of the recording rules record data as expected.
However, the pyrra_availability generic metric recording rule gets generated with an extra le="" label, despite there already being an le label in the original query. Here's the generated availability recording rule:
If I run the query in expression manually in Grafana I get no data because of that le="" in the denominator. If I remove that label and run the following query, it returns data fine.
sum(http_server_requests_seconds:increase4w{application="test-api",le="0.5",namespace="playground",slo="test-api-playground-latency"}
or vector(0)) / sum(http_server_requests_seconds:increase4w{application="test-api",le="+Inf",namespace="playground",slo="test-api-playground-latency"})
This bit of code looks like it's responsible for building up the pyrra_availability generic rule for a latency SLO like mine, and I see where it's adding the le="" matcher. Unfortunately I don't understand why it's expecting there to be an empty le label. In the above PR for native histograms, there's a comment in that code that isn't in this one:
// Add the matcher {le=""} to select the recording rule that summed up all requests
I'm not sure I'm following the code enough to understand why it's expecting the recording rule to have an empty le label. Here are the generated increase recording rules for my SLO:
I've been exploring Pyrra over the last few days and had some trouble getting the UI to work for me, so I've also been looking at the generic rules generation. This works well for us because we're already using Grafana for all of our observability stuff today.
I've deployed the following SLO:
The metric I'm using for this SLO is a Spring Boot histogram for request times. The generated PrometheusRule looks correct to me and almost all of the recording rules record data as expected.
However, the
pyrra_availability
generic metric recording rule gets generated with an extrale=""
label, despite there already being anle
label in the original query. Here's the generated availability recording rule:If I run the query in expression manually in Grafana I get no data because of that
le=""
in the denominator. If I remove that label and run the following query, it returns data fine.I see there was some work related to native histograms that seems to have introduced the empty le filter, so maybe related.
The text was updated successfully, but these errors were encountered: