Allow disabling of server label in all metrics #1

Open
keithf4 opened this issue Nov 22, 2019 · 4 comments

@keithf4

keithf4 commented Nov 22, 2019

Version 0.5.x of postgres_exporter added support for multi-server connections from a single exporter. This made it necessary to distinguish servers on a per-metric basis, so the server label was introduced. However, with our current implementation of pgmonitor, this breaks things in several ways.

Proposed fix is to add a --disable_server_label flag to the exporter startup to remove this new label.

We do not currently connect to multiple pg clusters from the same exporter, so not having the label doesn't break anything for us and fixes several issues we're encountering:

  1. The biggest issue is that the pg_up metric does not have this label applied to it. This breaks our Overview dashboard, which does some math with the replication status metric to determine whether a system is up and what its type is (Primary/Replica). Math can only be done between metrics with matching label sets. I've opened an issue on the postgres_exporter repo itself about this. Maybe we could see about fixing it ourselves as well in another PR?

  2. Upgrading postgres_exporter to 0.5.x or higher also changes the line colors on all graphs and puts duplicated entries into the legend. This is because Prometheus treats a metric with a different label set as a unique series. It also breaks our single-panel metric for pgbackrest, which hits a multi-series error because the same named metric is presented twice to a single-metric panel.

The second problem above is only really an issue on upgrades, but it means it would be an ongoing issue until the "old" metrics before the upgrade were phased out. The first issue is a problem even on brand new installs.
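The label-matching point in item 1 can be illustrated with a toy Python model. This is only a minimal sketch of one-to-one matching on identical label sets, not Prometheus code; the `replication_status` name and all label values below are hypothetical.

```python
# Toy model of one-to-one vector matching: two samples pair up only when
# their label sets (metric name excluded) are identical.

def match_one_to_one(left, right):
    """Return {label_set: (left_value, right_value)} for identical label sets."""
    right_by_labels = {frozenset(labels.items()): value for labels, value in right}
    pairs = {}
    for labels, value in left:
        key = frozenset(labels.items())
        if key in right_by_labels:
            pairs[key] = (value, right_by_labels[key])
    return pairs

# pg_up carries no server label; the replication metric (hypothetical) does.
pg_up = [({"instance": "db1:9187", "job": "pg"}, 1.0)]
replication_status = [
    ({"instance": "db1:9187", "job": "pg", "server": "db1:5432"}, 0.0)
]

# Label sets differ, so nothing pairs up and the dashboard math has no result.
mismatched = match_one_to_one(pg_up, replication_status)

# With the server label removed, the label sets line up again.
without_server = [
    ({k: v for k, v in labels.items() if k != "server"}, value)
    for labels, value in replication_status
]
matched = match_one_to_one(pg_up, without_server)

print(len(mismatched), len(matched))
```

In this model the first lookup pairs nothing and the second pairs one sample, which mirrors why the Overview math stops working once only some metrics carry the extra label.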

@abrightwell
Member

Assuming that Prometheus is the target monitoring system, could you not simply configure a scrape config to drop the label from metrics?

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config

For instance:

scrape_configs:
- job_name: foo
  
  metric_relabel_configs:
  - regex: 'server'
    action: labeldrop
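For readers unfamiliar with `labeldrop`, its effect can be sketched in a few lines of Python. This is a simplified model under the assumption that the regex is matched against the whole label name; the sample labels are illustrative values, not taken from a real scrape.

```python
import re

def labeldrop(labels, regex):
    """Remove every label whose name fully matches the regex."""
    return {
        name: value
        for name, value in labels.items()
        if not re.fullmatch(regex, name)
    }

# Illustrative sample; the label values are made up.
sample = {
    "__name__": "pg_stat_database_xact_commit",
    "instance": "db1:9187",
    "server": "db1:5432",
}
dropped = labeldrop(sample, "server")
print(dropped)
```

Note that the rule looks only at label names, so it applies to every metric scraped by that job regardless of which exporter produced it.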

@keithf4
Author

keithf4 commented Nov 22, 2019

You could, and we did that already. However, unless you do the upgrade in a very specific order (Prometheus config update first, then postgres_exporter), you end up with a period of about 5-10 minutes where Prometheus is grabbing metrics with the server label. Requiring that order is currently tricky on the HA side of things.

Also, the fact that the pg_up metric completely breaks our overview is a big problem as well. If that problem weren't there, we might just be able to say "as of this point forward, all metrics get this new label" and be done with it. Perhaps if they fix that in the future, we can have a release where we bring in support for the server label.

@keithf4
Author

keithf4 commented Nov 22, 2019

Actually, we had to do that slightly differently as well, so that the label is only dropped on our own postgres_exporter metrics and we don't interfere with other exporters people may be running.

https://github.com/keithf4/pgmonitor/blob/3.3rc1/prometheus/crunchy-prometheus.yml#L34

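    # Match only ccp_* metrics that carry a non-empty server label,
    # then set server to "" (Prometheus treats an empty label as absent).
    - source_labels: [__name__, server]
      regex: "ccp_.*;.+"
      action: replace
      target_label: server
      replacement: ""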
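To make the behavior of the rule above easier to follow, here is a rough Python model of the `replace` action. It is a sketch under stated assumptions rather than Prometheus code: source label values are joined with `;`, a missing label counts as an empty string, the regex must match the whole joined string, the replacement is treated as a literal (no capture-group expansion), and an empty result removes the label. The metric names are hypothetical.

```python
import re

def relabel_replace(labels, source_labels, regex, target_label, replacement,
                    separator=";"):
    """Simplified model of a relabel rule with action: replace."""
    # Missing source labels contribute an empty string.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    if re.fullmatch(regex, joined) is None:
        return dict(labels)  # no match: labels are left untouched
    result = dict(labels)
    if replacement == "":
        result.pop(target_label, None)  # empty value means the label is gone
    else:
        result[target_label] = replacement  # literal only, an assumption
    return result

rule = dict(source_labels=["__name__", "server"], regex="ccp_.*;.+",
            target_label="server", replacement="")

# Hypothetical metric names and label values.
ours = relabel_replace({"__name__": "ccp_example_metric", "server": "db1:5432"}, **rule)
other = relabel_replace({"__name__": "other_exporter_metric", "server": "x"}, **rule)
unlabeled = relabel_replace({"__name__": "ccp_example_metric"}, **rule)

print(ours, other, unlabeled)
```

Only the `ccp_`-prefixed metric with a non-empty `server` value is changed; metrics from other exporters keep their `server` label, which is the difference from a blanket `labeldrop`.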

@abrightwell
Member

Yeah, that makes sense for pg_up. I suppose you could possibly set up a separate scrape config for 'your' postgres_exporters, so that it's only concerned about your labels? Though, it seems like the <filename_pattern> on the file_sd_configs is pretty limited and might not be able to handle all your cases. 🤔 That's unfortunate.
