Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache JMX connection and MBeans names and attributes #1019

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

max-melentyev
Copy link

Hey, I saw significant improvement with jmx connection cached. Here is a chart for sample cassandra cluster without any traffic, orange line is for an instance with patched exporter:
image
Cassandra Clusters with traffic have GC time and throughput decreased, and we saw latencies improved up to 50%. But this depends on the traffic patters and vary from cluster to cluster.

Implementation details:

  • scraper is cached in collector's config instance, so it's replaced every time config is reloaded.
  • jmx connection is cached in scraper, and it uses Cleaner API to close it when scraper is discarded.
  • MBeans names and attributes are cached in scraper. JMX client subscribes to MBean register/unregister events to invalidate this cache.
  • JMX is reconnected when any exception occurs. I tested only in javaagent mode, so didn't have a chance to test reconnection.

@dhoard
Copy link
Collaborator

dhoard commented Nov 5, 2024

@max-melentyev Thanks for the PR! I'll need to review it.

@dhoard
Copy link
Collaborator

dhoard commented Dec 22, 2024

@max-melentyev This PR changes a lot and makes some assumptions that I don't feel are always valid.

An MBeanServerConnection can register for MBean registration/unregistration notification events, but the PR doesn't appear to handle the scenario where a DynamicMBean with dynamic attributes is in use.

Thoughts?

@max-melentyev
Copy link
Author

Hey, I'm not very familiar with java. I've tested this PR with cassandra and it produced the same result as a version from trunk. Cassandra adds some new metrics during bootstrap and when new tables created so without this event handler some metrics were missing. If there's another way MBean can start generating different set of metrics without giving a notification, then it's probably not handled by the code.

There are other issues with this code though:

  • Looks like local jmx connection uses singleton object, so when config is reloaded every handler is registered on the same object and they got stacked. So old handlers should be unregistered explicitly. And this most likely prevents scraper from being GC'ed.
  • Cleaner api doesn't work with java 8, agent fails to load.

I started working on a fix, but we decided to use https://github.com/k8ssandra/management-api-for-apache-cassandra which showed the same performance as jmx exporter with this patch. Please let me know if you want me to push my WIP fix, or feel free to close the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants