-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: collector.arp.netlink is super slow to retrieve metrics #3075
Comments
you are not running in docker, it's the maybe it's dial cost too many time node_exporter/collector/arp_linux.go Lines 79 to 83 in b9d0932
|
The pprof SVG appears to be truncated, and only shows the collection taking 1.84s (which is admittedly still kinda slow). I can't think of a reason off the top of my head for the rtnetlink variant of the arp collector to be slow. If anything, it should be faster than parsing Can you please attach a complete pprof, so that it reveals where that ~1.8s is being consumed by the arpCollector? The gaps in your Grafana graphs also suggest that you're getting the occasional scrape timeout. Is anything unusual logged when that happens? |
Goign to close this due to inactivity. @maxime1907 feel free to re-open and attach the full pprof if this is still an issue |
Sorry for the late response i was busy these days and yes this is still an issue @chengjoey
@dswarbrick i think that the file being truncated is just github having some problems to render the whole svg but you can right click then save to see the full content |
@maxime1907 Strange, clicking on the SVG images still shows them truncated. I can save the first one locally and it appears complete when I open the locally saved copy. If I try to save the second one, I get a zero-byte file. GitHub seems to be really messing with them. The 1.98s consumed by Syscall6 is not solely caused by the ARP netlink collector, since all IO operations (including basic file IO) also ultimately go via a syscall to the kernel, and node_exporter does a lot of file IO when parsing the various procfs / sysfs entries. To really focus solely on the ARP collector and eliminate any other "noise" from the other collectors that ultimately culminate in a syscall, you would need to disable all other collectors and only enable the ARP collector - and then try switching between the netlink / procfs variants of the ARP collector. Everything beyond the syscall is kernel-space, so outside the visibility of pprof. Do you have a very large number of network interfaces on the system, or a very large number of ARP entries in the cache? How long does it take to list ARP entries via a command like Are you able to try running a local build from the current master branch? I also don't want to rule out the possibility of a compiler quirk. A syscall of this nature should be extremely fast. |
@maxime1907 Can you also try running node_exporter with
|
Please post the pprof raw data to pprof.me. It's much easier to examine than the images. |
Ok, on a small test system with half a dozen ARP entries, I see that the netlink method is measurably slower:
I don't have enough ARP entries to conclusively say whether the execution time scales proportionally to the number of ARP entries. However I doubt it, since the number of netlink operations to get the neighbors is constant. The only thing that is variable is the ifIndexEntries := make(map[uint32]uint32)
for _, n := range neighbors {
// Neighbors will also contain IPv6 neighbors, but since this is purely an ARP collector,
// restrict to AF_INET. Also skip entries which have state NUD_NOARP to conform to output
// of /proc/net/arp.
if n.Family == unix.AF_INET && n.State&unix.NUD_NOARP == 0 {
ifIndexEntries[n.Index]++
}
}
enumEntries := make(map[string]uint32)
// Convert interface indexes to names.
for ifIndex, entryCount := range ifIndexEntries {
iface, err := net.InterfaceByIndex(int(ifIndex))
if err != nil {
if errors.Unwrap(err).Error() == "no such network interface" {
continue
}
return nil, err
}
enumEntries[iface.Name] = entryCount
} Also tested with current master branch, and both Go 1.22.7 and 1.23.1. If I spam my ARP cache with (failed) entries, e.g., If I fudge 50K interfaces in the code to force a large number of for x := 1; x < 50000; x++ {
ifIndexEntries[uint32(x)] = 42
}
That's still a long way from ~1.98s however, and I strongly doubt that many systems will have that many interfaces. |
@dswarbrick $ time ip -4 neigh | wc -l
278
real 0m0.011s
user 0m0.001s
sys 0m0.012s About the number of interfaces its not that much $ ip -o link show | wc -l
182 From the grafana dashboards in the issue description we can see that it take less than 1 second when arp.netlink is disabled and enabled it takes more than 20 seconds I will try executing it with --debug tomorrow to see if it differs from our graphs @SuperQ |
@maxime1907 Something is really fishy if your scrapes are taking 20s with the ARP netlink collector enabled. Neither your number of ARP entries nor your number of interfaces would be sufficient to require more than a mere fraction of a second to collect via netlink. The timing of Since I'm starting to run out of ideas what could cause such slow scrapes, what else can you tell us about the container environment in which node_exporter is running? Is there an selinux policy which could be interfering with netlink sockets? Something possibly worth trying is an strace with just the ARP collector enable, filtering just the netlink-related stuff. For example:
The |
And just to play devil's advocate, it's not unheard of for drivers to cause long delays in netlink sockets: https://unix.stackexchange.com/questions/584401/what-could-cause-af-netlink-socket-transactions-to-take-many-seconds Since this issue is the first that I've heard of where the ARP netlink collector is misbehaving, I would be highly suspicious of the environment, rather than the code. |
@dswarbrick i tried only executing the arp collector and still the same thing here is the full strace output with the command you provided |
@maxime1907 Thanks for providing the strace output. That really clarified what is going on behind the scenes. I don't know if that output was just a single scrape, or whether you let strace run for a while, but I was a bit shocked to see roughly 18K syscalls in that output. It's no wonder that Go is spending so much time in Syscall6. After digging through some of Go's net package source, it's apparent why there is such a huge amount of rtnetlink traffic in your strace. The node_exporter arp collector calls net.InterfaceByIndex(index int) to resolve an interface index to an interface name, once for each interface index that it encounters in the neighbor list. Behind the scenes, Go is eventually calling this private function: // If the ifindex is zero, interfaceTable returns mappings of all
// network interfaces. Otherwise it returns a mapping of a specific
// interface.
func interfaceTable(ifindex int) ([]Interface, error) {
tab, err := syscall.NetlinkRIB(syscall.RTM_GETLINK, syscall.AF_UNSPEC)
...
if ifindex == 0 || ifindex == int(ifim.Index) {
// (add to []interface result) Despite the fact that this function takes an There is a similar function in the jsimonetti/rtnetlink package, https://pkg.go.dev/github.com/jsimonetti/rtnetlink#LinkService.Get, which does pass the interface index to the rtnetlink syscall, resulting in a much more concise reply. Here is a trace of the Go net.InterfaceByIndex() call:
and here is jsimonetti/rtnetlink's equivalent:
The jsimonetti/rtnetlink method is a larger rtnetlink request, because it specifies the interface index by which we want to filter (note the I count 183 It appears that the kernel is packing two interfaces into each
All that, multiplied by 183, works out to ~17K syscalls, more or less in line with the log that you supplied. The amount of rtnetlink response traffic will increase exponentially with the number of interfaces in the system. So, we have two options:
In terms of context switching penalty, the latter option would be preferable. |
Fixes: prometheus#3075 Signed-off-by: Daniel Swarbrick <[email protected]>
github.com/jsimonetti/rtnetlink provides a high level rtnl wrapper around the lower level rtnetlink functions, which essentially does all that we need. The rtnl.Conn.Neighbors uses an internal cache for resolving interface indexes to names, so it makes at most one rtnetlink call per interface to resolve the name. Using this high level wrapper hugely simplifies our code and makes it easier to understand and maintain. Fixes: prometheus#3075 Signed-off-by: Daniel Swarbrick <[email protected]>
github.com/jsimonetti/rtnetlink provides a high level rtnl wrapper around the lower level rtnetlink functions, which essentially does all that we need. The rtnl.Conn.Neighbors uses an internal cache for resolving interface indexes to names, so it makes at most one rtnetlink call per interface to resolve the name. Using this high level wrapper hugely simplifies our code and makes it easier to understand and maintain. Fixes: #3075 Signed-off-by: Daniel Swarbrick <[email protected]>
Host operating system: output of
uname -a
node_exporter version: output of
node_exporter --version
node_exporter command line flags
node_exporter log output
Are you running node_exporter in Docker?
No, we use containerd directly
What did you do that produced an error?
Run node exporter which uses by default the collector collector.arp.netlink
What did you expect to see?
Normal scrape times like 300ms
What did you see instead?
Scrape times of more than 20 seconds
Here is a graph of before and after the deactivation of the arp netlink collector with this flag:
Here is the output of pprof with:
The text was updated successfully, but these errors were encountered: