toplev kernel support

Andi Kleen edited this page Aug 22, 2023 · 22 revisions

Kernel patches/workarounds needed for toplev

toplev uses many events and is quite demanding on the kernel PMU code, so it may need kernel updates or bug workarounds to fix issues in the perf driver. The only way to update the perf driver is to update the kernel.

toplev automatically disables events that do not appear to be properly supported by the current kernel. This can be overridden with the --force-events option, but that may lead to wrong results. The disabling heuristic looks only at the kernel version and does not detect kernel patch backports.

Known issues are documented here.

Perf tool broken with 6.3 and 6.4

The perf tool in versions 6.3 and 6.4 reorders events, which leads to "WARNING: events were regrouped" and "input corruption" errors. In some cases toplev can be made to work with --tune "KEEP_UNREF = False", but not in all. Avoiding the problem currently requires an older (6.2 or older) or a newer (6.5+) perf tool. Perf tools can be downloaded from https://mirrors.edge.kernel.org/pub/linux/kernel/tools/perf/
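A quick way to see whether the installed perf tool falls in the affected range is to look at its version. The following is a minimal sketch, not part of toplev: the function name `perf_reorders_events` is hypothetical, and the script assumes that `perf --version` prints a line of the form "perf version X.Y...". It honors the PERF environment variable if set.

```shell
#!/bin/sh
# Hypothetical helper, not part of toplev: succeeds if the given perf
# version string is a 6.3 or 6.4 release.
perf_reorders_events() {
    ver=$1
    major=${ver%%.*}          # "6.3.2" -> "6"
    rest=${ver#*.}            # "6.3.2" -> "3.2"
    minor=${rest%%.*}         # "3.2"   -> "3"
    [ "$major" = 6 ] && { [ "$minor" = 3 ] || [ "$minor" = 4 ]; }
}

# Assumption: `perf --version` prints "perf version X.Y..."; adjust if not.
if command -v "${PERF:-perf}" >/dev/null 2>&1; then
    installed=$("${PERF:-perf}" --version | awk '{print $3}')
    if perf_reorders_events "$installed"; then
        echo "perf $installed may regroup events; use 6.2 or older, or 6.5+"
    fi
fi
```

If the check fires, pointing PERF at a different perf binary avoids changing the system-wide installation.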

perf stat record

Using --script-record with toplev requires perf 5.11+. No kernel support needed.

Ice Lake support

Ice Lake support needs at least v5.2. On older kernels mismeasurements are likely. Fixed metrics support is expected in v5.10; toplev works without it, at the cost of more multiplexing.
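The two thresholds above can be checked against the running kernel with a few lines of shell. This is only a sketch under stated assumptions: `kernel_at_least` and `icelake_support` are hypothetical helper names, the check compares just the major and minor numbers of `uname -r`, and like any version-based check it cannot see backported patches.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if kernel release $1 is at least $2.$3.
kernel_at_least() {
    rel=$1
    major=${rel%%.*}           # "5.4.0-91-generic" -> "5"
    rest=${rel#*.}             # "5.4.0-91-generic" -> "4.0-91-generic"
    minor=${rest%%[!0-9]*}     # "4.0-91-generic"   -> "4"
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Map a kernel release to the Ice Lake situation described above.
icelake_support() {
    if ! kernel_at_least "$1" 5 2; then
        echo "too old: mismeasurements likely"
    elif ! kernel_at_least "$1" 5 10; then
        echo "works, with more multiplexing"
    else
        echo "fixed metrics expected"
    fi
}

icelake_support "$(uname -r)"
```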

Up to 4.6

  • 4.6 adds Skylake server support. This is only needed for sampling.

Up to 4.3

  • 4.3 adds Skylake support. However, apart from PEBS sampling, Skylake counting should work with older kernels.
  • On Broadwell, the CYCLE_ACTIVITY.* events have an overly broad constraint, which breaks measurements of multiple CYCLE_ACTIVITY.* events in the same group. This currently happens with -l2. The workaround is to use --no-group or apply this patch. A workaround in toplev is under investigation.
  • The 4.3 Skylake code is missing support for the new FRONTEND MSR, so some sampling events for Frontend Issues will not work. Patches available here and here.

Up to Linux 4.1

  • All previous issues have been fixed
  • SMT support on IvyBridge/Haswell uses exclusion between threads for some events, which may affect the measurement accuracy.

Issues affecting up to Linux 3.19:

  • On IvyBridge, some level 3 memory nodes are not supported because their events have been disabled in the kernel since 3.9. The current workaround is to revert the guilty commit or apply this simple patch. Eventually this should be fixed with this complicated patch kit. The patch kit is currently queued for 4.1.

  • On Haswell the level 3 memory nodes will not schedule correctly and may be reported as "not counted". A patch is currently pending for 4.1.

Issues independent of the Linux kernel:

  • toplev requires a reasonably recent perf tool. The perf tool is independent of the kernel and can be updated separately. You can use the PERF=... environment variable to point to the tool to use. For example, the one in the original 2.6.32-based RHEL6 is too old. If that is a problem, get a recent kernel source, build the perf tool in tools/perf, and use that binary:

      # set up proxy with export https_proxy=... as needed
    
      git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
      cd linux/tools/perf
      make
      cp perf /some/where/else
    

When building, perf may tell you that it disabled some dependencies. Basic counting with toplev does not need any of them, but they can be useful for sampling. In that case you can install the packages (yum/zypper/apt-get/... install name).

  • Debian and Ubuntu systems have a very broken perf packaging setup in which the perf binary is incorrectly tied to the kernel version. /usr/bin/perf is a wrapper that calls a version-specific perf binary, so perf complains after a kernel update. This is completely unnecessary. Either use the procedure above and cp that perf over the wrapper, or bypass the wrapper by copying the installed perf.

  • The stddev tracking feature (-rXXX) currently requires a tip tree perf.

      git clone https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
      cd tip
      git checkout perf/core
      cd tools/perf
      make
    

This should be fixed in Linux 4.1.

Issues affecting older systems (before Linux 3.17)

  • On Haswell the False_Sharing and Contested accesses nodes may not be supported. In this case you need this bug fix or 3.17+

  • Haswell needs at least kernel 3.11 for Level 1. Technically most Haswells are fine with 3.10, but Haswell Y needs 3.11. toplev currently does not distinguish this case. Some additional fixes were added in later versions, see above.

  • On IvyBridge Level 2 may not measure memory bound correctly without this commit to fix the counter constraints. Available in 3.12+

  • IvyBridge basic support needs 3.12.

  • On Sandy Bridge, Level 3 memory events did not schedule correctly before 3.9. This was fixed with this commit (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/Documentation?id=f8378f5259647710f0b4ecb814b0a1b0d9040de0).

  • On really old kernels (2.6.32) various groups generated by toplev did not schedule correctly and were not counted. The workaround is to use --no-group. In this case it is recommended to avoid any PMU sharing, and some values may be mis-measured. These old kernels do not schedule all counters correctly, so some nodes (including Level 1) will give incorrect results.

Distribution kernels

Some long term maintained distributions backport perf support for newer CPUs to older kernel versions. Since toplev checks the kernel version, it cannot detect this case; use --force-events to override the check. The backport may be incomplete and some later bug fixes may be missing.
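To illustrate why a version-based check cannot see backports, the sketch below reduces a kernel release string to the major.minor pair that such a check compares. It is a hypothetical illustration, not toplev's actual code; the function name `kernel_base_version` and the sample release string are assumptions chosen for the example.

```shell
#!/bin/sh
# Hypothetical illustration, not toplev's actual code: reduce a kernel
# release string to the major.minor pair a version-based check compares.
kernel_base_version() {
    rel=$1
    major=${rel%%.*}           # text before the first dot
    rest=${rel#*.}             # text after the first dot
    minor=${rest%%[!0-9]*}     # leading digits of the remainder
    echo "$major.$minor"
}

# Example release string in the style of a long term distribution kernel.
kernel_base_version "3.10.0-1160.el7.x86_64"   # prints 3.10
```

Everything after the base version, including any distribution patch level, is invisible to the comparison, so a backported driver looks the same as the original upstream release.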