Commit 535a265

Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf tools maintainership:

   - Add git information for perf-tools and perf-tools-next trees and
     branches to the MAINTAINERS file. That is where development now
     takes place and myself and Namhyung Kim have write access, more
     people to come as we emulate other maintainer groups.

  perf record:

   - Record kernel data maps when 'perf record --data' is used, so that
     global variables can be resolved and used in tools that do data
     profiling.

  perf trace:

   - Remove the old, experimental support for BPF events in which a .c
     file was passed as an event: "perf trace -e hello.c" to then get
     compiled and loaded.

     The only known usage for that, that shipped with the kernel as an
     example for such events, augmented the raw_syscalls tracepoints and
     was converted to a libbpf skeleton, reusing all the user space
     components and the BPF code connected to the syscalls.

     In the end just the way to glue the BPF part and the user space
     type beautifiers changed, now being performed by libbpf skeletons.

     The next step is to use BTF to do pretty printing of all syscall
     types, as discussed with Alan Maguire and others.

     Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
     path/filenames/strings, some of the networking data structures,
     perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
     and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
     seconds:

      # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
         0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
         9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
             ? (           ): gpm/991 ... [continued]: clock_nanosleep()) = 0
        10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
             ? (           ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
        30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
       223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
        30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
      1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
      2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
      2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
             ? (           ): crond/1172 ... [continued]: clock_nanosleep()) = 0
      3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0
      3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
      3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0
      4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
        10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0

      Performance counter stats for 'sleep 5':

             2,617,347      cycles
             1,855,997      instructions      #    0.71  insn per cycle

           5.002282128 seconds time elapsed

           0.000855000 seconds user
           0.000852000 seconds sys

  perf annotate:

   - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
     for licensing reasons, and we missed a build test on
     tools/perf/tests makefile.

     Since we now default to NDEBUG=1, we ended up segfaulting when
     building with BUILD_NONDISTRO=1 because a needed initialization
     routine was being "error checked" via an assert.

     Fix it by explicitly checking the result and aborting instead if it
     fails.

     We better back propagate the error, but at least 'perf annotate' on
     samples collected for a BPF program is back working when perf is
     built with BUILD_NONDISTRO=1.

  perf report/top:

   - Add back TUI hierarchy mode header, that is seen when using 'perf
     report/top --hierarchy'.

   - Fix the number of entries for 'e' key in the TUI that was
     preventing navigation of lines when expanding an entry.

  perf report/script:

   - Support cross platform register handling, allowing a perf.data file
     collected on one architecture to have registers sampled correctly
     displayed when analysis tools such as 'perf report' and 'perf
     script' are used on a different architecture.

   - Fix handling of event attributes in pipe mode, i.e. when one uses:

       perf record -o - | perf report -i -

     When no perf.data files are used.

   - Handle files generated via pipe mode with a version of perf and
     then read also via pipe mode with a different version of perf,
     where the event attr record may have changed, use the record size
     field to properly support this version mismatch.

  perf probe:

   - Accessing global variables from uprobes isn't supported, make the
     error message state that instead of stating that some minimal
     kernel version is needed to have that feature. This seems just a
     tool limitation, the kernel probably has all that is needed.

  perf tests:

   - Fix a reference count related leak in the dlfilter v0 API where the
     result of a thread__find_symbol_fb() is not matched with an
     addr_location__exit() to drop the reference counts of the resolved
     components (machine, thread, map, symbol, etc). Add a dlfilter test
     to make sure that doesn't regress.

   - Lots of fixes for the 'perf test' written in shell script related
     to problems found with the shellcheck utility.

   - Fixes for 'perf test' shell scripts testing features enabled when
     perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
     counters.

   - Add perf record sample filtering test, things like the following
     example, that gets implemented as a BPF filter attached to the
     event:

       # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'

   - Improve the way the task_analyzer test checks if libtraceevent is
     linked, using 'perf version --build-options' instead of the more
     expensive 'perf record -e "sched:sched_switch"'.

   - Add support for riscv in the mmap-basic test. (This went as well
     via the RiscV tree, same contents).

  libperf:

   - Implement riscv mmap support (This went as well via the RiscV tree,
     same contents).

  perf script:

   - New tool that converts perf.data files to the firefox profiler
     format so that one can use the visualizer at
     https://profiler.firefox.com/. Done by Anup Sharma as part of this
     year's Google Summer of Code.

     One can generate the output and upload it to the web interface but
     Anup also automated everything:

       perf script gecko -F 99 -a sleep 60

   - Support syscall name parsing on arm64.

   - Print "cgroup" field on the same line as "comm".

  perf bench:

   - Add new 'uprobe' benchmark to measure the overhead of uprobes
     with/without BPF programs attached to it.

   - breakpoints are not available on power9, skip that test.

  perf stat:

   - Add #num_cpus_online literal to be used in 'perf stat' metrics, and
     add this extra 'perf test' check that exemplifies its purpose:

       TEST_ASSERT_VAL("#num_cpus_online", expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
       TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
       TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);

  Miscellaneous:

   - Improve tool startup time by lazily reading PMU, JSON, sysfs data.

   - Improve error reporting in the parsing of events, passing YYLTYPE
     to error routines, so that the output can show where the parsing
     error was found.

   - Add 'perf test' entries to check the parsing of events
     improvements.

   - Fix various leaks for things detected by -fsanitize=address, mostly
     things that would be freed at tool exit, including:

       - Free evsel->filter on the destructor.

       - Allow tools to register a thread->priv destructor and use it in
         'perf trace'.

       - Free evsel->priv in 'perf trace'.

       - Free string returned by synthesize_perf_probe_point() when the
         caller fails to do all it needs.

   - Adjust various compiler options to not consider errors some
     warnings when building with broken headers found in things like
     python, flex, bison, as we otherwise build with -Werror. Some for
     gcc, some for clang, some for some specific version of those, some
     for some specific version of flex or bison, or some specific
     combination of these components, bah.

   - Allow customization of clang options for BPF target, this helps
     building on gentoo where there are other oddities where BPF targets
     gets passed some compiler options intended for the native build, so
     building with WERROR=0 helps while these oddities are fixed.

   - Don't pass ERR_PTR() values to perf_session__delete() in 'perf top'
     and 'perf lock', fixing some segfaults when handling some odd
     failures.

   - Add LTO build option.

   - Fix format of unordered lists in the perf docs
     (tools/perf/Documentation)

   - Overhaul the bison files, using constructs such as YYNOMEM.

   - Remove unused tokens from the bison .y files.

   - Add more comments to various structs.

   - A few LoongArch enablement patches.

  Vendor events (JSON):

   - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:

       EventName, BriefDescription
       visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
       visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
       op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.",
       op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.",
       op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",

   - Add AmpereOne metrics (aarch64).

   - Update N2 and V2 metrics (aarch64) and events using Arm telemetry
     repo.

   - Update scale units and descriptions of common topdown metrics on
     aarch64. Things like:

       - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
       - "BriefDescription": "Frontend bound L1 topdown metric",
       + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
       + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",

   - Update events for intel: meteorlake to 1.04, sapphirerapids to
     1.15, Icelake+ metric constraints.

   - Update files for the power10 platform"

* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
  perf parse-events: Fix driver config term
  perf parse-events: Fixes relating to no_value terms
  perf parse-events: Fix propagation of term's no_value when cloning
  perf parse-events: Name the two term enums
  perf list: Don't print Unit for "default_core"
  perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
  perf dlfilter: Avoid leak in v0 API test use of resolve_address()
  perf metric: Add #num_cpus_online literal
  perf pmu: Remove str from perf_pmu_alias
  perf parse-events: Make common term list to strbuf helper
  perf parse-events: Minor help message improvements
  perf pmu: Avoid uninitialized use of alias->str
  perf jevents: Use "default_core" for events with no Unit
  perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
  perf test shell stat_bpf_counters: Fix test on Intel
  perf test shell record_bpf_filter: Skip 6.2 kernel
  libperf: Get rid of attr.id field
  perf tools: Convert to perf_record_header_attr_id()
  libperf: Add perf_record_header_attr_id()
  perf tools: Handle old data in PERF_RECORD_ATTR
  ...
2 parents fd3a594 + 45fc462 commit 535a265
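
The 'perf stat' item in the pull message quotes a test asserting that #num_cpus is never smaller than #num_cpus_online. A rough sketch of the same invariant outside perf follows. It assumes that sysconf() with _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN is an acceptable stand-in for the two literals; perf resolves them through its own expression parser, which is not reproduced here, and all function names below are hypothetical.

```c
#include <assert.h>
#include <unistd.h>

/* CPUs configured on the system, online or not; falls back to 1 on error. */
long count_configured_cpus(void)
{
	long n = sysconf(_SC_NPROCESSORS_CONF);

	return n > 0 ? n : 1;
}

/* CPUs currently online; falls back to 1 on error. */
long count_online_cpus(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return n > 0 ? n : 1;
}

/* Same shape as the quoted perf test: configured CPUs must cover online ones. */
void check_cpu_counts(void)
{
	assert(count_configured_cpus() >= count_online_cpus());
}
```

A metric that divides by the online count rather than the configured count avoids under-reporting per-CPU values on machines with offlined CPUs, which appears to be the motivation for the separate literal.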

284 files changed

+8011
-9119
lines changed

Documentation/admin-guide/perf/alibaba_pmu.rst

+5
@@ -88,6 +88,11 @@ data bandwidth::
     -e ali_drw_27080/hif_rmw/ \
     -e ali_drw_27080/cycle/ -- sleep 10

+Example usage of counting all memory read/write bandwidth by metric::
+
+  perf stat -M ddr_read_bandwidth.all -- sleep 10
+  perf stat -M ddr_write_bandwidth.all -- sleep 10
+
 The average DRAM bandwidth can be calculated as follows:

 - Read Bandwidth = perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
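
The read bandwidth formula in the context line above can be worked through as plain arithmetic. The sketch below is only an illustration: the function name and the sample inputs are hypothetical, and the unit of the result depends on the units chosen for DDRC_WIDTH and DDRC_Freq, which this excerpt does not define.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Read Bandwidth = perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle
 *
 * perf_hif_rd and ddrc_cycle are raw counter values; ddrc_width and
 * ddrc_freq are platform constants supplied by the caller.
 */
double ddr_read_bandwidth(uint64_t perf_hif_rd, double ddrc_width,
			  double ddrc_freq, uint64_t ddrc_cycle)
{
	/* A zero cycle count means nothing was measured. */
	assert(ddrc_cycle != 0);

	return (double)perf_hif_rd * ddrc_width * ddrc_freq / (double)ddrc_cycle;
}
```

With the made-up inputs perf_hif_rd = 1000, DDRC_WIDTH = 64, DDRC_Freq = 1000000000 and DDRC_Cycle = 1000000, the function returns 64000000. If the width is given in bytes and the frequency in Hz, that value is in bytes per second.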

MAINTAINERS

+2
@@ -16763,6 +16763,8 @@ L: [email protected]
 S: Supported
 W: https://perf.wiki.kernel.org/
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/core
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git perf-tools
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git perf-tools-next
 F: arch/*/events/*
 F: arch/*/events/*/*
 F: arch/*/include/asm/perf_event.h

tools/build/Makefile.build

+10
@@ -117,6 +117,16 @@ $(OUTPUT)%.s: %.c FORCE
 	$(call rule_mkdir)
 	$(call if_changed_dep,cc_s_c)

+# bison and flex files are generated in the OUTPUT directory
+# so it needs a separate rule to depend on them properly
+$(OUTPUT)%-bison.o: $(OUTPUT)%-bison.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,$(host)cc_o_c)
+
+$(OUTPUT)%-flex.o: $(OUTPUT)%-flex.c FORCE
+	$(call rule_mkdir)
+	$(call if_changed_dep,$(host)cc_o_c)
+
 # Gather build data:
 # obj-y - list of build objects
 # subdir-y - list of directories to nest

tools/build/feature/Makefile

+4-6
@@ -340,25 +340,23 @@ $(OUTPUT)test-jvmti-cmlr.bin:
 	$(BUILD)

 $(OUTPUT)test-llvm.bin:
-	$(BUILDXX) -std=gnu++14 \
+	$(BUILDXX) -std=gnu++17 \
 		-I$(shell $(LLVM_CONFIG) --includedir) \
 		-L$(shell $(LLVM_CONFIG) --libdir) \
 		$(shell $(LLVM_CONFIG) --libs Core BPF) \
 		$(shell $(LLVM_CONFIG) --system-libs) \
 		> $(@:.bin=.make.output) 2>&1

 $(OUTPUT)test-llvm-version.bin:
-	$(BUILDXX) -std=gnu++14 \
+	$(BUILDXX) -std=gnu++17 \
 		-I$(shell $(LLVM_CONFIG) --includedir) \
 		> $(@:.bin=.make.output) 2>&1

 $(OUTPUT)test-clang.bin:
-	$(BUILDXX) -std=gnu++14 \
+	$(BUILDXX) -std=gnu++17 \
 		-I$(shell $(LLVM_CONFIG) --includedir) \
 		-L$(shell $(LLVM_CONFIG) --libdir) \
-		-Wl,--start-group -lclangBasic -lclangDriver \
-		-lclangFrontend -lclangEdit -lclangLex \
-		-lclangAST -Wl,--end-group \
+		-Wl,--start-group -lclang-cpp -Wl,--end-group \
 		$(shell $(LLVM_CONFIG) --libs Core option) \
 		$(shell $(LLVM_CONFIG) --system-libs) \
 		> $(@:.bin=.make.output) 2>&1

tools/build/feature/test-clang.cpp

-28
This file was deleted.

tools/build/feature/test-cxx.cpp

-16
This file was deleted.

tools/build/feature/test-llvm-version.cpp

-12
This file was deleted.

tools/build/feature/test-llvm.cpp

-14
This file was deleted.

tools/lib/perf/include/perf/event.h

+12-2
@@ -148,8 +148,18 @@ struct perf_record_switch {
 struct perf_record_header_attr {
 	struct perf_event_header header;
 	struct perf_event_attr attr;
-	__u64 id[];
-};
+	/*
+	 * Array of u64 id follows here but we cannot use a flexible array
+	 * because size of attr in the data can be different then current
+	 * version. Please use perf_record_header_attr_id() below.
+	 *
+	 * __u64 id[]; // do not use this
+	 */
+};
+
+/* Returns the pointer to id array based on the actual attr size. */
+#define perf_record_header_attr_id(evt) \
+	((void *)&(evt)->attr.attr + (evt)->attr.attr.size)

 enum {
 	PERF_CPU_MAP__CPUS = 0,
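
The reason the flexible array had to go is that the id array starts at attr.size bytes past the start of attr, and attr.size is whatever the producing perf wrote, not sizeof() in the reader. The following sketch shows that offset computation on deliberately simplified stand-in structures. Every mock_* name is hypothetical and the real perf_event_attr is much larger. It also uses byte pointers and memcpy() instead of the GNU void-pointer arithmetic used by the macro in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins, NOT the real perf structures. */
struct mock_header { uint32_t type; uint16_t misc; uint16_t size; };
struct mock_attr { uint32_t type; uint32_t size; uint64_t config; };
struct mock_record_attr {
	struct mock_header header;
	struct mock_attr attr;
	/* ids follow at attr.size bytes past the start of attr */
};

/* Byte offset of the id array from the start of the record. */
size_t mock_id_offset(const struct mock_record_attr *rec)
{
	return offsetof(struct mock_record_attr, attr) + rec->attr.size;
}

/* Read the idx-th id; memcpy avoids alignment and aliasing assumptions. */
uint64_t mock_read_id(const struct mock_record_attr *rec, size_t idx)
{
	uint64_t id;

	memcpy(&id, (const unsigned char *)rec + mock_id_offset(rec) +
	       idx * sizeof(id), sizeof(id));
	return id;
}

/*
 * Lay out a record as a producer whose attr occupies attr_size bytes,
 * place one id right after it, and read it back as a consumer would.
 */
uint64_t roundtrip_first_id(uint32_t attr_size, uint64_t id)
{
	union {
		struct mock_record_attr rec;
		unsigned char bytes[128];
	} buf;
	size_t off = offsetof(struct mock_record_attr, attr) + attr_size;

	assert(attr_size >= sizeof(struct mock_attr));
	assert(off + sizeof(id) <= sizeof(buf.bytes));

	memset(&buf, 0, sizeof(buf));
	buf.rec.attr.size = attr_size;
	memcpy(buf.bytes + off, &id, sizeof(id));

	return mock_read_id(&buf.rec, 0);
}
```

Calling roundtrip_first_id() with an attr_size larger than sizeof(struct mock_attr) models a file written by a newer tool: a fixed-offset flexible array would read the wrong bytes, while the size-based offset still finds the id.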

tools/perf/Documentation/perf-bench.txt

+3
@@ -67,6 +67,9 @@ SUBSYSTEM
 'internals'::
 	Benchmark internal perf functionality.

+'uprobe'::
+	Benchmark overhead of uprobe + BPF.
+
 'all'::
 	All benchmark subsystems.

tools/perf/Documentation/perf-config.txt

-33
@@ -125,9 +125,6 @@ Given a $HOME/.perfconfig like this:
 		group = true
 		skip-empty = true

-	[llvm]
-		dump-obj = true
-		clang-opt = -g

 You can hide source code of annotate feature setting the config to false with

@@ -657,36 +654,6 @@ ftrace.*::
 		-F option is not specified. Possible values are 'function' and
 		'function_graph'.

-llvm.*::
-	llvm.clang-path::
-		Path to clang. If omit, search it from $PATH.
-
-	llvm.clang-bpf-cmd-template::
-		Cmdline template. Below lines show its default value. Environment
-		variable is used to pass options.
-		"$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\
-		"-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \
-		"$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \
-		"-Wno-unused-value -Wno-pointer-sign " \
-		"-working-directory $WORKING_DIR " \
-		"-c \"$CLANG_SOURCE\" --target=bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE"
-
-	llvm.clang-opt::
-		Options passed to clang.
-
-	llvm.kbuild-dir::
-		kbuild directory. If not set, use /lib/modules/`uname -r`/build.
-		If set to "" deliberately, skip kernel header auto-detector.
-
-	llvm.kbuild-opts::
-		Options passed to 'make' when detecting kernel header options.
-
-	llvm.dump-obj::
-		Enable perf dump BPF object files compiled by LLVM.
-
-	llvm.opts::
-		Options passed to llc.
-
 samples.*::

 	samples.context::

tools/perf/Documentation/perf-dlfilter.txt

+20-2
@@ -64,6 +64,12 @@ internal filtering.
 If implemented, 'filter_description' should return a one-line description
 of the filter, and optionally a longer description.

+Do not assume the 'sample' argument is valid (dereferenceable)
+after 'filter_event' and 'filter_event_early' return.
+
+Do not assume data referenced by pointers in struct perf_dlfilter_sample
+is valid (dereferenceable) after 'filter_event' and 'filter_event_early' return.
+
 The perf_dlfilter_sample structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -150,7 +156,8 @@ struct perf_dlfilter_fns {
 	const char *(*srcline)(void *ctx, __u32 *line_number);
 	struct perf_event_attr *(*attr)(void *ctx);
 	__s32 (*object_code)(void *ctx, __u64 ip, void *buf, __u32 len);
-	void *(*reserved[120])(void *);
+	void (*al_cleanup)(void *ctx, struct perf_dlfilter_al *al);
+	void *(*reserved[119])(void *);
 };
 ----

@@ -161,7 +168,8 @@ struct perf_dlfilter_fns {
 'args' returns arguments from --dlarg options.

 'resolve_address' provides information about 'address'. al->size must be set
-before calling. Returns 0 on success, -1 otherwise.
+before calling. Returns 0 on success, -1 otherwise. Call al_cleanup() (if present,
+see below) when 'al' data is no longer needed.

 'insn' returns instruction bytes and length.

@@ -171,6 +179,12 @@ before calling. Returns 0 on success, -1 otherwise.

 'object_code' reads object code and returns the number of bytes read.

+'al_cleanup' must be called (if present, so check perf_dlfilter_fns.al_cleanup != NULL)
+after resolve_address() to free any associated resources.
+
+Do not assume pointers obtained via perf_dlfilter_fns are valid (dereferenceable)
+after 'filter_event' and 'filter_event_early' return.
+
 The perf_dlfilter_al structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -197,9 +211,13 @@ struct perf_dlfilter_al {
 	/* Below members are only populated by resolve_ip() */
 	__u8 filtered; /* true if this sample event will be filtered out */
 	const char *comm;
+	void *priv; /* Private data. Do not change */
 };
 ----

+Do not assume data referenced by pointers in struct perf_dlfilter_al
+is valid (dereferenceable) after 'filter_event' and 'filter_event_early' return.
+
 perf_dlfilter_sample flags
 ~~~~~~~~~~~~~~~~~~~~~~~~~~

tools/perf/Documentation/perf-ftrace.txt

+9-7
@@ -96,8 +96,9 @@ OPTIONS for 'perf ftrace trace'

 --func-opts::
 	List of options allowed to set:
-	call-graph - Display kernel stack trace for function tracer.
-	irq-info - Display irq context info for function tracer.
+
+	- call-graph - Display kernel stack trace for function tracer.
+	- irq-info - Display irq context info for function tracer.

 -G::
 --graph-funcs=::
@@ -118,11 +119,12 @@ OPTIONS for 'perf ftrace trace'

 --graph-opts::
 	List of options allowed to set:
-	nosleep-time - Measure on-CPU time only for function_graph tracer.
-	noirqs - Ignore functions that happen inside interrupt.
-	verbose - Show process names, PIDs, timestamps, etc.
-	thresh=<n> - Setup trace duration threshold in microseconds.
-	depth=<n> - Set max depth for function graph tracer to follow.
+
+	- nosleep-time - Measure on-CPU time only for function_graph tracer.
+	- noirqs - Ignore functions that happen inside interrupt.
+	- verbose - Show process names, PIDs, timestamps, etc.
+	- thresh=<n> - Setup trace duration threshold in microseconds.
+	- depth=<n> - Set max depth for function graph tracer to follow.


 OPTIONS for 'perf ftrace latency'
