Commit 314194e

rustc-book: Add documentation on how to use PGO.
1 parent dbec74f commit 314194e

3 files changed

+154
-0
lines changed

src/doc/rustc/src/SUMMARY.md

+1
Original file line numberDiff line numberDiff line change
@@ -13,5 +13,6 @@
1313
- [Targets](targets/index.md)
1414
- [Built-in Targets](targets/built-in.md)
1515
- [Custom Targets](targets/custom.md)
16+
- [Profile-guided Optimization](profile-guided-optimization.md)
1617
- [Linker-plugin based LTO](linker-plugin-lto.md)
1718
- [Contributing to `rustc`](contributing.md)

src/doc/rustc/src/codegen-options/index.md

+17
Original file line numberDiff line numberDiff line change
@@ -214,3 +214,20 @@ This option lets you control what happens when the code panics.
214214
## incremental
215215

216216
This flag allows you to enable incremental compilation.
217+
218+
## profile-generate
219+
220+
This flag allows for creating instrumented binaries that will collect
221+
profiling data for use with profile-guided optimization (PGO). The flag takes
222+
an optional argument which is the path to a directory into which the
223+
instrumented binary will emit the collected data. See the chapter on
224+
[profile-guided optimization](profile-guided-optimization.html) for more
225+
information.
226+
227+
## profile-use
228+
229+
This flag specifies the profiling data file to be used for profile-guided
230+
optimization (PGO). The flag takes a mandatory argument which is the path
231+
to a valid `.profdata` file. See the chapter on
232+
[profile-guided optimization](profile-guided-optimization.html) for more
233+
information.

src/doc/rustc/src/profile-guided-optimization.md

+136
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,136 @@
1+
# Profile-guided Optimization
2+
3+
`rustc` supports doing profile-guided optimization (PGO).
4+
This chapter describes what PGO is, what it is good for, and how it can be used.
5+
6+
## What Is Profile-Guided Optimization?
7+
8+
The basic concept of PGO is to collect data about the typical execution of
9+
a program (e.g. which branches it is likely to take) and then use this data
10+
to inform optimizations such as inlining, machine-code layout,
11+
register allocation, etc.
12+
13+
There are different ways of collecting data about a program's execution.
14+
One is to run the program inside a profiler (such as `perf`) and another
15+
is to create an instrumented binary, that is, a binary that has data
16+
collection built into it, and run that.
17+
The latter usually provides more accurate data and it is also what is
18+
supported by `rustc`.
19+
20+
## Usage
21+
22+
Generating a PGO-optimized program involves following a workflow with four steps:
23+
24+
1. Compile the program with instrumentation enabled
25+
(e.g. `rustc -Cprofile-generate=/tmp/pgo-data main.rs`)
26+
2. Run the instrumented program (e.g. `./main`) which generates a
27+
`default_<id>.profraw` file
28+
3. Convert the `.profraw` file into a `.profdata` file using
29+
LLVM's `llvm-profdata` tool
30+
4. Compile the program again, this time making use of the profiling data
31+
(for example `rustc -Cprofile-use=merged.profdata main.rs`)
32+
33+
An instrumented program will create one or more `.profraw` files, one for each
34+
instrumented binary. E.g. an instrumented executable that loads two instrumented
35+
dynamic libraries at runtime will generate three `.profraw` files. Running an
36+
instrumented binary multiple times, on the other hand, will re-use the
37+
respective `.profraw` files, updating them in place.
38+
39+
These `.profraw` files have to be post-processed before they can be fed back
40+
into the compiler. This is done by the `llvm-profdata` tool. This tool
41+
is most easily installed via
42+
43+
```bash
44+
rustup component add llvm-tools-preview
45+
```
46+
47+
Note that installing the `llvm-tools-preview` component won't add
48+
`llvm-profdata` to the `PATH`. Rather, the tool can be found in:
49+
50+
```bash
51+
~/.rustup/toolchains/<toolchain>/lib/rustlib/<target-triple>/bin/
52+
```
53+
54+
Alternatively, an `llvm-profdata` coming with a recent LLVM or Clang
55+
version usually works too.
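
As a rough sketch of how the directory above can be resolved without typing the toolchain name by hand, the snippet below builds the path from the sysroot and host triple that `rustc` reports. The helper name `profdata_path` is made up for illustration, and the snippet assumes a `rustup`-managed toolchain with `llvm-tools-preview` installed; on Windows the binary would carry an `.exe` suffix.

```bash
# Hypothetical helper: build the expected location of `llvm-profdata`
# from a toolchain sysroot ($1) and a target triple ($2).
profdata_path() {
    printf '%s/lib/rustlib/%s/bin/llvm-profdata\n' "$1" "$2"
}

# Only query `rustc` if it is actually installed.
if command -v rustc >/dev/null 2>&1; then
    # The sysroot is the toolchain directory, e.g. ~/.rustup/toolchains/<toolchain>
    sysroot=$(rustc --print sysroot)
    # `rustc -vV` prints a line of the form `host: <target-triple>`
    host=$(rustc -vV | sed -n 's/^host: //p')
    profdata_path "$sysroot" "$host"
fi
```

The printed path can then be invoked directly or added to `PATH` for the current shell session.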
56+
57+
The `llvm-profdata` tool merges multiple `.profraw` files into a single
58+
`.profdata` file that can then be fed back into the compiler via
59+
`-Cprofile-use`:
60+
61+
```bash
62+
# STEP 1: Compile the binary with instrumentation
63+
rustc -Cprofile-generate=/tmp/pgo-data -O ./main.rs
64+
65+
# STEP 2: Run the binary a few times, maybe with common sets of args.
66+
# Each run will create or update `.profraw` files in /tmp/pgo-data
67+
./main mydata1.csv
68+
./main mydata2.csv
69+
./main mydata3.csv
70+
71+
# STEP 3: Merge and post-process all the `.profraw` files in /tmp/pgo-data
72+
llvm-profdata merge -o ./merged.profdata /tmp/pgo-data
73+
74+
# STEP 4: Use the merged `.profdata` file during optimization. Apart from the
75+
# PGO flag, all other `rustc` flags have to be the same as in STEP 1.
76+
rustc -Cprofile-use=./merged.profdata -O ./main.rs
77+
```
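
Between STEP 2 and STEP 3 it can be useful to confirm that the instrumented runs actually wrote profile data before merging. The following is a minimal sketch of such a check; `count_profraw` and `require_profraw` are hypothetical helper names, not part of `rustc` or LLVM, and the sketch assumes a `find` that supports `-maxdepth` (GNU and BSD `find` both do).

```bash
# Hypothetical helper: count the `.profraw` files directly inside directory $1.
count_profraw() {
    find "$1" -maxdepth 1 -type f -name '*.profraw' 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical guard: fail with a message if directory $1 holds no profile data.
require_profraw() {
    if [ "$(count_profraw "$1")" -eq 0 ]; then
        echo "no .profraw files in $1; run the instrumented binary first" >&2
        return 1
    fi
}
```

A call such as `require_profraw /tmp/pgo-data && llvm-profdata merge -o ./merged.profdata /tmp/pgo-data` would then skip the merge when no data has been collected.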
78+
79+
### A Complete Cargo Workflow
80+
81+
Using this feature with Cargo works very similarly to using it with `rustc`
82+
directly. Again, we generate an instrumented binary, run it to produce data,
83+
merge the data, and feed it back into the compiler. Some things of note:
84+
85+
- We use the `RUSTFLAGS` environment variable in order to pass the PGO compiler
86+
flags to the compilation of all crates in the program.
87+
88+
- We pass the `--target` flag to Cargo, which prevents the `RUSTFLAGS`
89+
arguments from being passed to Cargo build scripts. We don't want the build
90+
scripts to generate a bunch of `.profraw` files.
91+
92+
- We pass `--release` to Cargo because that's where PGO makes the most sense.
93+
In theory, PGO can also be done on debug builds but there is little reason
94+
to do so.
95+
96+
- It is recommended to use *absolute paths* for the argument of
97+
`-Cprofile-generate` and `-Cprofile-use`. Cargo can invoke `rustc` with
98+
varying working directories, meaning that `rustc` may not be able to find
99+
a `.profdata` file given as a relative path. With absolute paths this is not an issue.
100+
101+
- It is good practice to make sure that there is no left-over profiling data
102+
from previous compilation sessions. Just deleting the directory is a simple
103+
way of doing so (see `STEP 0` below).
104+
105+
This is what the entire workflow looks like:
106+
107+
```bash
108+
# STEP 0: Make sure there is no left-over profiling data from previous runs
109+
rm -rf /tmp/pgo-data
110+
111+
# STEP 1: Build the instrumented binaries
112+
RUSTFLAGS="-Cprofile-generate=/tmp/pgo-data" \
113+
cargo build --release --target=x86_64-unknown-linux-gnu
114+
115+
# STEP 2: Run the instrumented binaries with some typical data
116+
./target/x86_64-unknown-linux-gnu/release/myprogram mydata1.csv
117+
./target/x86_64-unknown-linux-gnu/release/myprogram mydata2.csv
118+
./target/x86_64-unknown-linux-gnu/release/myprogram mydata3.csv
119+
120+
# STEP 3: Merge the `.profraw` files into a `.profdata` file
121+
llvm-profdata merge -o /tmp/pgo-data/merged.profdata /tmp/pgo-data
122+
123+
# STEP 4: Use the `.profdata` file for guiding optimizations
124+
RUSTFLAGS="-Cprofile-use=/tmp/pgo-data/merged.profdata" \
125+
cargo build --release --target=x86_64-unknown-linux-gnu
126+
```
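
The recommendation to use absolute paths can be sketched as a small pair of shell helpers that build the `RUSTFLAGS` value for either phase. The names `abs_path` and `pgo_rustflags` are hypothetical, and the file name `merged.profdata` simply follows the workflow above; treat this as an illustration rather than an established convention.

```bash
# Hypothetical helper: make $1 absolute relative to the current directory.
# The path does not need to exist yet.
abs_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "$(pwd)" "$1" ;;
    esac
}

# Hypothetical helper: print the PGO flag for a phase ($1: generate|use)
# and a profile data directory ($2), always using an absolute path.
pgo_rustflags() {
    dir=$(abs_path "$2")
    case "$1" in
        generate) printf '%s\n' "-Cprofile-generate=$dir" ;;
        use)      printf '%s\n' "-Cprofile-use=$dir/merged.profdata" ;;
        *)        echo "usage: pgo_rustflags generate|use <dir>" >&2; return 1 ;;
    esac
}
```

With these in place, STEP 1 could be written as `RUSTFLAGS="$(pgo_rustflags generate pgo-data)" cargo build --release --target=x86_64-unknown-linux-gnu`, and the flag stays valid regardless of the working directory in which Cargo invokes `rustc`.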
127+
128+
## Further Reading
129+
130+
`rustc`'s PGO support relies entirely on LLVM's implementation of the feature
131+
and is equivalent to what Clang offers via the `-fprofile-generate` /
132+
`-fprofile-use` flags. The [Profile Guided Optimization][clang-pgo] section
133+
in Clang's documentation is therefore an interesting read for anyone who wants
134+
to use PGO with Rust.
135+
136+
[clang-pgo]: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
