Commit 27e6f17

Merge pull request #22775 from JuliaLang/teh/docs_inlining
Add NEWS and docs for the new inlining algorithm
2 parents 7c31c41 + 70dfb6d commit 27e6f17

3 files changed, +115 -0 lines changed

NEWS.md

+7
@@ -103,6 +103,11 @@ Library improvements
 Compiler/Runtime improvements
 -----------------------------

+  * The inlining heuristic now models the approximate runtime cost of
+    a method (using some strongly-simplifying assumptions). Functions
+    are inlined unless their estimated runtime cost substantially
+    exceeds the cost of setting up and issuing a subroutine
+    call. ([#22210], [#22732])

 Deprecated or removed
 ---------------------
@@ -968,8 +973,10 @@ Command-line option changes
 [#22182]: https://github.com/JuliaLang/julia/issues/22182
 [#22187]: https://github.com/JuliaLang/julia/issues/22187
 [#22188]: https://github.com/JuliaLang/julia/issues/22188
+[#22210]: https://github.com/JuliaLang/julia/issues/22210
 [#22224]: https://github.com/JuliaLang/julia/issues/22224
 [#22228]: https://github.com/JuliaLang/julia/issues/22228
 [#22245]: https://github.com/JuliaLang/julia/issues/22245
 [#22310]: https://github.com/JuliaLang/julia/issues/22310
 [#22523]: https://github.com/JuliaLang/julia/issues/22523
+[#22732]: https://github.com/JuliaLang/julia/issues/22732

doc/src/devdocs/inference.md

+107
@@ -0,0 +1,107 @@
# Inference

## How inference works

[Type inference](https://en.wikipedia.org/wiki/Type_inference) refers
to the process of deducing the types of later values from the types of
input values. Julia's approach to inference has been described in blog
posts
([1](https://juliacomputing.com/blog/2016/04/04/inference-convergence.html),
[2](https://juliacomputing.com/blog/2017/05/15/inference-converage2.html)).

## Debugging inference.jl

You can start a Julia session, edit `inference.jl` (for example, to
insert `print` statements), and then replace `Core.Inference` in your
running session by navigating to `base/` and executing
`include("coreimg.jl")`. This trick typically makes development much
faster than rebuilding Julia after each change.

A convenient entry point into inference is `typeinf_code`. Here's a
demo running inference on `convert(Int, UInt(1))`:

```julia
# Get the method
atypes = Tuple{Type{Int}, UInt}  # argument types
mths = methods(convert, atypes)  # worth checking that there is only one
m = first(mths)

# Create variables needed to call `typeinf_code`
params = Core.Inference.InferenceParams(typemax(UInt))  # parameter is the world age,
                                                        #   typemax(UInt) -> most recent
sparams = Core.svec()      # this particular method doesn't have type-parameters
optimize = true            # run all inference optimizations
cached = false             # force inference to happen (do not use cached results)
Core.Inference.typeinf_code(m, atypes, sparams, optimize, cached, params)
```

If your debugging adventures require a `MethodInstance`, you can look it up by
calling `Core.Inference.code_for_method` using many of the variables above.
A `CodeInfo` object may be obtained with
```julia
# Returns the CodeInfo object for `convert(Int, ::UInt)`:
ci = (@code_typed convert(Int, UInt(1)))[1]
```

## The inlining algorithm (inline_worthy)

Much of the hardest work for inlining is done in
`inlining_pass`. However, if your question is "why didn't my function
inline?" then you will most likely be interested in `isinlineable` and
its primary callee, `inline_worthy`. `isinlineable` handles a number
of special cases (e.g., critical functions like `next` and `done`,
a bonus for functions that return tuples, etc.). The
main decision-making happens in `inline_worthy`, which returns `true`
if the function should be inlined.

`inline_worthy` implements a cost model in which "cheap" functions get
inlined; more specifically, we inline a function if its anticipated
run time is not large compared to the time it would take to
[issue a call](https://en.wikipedia.org/wiki/Calling_convention) to
it if it were not inlined. The cost model is extremely simple and
ignores many important details: for example, all `for` loops are
analyzed as if they will be executed once, and the cost of an
`if...else...end` includes the summed cost of all branches. It's also
worth acknowledging that we currently lack a suite of functions
suitable for testing how well the cost model predicts the actual
run-time cost, although
[BaseBenchmarks](https://github.com/JuliaCI/BaseBenchmarks.jl)
provides a great deal of indirect information about the successes and
failures of any modification to the inlining algorithm.

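The two simplifications named above (loops counted once, all branches summed) can be illustrated with a toy model. This is only a sketch under stated assumptions: it walks surface-syntax `Expr` objects rather than the lowered code that inference actually sees, and the names `TOY_COSTS` and `toy_cost` and every number in the table are hypothetical.

```julia
# Hypothetical per-operation costs, in arbitrary "cycles". These numbers are
# made up for illustration and are not Julia's real lookup table.
const TOY_COSTS = Dict(:+ => 1, :- => 1, :* => 4, :/ => 20)

# Anything that is not an expression (literals, symbols, line-number nodes)
# is treated as free in this toy model.
toy_cost(x) = 0

function toy_cost(ex::Expr)
    total = 0
    if ex.head === :call
        # Cost of the operation itself; unknown operations count as 0 here.
        total += get(TOY_COSTS, ex.args[1], 0)
        args = ex.args[2:end]
    else
        # Loops, branches, blocks, assignments: visit every sub-expression once.
        args = ex.args
    end
    for a in args
        total += toy_cost(a)
    end
    return total
end

body = :(x = x + a * b)
loop = quote
    for i in 1:100
        x = x + a * b
    end
end
branch = quote
    if c
        x / y
    else
        x * y
    end
end

toy_cost(body)    # 5: one `+` and one `*`
toy_cost(loop)    # 5: the trip count of 100 is ignored
toy_cost(branch)  # 24: both branches are summed (20 + 4)
```

In this toy, the loop costs exactly as much as one pass through its body, and the branch costs the sum of both arms even though only one of them would run.
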
The foundation of the cost model is a lookup table, implemented in
`add_tfunc` and its callers, that assigns an estimated cost (measured
in CPU cycles) to each of Julia's intrinsic functions. These costs are
based on
[standard ranges for common architectures](http://ithare.com/wp-content/uploads/part101_infographics_v08.png)
(see
[Agner Fog's analysis](http://www.agner.org/optimize/instruction_tables.pdf)
for more detail).

We supplement this low-level lookup table with a number of special
cases. For example, an `:invoke` expression (a call for which all
input and output types were inferred in advance) is assigned a fixed
cost (currently 20 cycles). In contrast, a `:call` expression, for
functions other than intrinsics/builtins, indicates that the call will
require dynamic dispatch, in which case we assign a cost set by
`InferenceParams.inline_nonleaf_penalty` (currently set at 1000). Note
that this is not a "first-principles" estimate of the raw cost of
dynamic dispatch, but merely a heuristic indicating that dynamic
dispatch is extremely expensive.

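As a rough illustration of these special cases, the sketch below dispatches on the expression head. It is not the code in `inference.jl`: the function `toy_call_cost`, its keyword arguments, and the constant names are hypothetical, and only the values 20 and 1000 come from the text above.

```julia
# Values quoted in the text above; the constant names are hypothetical.
const TOY_INVOKE_COST = 20
const TOY_NONLEAF_PENALTY = 1000

# `is_builtin` stands in for whatever check identifies an intrinsic/builtin,
# and `table_cost` stands in for that function's lookup-table entry.
function toy_call_cost(ex::Expr; is_builtin::Bool=false, table_cost::Int=0)
    if ex.head === :invoke
        # Fully inferred call: fixed cost.
        return TOY_INVOKE_COST
    elseif ex.head === :call
        # Intrinsics/builtins use the table; anything else is assumed to
        # need dynamic dispatch and gets the large penalty.
        return is_builtin ? table_cost : TOY_NONLEAF_PENALTY
    end
    return 0  # other expression kinds are ignored in this sketch
end

toy_call_cost(Expr(:invoke, :mi, :f, 1))                                   # 20
toy_call_cost(Expr(:call, :f, 1))                                          # 1000
toy_call_cost(Expr(:call, :add_int, 1, 2); is_builtin=true, table_cost=1)  # 1
```
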
Each statement gets analyzed for its total cost in a function called
`statement_cost`. You can run this yourself by following this example:

```julia
params = Core.Inference.InferenceParams(typemax(UInt))
# Get the CodeInfo object
ci = (@code_typed fill(3, (5, 5)))[1]  # we'll try this on the code for `fill(3, (5, 5))`
# Calculate cost of each statement
cost(stmt) = Core.Inference.statement_cost(stmt, ci, Base, params)
cst = map(cost, ci.code)
```

The output is a `Vector{Int}` holding the estimated cost of each
statement in `ci.code`. Note that `ci` includes the consequences of
inlining callees, and consequently the costs do too.
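
One way such a vector of per-statement costs might feed an inlining decision is to accumulate it and compare against a cutoff. The sketch below is an assumption-laden illustration, not `inline_worthy` itself: the function `toy_inline_worthy` and the cutoff of 100 used in the examples are hypothetical.

```julia
# Returns `true` if the summed cost stays at or below `threshold`.
# `threshold` is a hypothetical cutoff chosen by the caller.
function toy_inline_worthy(costs::Vector{Int}, threshold::Int)
    total = 0
    for c in costs
        total += c
        # Stop early: once over the cutoff, later statements cannot help.
        total > threshold && return false
    end
    return true
end

toy_inline_worthy([1, 4, 20], 100)    # true: total is 25
toy_inline_worthy([1, 1000, 4], 100)  # false: a single dynamic dispatch dominates
```

With the penalty of 1000 described earlier, a single dynamically dispatched call is enough to push a function over any cutoff of this size.
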

doc/src/index.md

+1
@@ -90,6 +90,7 @@
 * [Arrays with custom indices](@ref)
 * [Base.LibGit2](@ref)
 * [Module loading](@ref)
+* [Inference](@ref)
 * Developing/debugging Julia's C code
 * [Reporting and analyzing crashes (segfaults)](@ref)
 * [gdb debugging tips](@ref)
