# Inference

## How inference works

[Type inference](https://en.wikipedia.org/wiki/Type_inference) refers
to the process of deducing the types of later values from the types of
input values. Julia's approach to inference has been described in blog
posts
([1](https://juliacomputing.com/blog/2016/04/04/inference-convergence.html),
[2](https://juliacomputing.com/blog/2017/05/15/inference-converage2.html)).

## Debugging inference.jl

You can start a Julia session, edit `inference.jl` (for example to
insert `print` statements), and then replace `Core.Inference` in your
running session by navigating to `base/` and executing
`include("coreimg.jl")`. This trick typically leads to much faster
development than if you rebuild Julia for each change.
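
Concretely, the reload step looks something like this (a minimal sketch; the `cd` path assumes you start from the root of a Julia source checkout):

```julia
# After editing base/inference.jl, from the root of your Julia source tree:
cd("base")
include("coreimg.jl")   # replaces Core.Inference in the running session
```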

A convenient entry point into inference is `typeinf_code`. Here's a
demo running inference on `convert(Int, UInt(1))`:

```julia
# Get the method
atypes = Tuple{Type{Int}, UInt} # argument types
mths = methods(convert, atypes) # worth checking that there is only one
m = first(mths)

# Create variables needed to call `typeinf_code`
params = Core.Inference.InferenceParams(typemax(UInt)) # parameter is the world age,
                                                       # typemax(UInt) -> most recent
sparams = Core.svec()     # this particular method doesn't have type-parameters
optimize = true           # run all inference optimizations
cached = false            # force inference to happen (do not use cached results)
Core.Inference.typeinf_code(m, atypes, sparams, optimize, cached, params)
```

If your debugging adventures require a `MethodInstance`, you can look it up by
calling `Core.Inference.code_for_method` using many of the variables above.
A `CodeInfo` object may be obtained with

```julia
# Returns the CodeInfo object for `convert(Int, ::UInt)`:
ci = (@code_typed convert(Int, UInt(1)))[1]
```
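
For the `MethodInstance` lookup mentioned above, something along these lines should work; the exact argument list of `code_for_method` is an assumption here, so verify it against its definition in `inference.jl` if the call errors:

```julia
# Sketch of the MethodInstance lookup; the argument order of
# `code_for_method` is an assumption -- check base/inference.jl.
world = typemax(UInt)   # most recent world age, matching `params` above
mi = Core.Inference.code_for_method(m, atypes, sparams, world)
```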

## The inlining algorithm (`inline_worthy`)

Much of the hardest work for inlining runs in
`inlining_pass`. However, if your question is "why didn't my function
inline?" then you will most likely be interested in `isinlineable` and
its primary callee, `inline_worthy`. `isinlineable` handles a number
of special cases (e.g., critical functions like `next` and `done`,
incorporating a bonus for functions that return tuples, etc.). The
main decision-making happens in `inline_worthy`, which returns `true`
if the function should be inlined.
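
One quick way to see the outcome of these decisions is to inspect the typed code of a caller; a small sketch with hypothetical functions `twice` and `addtwice`:

```julia
# Hypothetical example functions, defined only for this demo.
twice(x) = 2x
addtwice(x) = twice(x) + twice(x)

ci = (@code_typed addtwice(3))[1]
# If `twice` was judged inline-worthy, its multiplication appears directly in
# the statements below; otherwise you'll see an `:invoke` of `twice` instead.
ci.code
```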

`inline_worthy` implements a cost-model, where "cheap" functions get
inlined; more specifically, we inline functions if their anticipated
run-time is not large compared to the time it would take to
[issue a call](https://en.wikipedia.org/wiki/Calling_convention) to
them if they were not inlined. The cost-model is extremely simple and
ignores many important details: for example, all `for` loops are
analyzed as if they will be executed once, and the cost of an
`if...else...end` includes the summed cost of all branches. It's also
worth acknowledging that we currently lack a suite of functions
suitable for testing how well the cost model predicts the actual
run-time cost, although
[BaseBenchmarks](https://github.com/JuliaCI/BaseBenchmarks.jl)
provides a great deal of indirect information about the successes and
failures of any modification to the inlining algorithm.

The foundation of the cost-model is a lookup table, implemented in
`add_tfunc` and its callers, that assigns an estimated cost (measured
in CPU cycles) to each of Julia's intrinsic functions. These costs are
based on
[standard ranges for common architectures](http://ithare.com/wp-content/uploads/part101_infographics_v08.png)
(see
[Agner Fog's analysis](http://www.agner.org/optimize/instruction_tables.pdf)
for more detail).

We supplement this low-level lookup table with a number of special
cases. For example, an `:invoke` expression (a call for which all
input and output types were inferred in advance) is assigned a fixed
cost (currently 20 cycles). In contrast, a `:call` expression, for
functions other than intrinsics/builtins, indicates that the call will
require dynamic dispatch, in which case we assign a cost set by
`InferenceParams.inline_nonleaf_penalty` (currently set at 1000). Note
that this is not a "first-principles" estimate of the raw cost of
dynamic dispatch, but a mere heuristic indicating that dynamic
dispatch is extremely expensive.
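
These penalties live in `InferenceParams`, so you can inspect them (or construct modified parameters to experiment with) directly; treat the field access below as an assumption about the current struct layout:

```julia
params = Core.Inference.InferenceParams(typemax(UInt))
# The field name is an assumption based on the description above; check the
# InferenceParams definition in base/inference.jl if it has been renamed.
params.inline_nonleaf_penalty   # cost charged for a dynamically-dispatched :call
fieldnames(typeof(params))      # list the other tunable knobs
```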

Each statement gets analyzed for its total cost in a function called
`statement_cost`. You can run this yourself by following this example:

```julia
params = Core.Inference.InferenceParams(typemax(UInt))
# Get the CodeInfo object
ci = (@code_typed fill(3, (5, 5)))[1]  # we'll try this on the code for `fill(3, (5, 5))`
# Calculate cost of each statement
cost(stmt) = Core.Inference.statement_cost(stmt, ci, Base, params)
cst = map(cost, ci.code)
```

The output is a `Vector{Int}` holding the estimated cost of each
statement in `ci.code`. Note that `ci` includes the consequences of
inlining callees, and consequently the costs do too.
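
To see each estimated cost next to the statement it belongs to, a quick follow-up sketch:

```julia
# Print each statement alongside its estimated cost.
for (c, stmt) in zip(cst, ci.code)
    println(rpad(string(c), 6), stmt)
end
```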