Cache the results of the monomorphisation call resolver #79
To cut a long story short, in order to determine static call targets (where possible) it is necessary to call Instance::resolve(). However, only the monomorphisation stage is permitted to do so. This change caches the results of resolutions so that we can use them during SIR lowering. We also utilise the new CallOperand::Virtual variant for calls which cannot be statically known at compile time. Discussion here: https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Calls.20in.20MIR
@@ -727,6 +734,10 @@ fn visit_fn_use<'tcx>(
        ty::Instance::resolve_for_fn_ptr
    };
    let instance = resolver(tcx, ty::ParamEnv::reveal_all(), def_id, substs).unwrap();

    let mut resolutions = tcx.call_resolution_map.borrow_mut();
    resolutions.as_mut().unwrap().insert((def_id, substs), instance);
This is not correct. Casting to a function pointer can give a different result than a direct call, even when both def_id and substs are the same.
Would it be correct if the hashmap were keyed by (def_id, substs, caller_pos), where caller_pos identifies the position of the call terminator in question?
Yes, but only when you don't codegen generic functions. When codegenning generic functions, you will need to monomorphize the substs and then use Instance::resolve* in the codegen backend.
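To make the keying problem concrete, here is a toy model of the cache. It does not use any rustc APIs: `DefId`, `Substs`, `Resolved` and `CallerPos` are hypothetical stand-ins chosen for illustration, and the ids and positions are made up. It only shows why a `(def_id, substs)` key loses information and how adding the call position keeps the two resolutions apart.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for rustc's DefId and substs; not the real types.
type DefId = u32;
type Substs = Vec<&'static str>;

/// Hypothetical call-site identifier: (caller id, index of the call terminator).
type CallerPos = (DefId, usize);

/// Toy model of the two resolution outcomes discussed in the review.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Resolved {
    /// Outcome of resolving a direct call.
    Direct,
    /// Outcome of resolving the same function for a cast to a function pointer.
    ForFnPtr,
}

/// Keyed by (def_id, substs) only: the second insert replaces the first,
/// so one of the two resolutions is silently lost.
fn narrow_key_cache() -> HashMap<(DefId, Substs), Resolved> {
    let mut map = HashMap::new();
    map.insert((7, vec!["u8"]), Resolved::Direct);
    map.insert((7, vec!["u8"]), Resolved::ForFnPtr);
    map
}

/// Keyed by (def_id, substs, caller_pos): each use site keeps its own entry.
fn wide_key_cache() -> HashMap<(DefId, Substs, CallerPos), Resolved> {
    let mut map = HashMap::new();
    map.insert((7, vec!["u8"], (1, 0)), Resolved::Direct);
    map.insert((7, vec!["u8"], (1, 3)), Resolved::ForFnPtr);
    map
}
```

Even the wider key only helps when the substs are already concrete. As noted above, for generic functions the substs have to be monomorphized first, so a cache filled in this way would not be enough on its own.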
Yes, that's what I feared. We may have to make some fairly substantial changes to our system in light of this. Thanks again for this discussion.
By the way, between us we could probably implement a missing part of the rustc guide here:
https://rust-lang.github.io/rustc-guide/mir/index.html?highlight=instance#mir-data-types
The main MIR data type is Mir. It contains the data for a single function (along with sub-instances of Mir for "promoted constants", but you can read about those below).
And the section that "below" links to is "to be written".
So it seems that the type parameters become "promoted constants" during monomorphisation.
(I also think the main MIR data type is now Body, not Mir.)
So it seems that the type parameters become "promoted constants" during monomorphisation.
What do you mean?
(I also think the main MIR data type is now Body not Mir)
Mir has been renamed to Body recently.
What do you mean?
The wording in the compiler guide talks in these terms, which surprised me a little:
The main MIR data type is Mir. It contains the data for a single function (along with sub-instances of Mir for "promoted constants"
It's talking about making a generic type parameter into a concrete one, I think? Through the act of promotion.
It is actually talking about turning
fn abc() {
    let a: &'static u8 = &(1 + 2);
}
into
fn abc() {
    static PROMOTED0: u8 = 1 + 2;
    let a: &'static u8 = &PROMOTED0;
}
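A small self-contained sketch of the same idea, observable from user code: because the borrowed expression is a constant, the reference can be returned as `'static` even though it is created inside the function. The function name `promoted` is made up for this example.

```rust
/// Returns a `'static` reference to a promoted constant.
fn promoted() -> &'static u8 {
    // `1 + 2` is a constant expression, so the borrowed value is promoted
    // to a hidden static rather than living in a stack temporary.
    let a: &'static u8 = &(1 + 2);
    a
}
```

Calling `promoted()` and dereferencing the result yields 3. Borrowing a value that is only known at run time would not be promoted, and the same signature would then be rejected by the borrow checker.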
Oh, I see. We could probably improve the wording there too.
@@ -1107,6 +1107,10 @@ pub struct GlobalCtxt<'tcx> {
    layout_interner: ShardedHashMap<&'tcx LayoutDetails, ()>,

    output_filenames: Arc<OutputFilenames>,

    /// Caches the results of `Instance::resolve()` so that Yorick's SIR lowering can use them
    /// later. Normally only the monomorphisation collector can resolve instances.
That's not true. The backend can call Instance::resolve*.
Getting the function call code correct was quite a challenge for me. You have to take many things into account: direct calls, trait object calls, fnptr calls, the rust-call abi for closures, ... The implementation for cg_clif can be found at https://github.com/bjorn3/rustc_codegen_cranelift/blob/f0bb30f8a1ac6107555dfc8fe830d42f469af7f8/src/abi/mod.rs#L349
Ouch. Looks like I need to go back to the drawing board. I've found the whole thing very confusing, to be honest. There's little documentation on this aspect of the compiler and it's easy to get lost. Thanks for the comments. I'll take a look at your code.
I'm looking at the code for cranelift and librustc_codegen_llvm and thinking that I'm going to end up copying large chunks of functionality from a codegen. @bjorn3 do you know any way that we can make our IR without duplicating codegen logic? I wonder if it would be better for us to merge our serialisation into the codegen itself? Then we could use codegen decisions as they arise, rather than trying to second-guess what they will do. I must admit, I'm quite surprised that resolving call targets is codegen-specific. I didn't see that coming, but then again there's probably a good reason for it.
You could implement https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/builder/trait.BuilderMethods.html and then call https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_instance.html.
Trait object calls are special. First you need to get the vtable from the first argument (a fat pointer), then you need to get the function pointer at the correct offset, and finally you need to turn the fat pointer into a thin pointer. Those things are not representable in MIR and may require monomorphization first, I think.
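The three steps can be sketched with a hand-rolled fat pointer. This is only an illustration of the sequence, not how rustc lays out vtables or how any backend emits the call: `VTable`, `FatPtr`, `Dog`, `as_fat` and `call_speak` are all hypothetical names, and the vtable here has a single method slot.

```rust
use std::marker::PhantomData;

/// Hypothetical vtable with a single method slot.
struct VTable {
    speak: fn(*const ()) -> u32,
}

/// Hand-rolled fat pointer: a thin data pointer plus a vtable pointer.
struct FatPtr<'a> {
    data: *const (),
    vtable: &'static VTable,
    // Ties the fat pointer to the lifetime of the borrowed object.
    _marker: PhantomData<&'a ()>,
}

struct Dog {
    barks: u32,
}

/// Method implementation that receives the thin pointer as its receiver.
fn dog_speak(this: *const ()) -> u32 {
    // Safety: this function is only stored in DOG_VTABLE, which `as_fat`
    // only ever pairs with a pointer to a live `Dog`.
    let dog = unsafe { &*(this as *const Dog) };
    dog.barks
}

static DOG_VTABLE: VTable = VTable { speak: dog_speak };

/// Builds the fat pointer, analogous to coercing `&Dog` to a trait object.
fn as_fat(dog: &Dog) -> FatPtr<'_> {
    FatPtr {
        data: dog as *const Dog as *const (),
        vtable: &DOG_VTABLE,
        _marker: PhantomData,
    }
}

/// Performs the call in the three steps described above.
fn call_speak(obj: &FatPtr<'_>) -> u32 {
    let vtable = obj.vtable; // 1. get the vtable from the fat pointer
    let method = vtable.speak; // 2. load the function pointer from its slot
    let thin = obj.data; // 3. reduce the fat pointer to a thin pointer
    method(thin)
}
```

For example, `call_speak(&as_fat(&Dog { barks: 3 }))` dispatches through the vtable to `dog_speak`. Which slot to load and how the receiver is passed are decisions the backend makes, which fits the observation that they are not expressed in MIR.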
Interesting. If I went that route, I'd have to implement lots and lots of methods, right? Many of the underlying trait methods don't have default impls. It might be easier to modify the existing LLVM codegen, inserting calls to our IR generation at strategic positions.
Can you expand on this a little? Are any of the LLVM assumptions likely to affect us?
You could use
cg_ssa for example assumes that
I'm going to close this because we need to go a different route. We will try generating SIR as a nested codegen. Many thanks to @bjorn3 for sharing his expertise and for his patience with my questions :)
Companion to ykjit/yk#40