Cache the results of the monomorphisation call resolver #79

vext01 · 2019-12-02T14:26:55Z

To cut a long story short, in order to determine static call targets
(where possible) it is necessary to call Instance::resolve(). However,
only the monomorphisation stage is permitted to do so.

This change caches the results of resolutions so that we can use them
during SIR lowering.

We also utilise the new CallOperand::Virtual variant for calls which
cannot be statically known at compile time.

Discussion here:
https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Calls.20in.20MIR

Companion to ykjit/yk#40

bjorn3 · 2019-12-02T14:33:05Z

src/librustc_mir/monomorphize/collector.rs

@@ -727,6 +734,10 @@ fn visit_fn_use<'tcx>(
            ty::Instance::resolve_for_fn_ptr
        };
        let instance = resolver(tcx, ty::ParamEnv::reveal_all(), def_id, substs).unwrap();
+
+        let mut resolutions = tcx.call_resolution_map.borrow_mut();
+        resolutions.as_mut().unwrap().insert((def_id, substs), instance);


This is not correct. Casting to a fn pointer can resolve to a different instance than a direct call, even though def_id and substs are the same in both cases.

Would it be correct if the hashmap were keyed by (def_id, substs, caller_pos), where caller_pos identifies the position of the call terminator in question?

Yes, but only when you don't codegen generic functions. When codegenning generic functions, you will really need to monomorphize the substs and then use Instance::resolve* in the codegen backend.

Yes, that's what I feared. We may have to make some fairly substantial changes to our system in light of this. Thanks again for this discussion.
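
The key collision discussed in this thread can be sketched with a toy cache. This is only an illustration under assumptions: `DefId`, `Substs`, `CallerPos` and `Resolved` are hypothetical stand-ins rather than rustc's real types, and the two resolution results are invented purely to show the clash.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for rustc's `DefId` and substitutions.
type DefId = u32;
type Substs = &'static str;
// Hypothetical identifier for the call terminator: (body id, basic block index).
type CallerPos = (u32, usize);

// Hypothetical resolution results; real code would store a `ty::Instance`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Resolved {
    Direct(DefId),
    ViaFnPtr(DefId),
}

// Keyed on (def_id, substs) only: the second insert overwrites the first.
fn cache_without_pos() -> HashMap<(DefId, Substs), Resolved> {
    let mut map = HashMap::new();
    map.insert((7, "<u8>"), Resolved::Direct(7));
    map.insert((7, "<u8>"), Resolved::ViaFnPtr(7));
    map
}

// Keyed on (def_id, substs, caller_pos): each use keeps its own entry.
fn cache_with_pos() -> HashMap<(DefId, Substs, CallerPos), Resolved> {
    let mut map = HashMap::new();
    map.insert((7, "<u8>", (0, 3)), Resolved::Direct(7));
    map.insert((7, "<u8>", (0, 5)), Resolved::ViaFnPtr(7));
    map
}

fn main() {
    println!("without position: {:?}", cache_without_pos());
    println!("with position: {:?}", cache_with_pos());
}
```

As the reply above points out, even the position-keyed variant would only hold up when generic functions are not codegenned.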

By the way, between us we could probably implement a missing part of the rustc guide here:
https://rust-lang.github.io/rustc-guide/mir/index.html?highlight=instance#mir-data-types

> The main MIR data type is Mir. It contains the data for a single function (along with sub-instances of Mir for "promoted constants", but you can read about those below).

And the section that "below" links to is still marked "to be written".

So it seems that the type parameters become "promoted constants" during monomorphisation.

(I also think the main MIR data type is now Body not Mir)

> So it seems that the type parameters become "promoted constants" during monomorphisation.

What do you mean?

> (I also think the main MIR data type is now Body not Mir)

Mir has been renamed to Body recently.

> What do you mean?

The wording in the compiler guide talks in these terms, which surprised me a little:

> The main MIR data type is Mir. It contains the data for a single function (along with sub-instances of Mir for "promoted constants"

It's talking about making a generic type parameter into a concrete one, I think? Through the act of promotion.

It is actually talking about turning

fn abc() { let a: &'static u8 = &(1 + 2); }

into

fn abc() { static PROMOTED0: u8 = 1 + 2; let a: &'static u8 = &PROMOTED0; }

Oh, I see. We could probably improve the wording there too.
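
For readers following along, here is a small runnable sketch of the promotion described above. The function names `promoted` and `explicit_static` are made up for illustration, and the hand-written `static` only approximates what the compiler does internally.

```rust
// The constant expression `1 + 2` is promoted to static storage by the
// compiler, so taking a `'static` reference to it is accepted.
fn promoted() -> &'static u8 {
    let a: &'static u8 = &(1 + 2);
    a
}

// Roughly the hand-written equivalent: the value lives in an explicit static.
fn explicit_static() -> &'static u8 {
    static PROMOTED0: u8 = 1 + 2;
    &PROMOTED0
}

fn main() {
    println!("{} {}", promoted(), explicit_static());
}
```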

bjorn3 · 2019-12-02T14:34:27Z

src/librustc/ty/context.rs

@@ -1107,6 +1107,10 @@ pub struct GlobalCtxt<'tcx> {
    layout_interner: ShardedHashMap<&'tcx LayoutDetails, ()>,

    output_filenames: Arc<OutputFilenames>,
+
+    /// Caches the results of `Instance::resolve()` so that Yorick's SIR lowering can use them
+    /// later. Normally only the monomorphisation collector can resolve instances.


That's not true. The backend can call Instance::resolve*.

bjorn3 · 2019-12-02T14:45:45Z

Getting the function call code correct was quite a challenge for me. You have to take many things into account: direct calls, trait object calls, fn pointer calls, the rust-call ABI for closures, ...

The implementation for cg_clif can be found at https://github.com/bjorn3/rustc_codegen_cranelift/blob/f0bb30f8a1ac6107555dfc8fe830d42f469af7f8/src/abi/mod.rs#L349

vext01 · 2019-12-02T14:54:10Z

Ouch. Looks like I need to go back to the drawing board.

I've found the whole thing very confusing to be honest. There's little documentation on this aspect of the compiler and it's easy to get lost.

Thanks for the comments. I'll take a look at your code.

vext01 · 2019-12-02T15:23:38Z

I'm looking at the code for cranelift and librustc_codegen_llvm and thinking:

a) I'm going to end up copying large chunks of functionality from a codegen.
b) Keeping in sync with the code-gen is going to be a pain.

@bjorn3 do you know any way that we can make our IR without duplicating codegen logic? I wonder if it would be better for us to merge our serialisation into the codegen itself? Then we could use code-gen decisions as they arise, rather than trying to second guess what they will do.

I must admit, I'm quite surprised that resolving call targets is codegen-specific. I didn't see that coming, but then again there's probably a good reason for it.

bjorn3 · 2019-12-02T17:15:57Z

> do you know any way that we can make our IR without duplicating codegen logic?

You could implement https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/builder/trait.BuilderMethods.html and then call https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_instance.html. rustc_codegen_ssa still has a lot of LLVM-based assumptions though. That's why I don't yet use it in cg_clif.

> I must admit, I'm quite surprised that resolving call targets is codegen-specific. I didn't see that coming, but then again there's probably a good reason for it.

Trait object calls are special. First you need to get the vtable from the first argument (a fat pointer), then you need to get the function pointer at the correct offset and finally you need to turn the fat pointer into a thin pointer. Those things are not representable in MIR and may require monomorphization first I think.
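
The three steps above can be sketched with a hand-rolled model of a trait object. Everything here is an assumption made for illustration: `VTable`, `FatPtr`, `Dog` and `call_speak` are hypothetical names, and rustc's real vtable layout and calling conventions differ from this simplified picture.

```rust
// Hypothetical vtable with a single method slot. The method receives the
// thin data pointer, not the fat pointer.
struct VTable {
    speak: fn(*const ()) -> u32,
}

// Hypothetical fat pointer: a data pointer plus a vtable pointer.
struct FatPtr {
    data: *const (),
    vtable: &'static VTable,
}

struct Dog {
    barks: u32,
}

fn dog_speak(this: *const ()) -> u32 {
    // SAFETY: this function is only reachable through DOG_VTABLE, which is
    // only paired with a pointer to a live `Dog` in this example.
    let dog = unsafe { &*(this as *const Dog) };
    dog.barks
}

static DOG_VTABLE: VTable = VTable { speak: dog_speak };

fn call_speak(obj: &FatPtr) -> u32 {
    let vtable = obj.vtable; // 1. get the vtable from the fat pointer
    let method = vtable.speak; // 2. load the function pointer from its slot
    let thin = obj.data; // 3. reduce the fat pointer to a thin pointer
    method(thin)
}

fn demo() -> u32 {
    let dog = Dog { barks: 3 };
    let obj = FatPtr {
        data: &dog as *const Dog as *const (),
        vtable: &DOG_VTABLE,
    };
    call_speak(&obj)
}

fn main() {
    println!("{}", demo());
}
```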

vext01 · 2019-12-03T09:34:40Z

> You could implement BuilderMethods and then call codegen_instance.

Interesting.

If I went that route, I'd have to implement lots and lots of methods, right? Many of the underlying trait methods don't have default impls.

It might be easier to modify the existing LLVM codegen, inserting calls to our IR generation at strategic positions.

> rustc_codegen_ssa still has a lot of LLVM-based assumptions though. That's why I don't yet use it in cg_clif.

Can you expand on this a little? Are any of the LLVM assumptions likely to affect us?

bjorn3 · 2019-12-03T10:09:29Z

> If I went that route, I'd have to implement lots and lots of methods, right?

You could use unimplemented!() in most cases.
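
A minimal sketch of that stubbing approach follows. The `Builder` trait is a hypothetical, much smaller stand-in for a large backend trait such as BuilderMethods, and `SirBuilder` is an invented name. Only the method the backend actually needs gets a real body.

```rust
// Hypothetical, cut-down stand-in for a large codegen builder trait.
trait Builder {
    fn add(&mut self, a: i64, b: i64) -> i64;
    fn sub(&mut self, a: i64, b: i64) -> i64;
    fn call(&mut self, name: &str) -> String;
}

// Hypothetical backend that only cares about recording calls.
struct SirBuilder {
    calls: Vec<String>,
}

impl Builder for SirBuilder {
    // Stubs: these panic if they are ever reached.
    fn add(&mut self, _a: i64, _b: i64) -> i64 {
        unimplemented!("add is not needed by this backend yet")
    }

    fn sub(&mut self, _a: i64, _b: i64) -> i64 {
        unimplemented!("sub is not needed by this backend yet")
    }

    // The one method that is actually implemented.
    fn call(&mut self, name: &str) -> String {
        self.calls.push(name.to_string());
        format!("call {}", name)
    }
}

fn main() {
    let mut builder = SirBuilder { calls: Vec::new() };
    println!("{}", builder.call("foo"));
}
```

The trade-off is that a stubbed method fails at run time rather than compile time, so any missing piece only shows up once the shared codegen code actually reaches it.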

> Can you expand on this a little?

cg_ssa, for example, assumes that Bx::Value can hold a primitive value pair like (0u64, 42u32). It assumes that constant values, like those from promoted MIR, can be created outside a function and then used within it just like any other Bx::Value. It passes &self instead of &mut self to the functions creating constant values.

rust-lang/rust#56108
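
To illustrate why the `&self` point matters, here is a sketch under assumptions: `ConstMethods`, `Value` and `Backend` below are hypothetical and only mirror the shape described, not the real rustc_codegen_ssa traits. A backend that has to record each constant it creates cannot mutate its state through `&self` directly, so it ends up reaching for interior mutability.

```rust
use std::cell::RefCell;

// Hypothetical handle to a backend value: an index into the constant table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Value(usize);

// Hypothetical trait with the shape described: constant creation takes `&self`.
trait ConstMethods {
    fn const_u64(&self, v: u64) -> Value;
}

// Hypothetical backend that must record every constant it hands out.
struct Backend {
    consts: RefCell<Vec<u64>>,
}

impl ConstMethods for Backend {
    fn const_u64(&self, v: u64) -> Value {
        // Mutation through `&self` needs interior mutability.
        let mut consts = self.consts.borrow_mut();
        consts.push(v);
        Value(consts.len() - 1)
    }
}

fn main() {
    let backend = Backend { consts: RefCell::new(Vec::new()) };
    println!("{:?}", backend.const_u64(42));
}
```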

vext01 · 2019-12-06T11:24:52Z

I'm going to close this because we need to go a different route.

We will try generating SIR as a nested code-gen.

Many thanks to @bjorn3 for sharing his expertise and for his patience with my questions :)

vext01 added 2 commits December 2, 2019 14:24

Cycle breaker.

de2c5d9

vext01 assigned ltratt Dec 2, 2019

bjorn3 suggested changes Dec 2, 2019

vext01 closed this Dec 6, 2019

vext01 mentioned this pull request Dec 6, 2019

Add a Virtual Variant to our SIR call targets. ykjit/yk#40

Closed
