Method call reference: major rewrite #1725

Open
wants to merge 18 commits into
base: master
Changes from 15 commits
193 changes: 178 additions & 15 deletions src/expressions/method-call-expr.md
@@ -15,22 +15,165 @@ let log_pi = pi.unwrap_or(1.0).log(2.72);

When looking up a method call, the receiver may be automatically dereferenced or borrowed in order to call a method.
This requires a more complex lookup process than for other functions, since there may be a number of possible methods to call.

The following procedure is used:

The first step is to build a list of candidate receiver types.
Obtain these by repeatedly [dereferencing][dereference] the receiver expression's type, adding each type encountered to the list, then finally attempting an [unsized coercion] at the end, and adding the result type if that is successful.
Then, for each candidate `T`, add `&T` and `&mut T` to the list immediately after `T`.
## Determining candidate types

First, a list of "candidate types" is assembled.

These types are found by taking the receiver type and iterating, following either:

* The built-in [dereference]; or
* `<T as Receiver>::Target`

to the next type. (If a step involved following the `Receiver` target, we also
note whether it would have been reachable by following `<T as
Deref>::Target` - this information is used later).

At the end, an additional candidate step may be added for
an [unsized coercion].

Each step of this chain provides a possible `self` type for methods that
might be called. The list will be used in two different ways:

* To find types that might have methods. This is used in the
"determining candidate methods" step, described below. This considers
the full list.
* To find types to which the receiver can be converted. This is used in the
"picking a method from the candidates" step, also described below - in this
case, we only consider the types reachable via `Deref` or built-in
dereferencing.

There is a built-in implementation of `Receiver` for all `T: Deref`, so
most of the time, every step can be reached through either mechanism.
Sometimes, more types can be reached via the `Receiver` chain, and so
more types will be considered for the former usage than the latter usage.

For instance, if the receiver has type `Box<[i32; 2]>`, then the candidate types
will be `Box<[i32; 2]>`, `[i32; 2]` (by dereferencing), and `[i32]` (by unsized
coercion).
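
As a minimal sketch of the `Box<[i32; 2]>` case on stable Rust, each call below is found at a different candidate type from that list. The function name `candidate_steps_demo` is illustrative only and does not come from the reference text.

```rust
/// Illustrative helper: the receiver always has type `Box<[i32; 2]>`,
/// but the methods are found at different candidate types.
fn candidate_steps_demo() -> (bool, usize, Option<i32>) {
    let boxed: Box<[i32; 2]> = Box::new([10, 20]);

    // `as_slice` is an inherent method on arrays, so it is found at the
    // `[i32; 2]` candidate (reached by dereferencing the `Box`).
    let slice: &[i32] = boxed.as_slice();

    // `len` and `first` are inherent methods on slices, so they are found
    // at the `[i32]` candidate (reached by the final unsized coercion).
    let len = boxed.len();
    let first = boxed.first().copied();

    (slice == [10, 20], len, first)
}
```

Calling `candidate_steps_demo()` returns `(true, 2, Some(10))`.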

If `SmartPtr<T>: Receiver<Target=T>`, and the receiver type is `&SmartPtr<Foo>`,
then the candidate types would be `&SmartPtr<Foo>`, `SmartPtr<Foo>` and `Foo`.

## Determining candidate methods

This list of candidate types is then converted to a list of candidate methods.
For each step, the candidate type is used to determine what searches to perform:

* For a struct, enum, foreign type, or various simpler types (listed below)
there is a search for inherent impl candidates for the type.
* For a type param, there's a search for inherent candidates on the param.
Contributor


this is currently confusing to me, it's also confusing in the source :3

what's happening here is:

  • assemble_inherent_candidates for params looks for where-bounds with that param as the self type, giving us a list of trait methods
  • all of these trait methods should also be found by assemble_extension_candidates_for_all_traits

However, we manually look for methods from T: Trait bounds to give them precedence over methods from other traits:

trait Trait {
    fn method(&self) {}
}
impl<T> Trait for T {}
trait OtherTrait {
    fn method(&self) {}
}
impl<T> OtherTrait for T {}

struct Wrapper<T>(T);
fn foo<T>(x: T, y: Wrapper<T>)
where
    T: Trait,
    Wrapper<T>: Trait,
{
    x.method(); // ok
    y.method(); // error
}

The same in the "trait object" section.

I personally would prefer to only talk about 'inherent' and 'trait' candidates and then mention at the end that some trait candidates are given the same precedence as inherent ones. But that feels like a fairly big change (and diverges from the current terminology used in rustc)

there's already a convo with @traviscross about this, i think it's fine and idk how to meaningfully change this to be clearer for me 🤔

* For a trait object, there is first a search for inherent candidates for
the trait (for example in `impl Trait` blocks), then inherent impl
candidates for the trait object itself (for example found in `impl dyn Trait`
blocks).

After these occur, there's a further search for extension candidates for
traits in scope.

"Various simpler types" currently means bool, char, all numbers, str, array,
slices, raw pointers, references, never and tuple.

["Inherent"][inherent] means a candidate method from a block directly
corresponding to the type in the signature. For example, the `impl` blocks
corresponding to a struct or a trait. "Extension" means a candidate gathered
by considering [methods on traits] in scope.
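
The distinction can be sketched with a type that has both kinds of candidate under the same method name. The names `Widget`, `Labelled`, and `labels` are hypothetical and used only for illustration.

```rust
struct Widget;

impl Widget {
    // Inherent candidate: defined in an `impl` block for the type itself.
    fn label(&self) -> &'static str {
        "inherent"
    }
}

trait Labelled {
    fn label(&self) -> &'static str;
}

impl Labelled for Widget {
    // Extension candidate: considered because `Labelled` is in scope.
    fn label(&self) -> &'static str {
        "extension"
    }
}

fn labels() -> (&'static str, &'static str) {
    let w = Widget;
    // Both methods take `&self`, so both are found at the same step;
    // the inherent candidate is tried first and is picked.
    let by_method_call = w.label();
    // The trait method remains reachable through a path expression.
    let by_path = <Widget as Labelled>::label(&w);
    (by_method_call, by_path)
}
```

Here `labels()` returns `("inherent", "extension")`.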
Comment on lines +80 to +83
Contributor


Here's a fun set of differential examples about what constitutes an "inherent candidate" or not:

fn g1(x: u8) -> u8 where u8: m::A {
    x.f() // By extension candidate.
}

fn g2<T: Id<Ty = u8> + Id<Ty: m::A>>(x: T::Ty) -> u8 {
    let _: u8 = x;
    x.f() // By extension candidate.
}

fn g3<T: Id<Ty: m::A>>(x: T::Ty) -> u8 {
    x.f() // By extension candidate.
}

fn g4<T: Id<Ty = U>, U: m::A>(x: T::Ty) -> u8 {
    x.f() // By inherent candidate.
}

fn g5<T: Id<Ty = U>, U>(x: T::Ty) -> u8 where <T as Id>::Ty: m::A {
    x.f() // By inherent candidate.
}

Playground link

Seemingly, it only counts methods from trait impls in bounds as inherent candidates if the type has unified with a generic parameter. The bound doesn't need to be directly on that generic parameter though. This is an interesting way that adding a seemingly-redundant generic parameter can actually affect behavior.

Contributor Author


These are indeed interesting and quite surprising.

If it helps, this is the code which is presumably having this effect - I'm afraid most of the stuff around relating, normalizing, and transforming types is beyond me, but perhaps it helps you figure out what's up.

Might I propose that I don't alter this PR to account for this complexity, and that as our understanding here is currently low and hopefully will increase, it might be appropriate for a follow-up change?

Contributor


Yes, I referenced probe.rs in crafting the examples. I think it's pretty clear at this point what it's doing. We just need to tweak the wording a bit to capture it. Methods from inherent impls are always inherent candidates. For dyn Trait types, this includes the inherent impls on the dyn Trait and the "built-in impl" for vtable dispatch. If the receiver type unifies with a generic parameter, and there's a trait bound on that type, then the methods from that trait are inherent candidates.

Methods from in-scope traits are extension candidates, including when the trait is in scope because it is the trait being implemented.

Contributor Author


OK. I'd like to take your wording as-is but first I need to work out whether the call to self.n() here is inherent or extension. If it's inherent (I think so) then it's a sufficiently common case we should probably explicitly mention it, even though I suspect that behind the scenes it's handled by "if the receiver type unifies with a generic parameter".

Contributor

@traviscross traviscross Mar 9, 2025


Good example. Yes, it's inherent. The proof is this:

#[allow(unused)]
trait Tr {
    fn g(&self) -> u8 { 2 }
}
impl<T: ?Sized> Tr for T {}

trait A {
    fn f(&self) -> u8 {
        self.g()
    }
    fn g(&self) -> u8 { 1 }
}
impl A for () {}

fn main() {
    assert_eq!(().f(), 1);
}

Playground link

It is probably worth mentioning this one too.

Here's that example desugared with specialization:

Playground link

As annotated there, we can see there must be a default Self: A bound on the specialization impl, which, in agreement with your theory, would explain why this is treated as an inherent candidate.


These searches contribute to the list of all the candidate methods found;
separate lists are maintained for the inherent and extension candidates.
Only [visible] candidates are included.
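
As one illustration of the type-parameter case, the sketch below uses two hypothetical traits, `Named` and `Fallback`, that both provide a `name` method for every type. Because `Named` appears as a bound on the type parameter, its method is treated as an inherent candidate and is picked ahead of the extension candidate from `Fallback`.

```rust
trait Named {
    fn name(&self) -> &'static str {
        "Named"
    }
}
impl<T> Named for T {}

trait Fallback {
    fn name(&self) -> &'static str {
        "Fallback"
    }
}
impl<T> Fallback for T {}

fn via_bound<T: Named>(x: T) -> &'static str {
    // `T: Named` is a bound on the type parameter, so `Named::name` is an
    // inherent candidate here; `Fallback::name` is only an extension
    // candidate and is never reached.
    x.name()
}

fn via_path<T>(x: T) -> &'static str {
    // Without a bound, `x.name()` would be ambiguous between the two
    // traits, so the call is written with an explicit path instead.
    Fallback::name(&x)
}
```

With these definitions, `via_bound(1u8)` returns `"Named"` and `via_path(1u8)` returns `"Fallback"`.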

For instance, if the receiver has type `Box<[i32;2]>`, then the candidate types will be `Box<[i32;2]>`, `&Box<[i32;2]>`, `&mut Box<[i32;2]>`, `[i32; 2]` (by dereferencing), `&[i32; 2]`, `&mut [i32; 2]`, `[i32]` (by unsized coercion), `&[i32]`, and finally `&mut [i32]`.
## Picking a method from the candidates

Then, for each candidate type `T`, search for a [visible] method with a receiver of that type in the following places:
Once the list of candidate methods is assembled, the "picking" process
starts.

1. `T`'s inherent methods (methods implemented directly on `T`).
1. Any of the methods provided by a [visible] trait implemented by `T`.
If `T` is a type parameter, methods provided by trait bounds on `T` are looked up first.
Then all remaining methods in scope are looked up.
Once again, the candidate types are iterated. This time, only those types
are iterated which can be reached via the `Deref` trait or built-in derefs;
Comment on lines +96 to +97
Contributor

@traviscross traviscross Mar 9, 2025


OK, here's a question. Above, we create a list of candidate types. We then look for candidate methods in the last step. But notably, we collect candidate methods from extension candidates without regard to the candidate types:

After these occur, there's a further search for extension candidates from all traits in scope (irrespective of whether the trait is remotely relevant to the self type - that's considered later).

Then we get to here, in the picking process. But this text now says that we're only considering those candidate types that can be reached via the Deref chain.

So as written, this would seem to imply that the Receiver chain extending beyond the Deref chain isn't considered at all with regard to selecting or picking methods from in scope traits, but that doesn't seem right.

What's the right answer here?

Contributor Author


It's a perspective I hadn't thought of before: the Receiver trait makes no difference to method resolution for extension methods. I think that's the logically correct conclusion.

However, the Receiver trait is considered also in wfcheck.rs which rejects examples like this if you remove the impl Receiver before we even get as far as resolving any methods.

Contributor


Wild. OK, now this is snapping into focus.

Contributor


OK, that then gives us something to explain in the text. We say:

For each of those searches, if exactly one candidate is identified, it's picked, and the search stops. If this results in multiple possible candidates, then it is an error...

There must, I'm guessing, be some interaction with the well-formedness checking at this point. I.e., maybe it's that if exactly one candidate is identified that is well-formed (where the well-formedness checking considers the Receiver chain), then it is picked. Because we don't report ambiguity errors just because two in-scope traits have a method of the same name that could apply. I'm thinking of examples like this:

Playground link

as noted above, this may be a shorter list than those that can be reached
using the `Receiver` trait.

> Note: the lookup is done for each type in order, which can occasionally lead to surprising results.
For each step, picking is attempted in this order:

* First, a by-value method, where the `self` type precisely matches
  * First for inherent methods
  * Then for extension methods
* Then, a method where `self` is received by immutable reference (`&T`)
  * First for inherent methods
  * Then for extension methods
* Then, a method where `self` is received by mutable reference (`&mut T`)
  * First for inherent methods
  * Then for extension methods
* Then, a method where the `self` type is a `*const T` - this is only considered
  if the self type is `*mut T`
  * First for inherent methods
  * Then for extension methods
* And finally, a method with a `Pin` that's reborrowed, if the `pin_ergonomics`
  feature is enabled.
  * First for inherent methods
  * Then for extension methods
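
The effect of trying by-value methods before by-reference methods can be sketched with two hypothetical traits, `ByValue` and `ByRef`, that offer a method of the same name on `i32`. The receiver's type decides which one is reached first.

```rust
trait ByValue {
    fn who(self) -> &'static str;
}

trait ByRef {
    fn who(&self) -> &'static str;
}

impl ByValue for i32 {
    fn who(self) -> &'static str {
        "by value"
    }
}

impl ByRef for i32 {
    fn who(&self) -> &'static str {
        "by reference"
    }
}

fn picks() -> (&'static str, &'static str) {
    let x: i32 = 5;
    // Receiver type `i32`: the by-value search at candidate type `i32`
    // finds `ByValue::who`, so the `&i32` search is never attempted.
    let from_value = x.who();
    // Receiver type `&i32`: the first candidate type is `&i32`, and the
    // by-value search there matches `ByRef::who`, whose `self` type is
    // exactly `&i32`.
    let from_reference = (&x).who();
    (from_value, from_reference)
}
```

Here `picks()` returns `("by value", "by reference")`.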
Comment on lines +116 to +119
Contributor


We don't generally document unstable features, but this is the kind of case that makes me pretty sad about taking it out, as the comprehensiveness here is nice. Wish we had a way to conditionalize this somehow.

cc @nikomatsakis as an example for us to think about.


For each of those searches, if exactly one candidate is identified,
it's picked, and the search stops. If this results in multiple possible candidates,
then it is an error, and the user must [disambiguate][disambiguate call]
the call and convert the receiver to an appropriate receiver type.
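
A minimal sketch of such a disambiguation follows. The names `Human`, `Pilot`, `Wizard`, and `titles` are hypothetical. Both traits are in scope and both methods take `&self`, so the two candidates are found in the same search.

```rust
struct Human;

trait Pilot {
    fn title(&self) -> &'static str;
}

trait Wizard {
    fn title(&self) -> &'static str;
}

impl Pilot for Human {
    fn title(&self) -> &'static str {
        "pilot"
    }
}

impl Wizard for Human {
    fn title(&self) -> &'static str {
        "wizard"
    }
}

fn titles() -> (&'static str, &'static str) {
    let person = Human;
    // `person.title()` would be rejected as ambiguous, so the trait is
    // named explicitly and the receiver is passed as `&person`.
    let as_pilot = Pilot::title(&person);
    let as_wizard = <Human as Wizard>::title(&person);
    (as_pilot, as_wizard)
}
```

Here `titles()` returns `("pilot", "wizard")`.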

With the example above of `SmartPtr<T>: Receiver<Target=T>`, and the receiver
type `&SmartPtr<Foo>`, this mechanism would pick:

```rust,ignore
impl Foo {
fn method(self: &SmartPtr<Foo>) {}
}
```

but would not pick

```rust,ignore
impl Foo {
fn method(self: &Foo) {}
}
```

because the receiver could not be converted to `&Foo` using the `Deref` chain,
only the `Receiver` chain.

## Extra details

There are a few details not considered in this overview:

* The search for candidate methods will search more widely (potentially
across crates) for certain [incoherent] types: that includes any of
the "various simpler types" listed above; and any `dyn`, struct, enum, or
foreign type where the standard-library-internal attribute
`rustc_has_incoherent_inherent_impls` is active.
* If there are multiple candidates from traits, they may in fact be
identical, and the picking operation collapses them to a single pick to avoid
reporting conflicts.
* Extra searches are performed to spot "shadowing" of pointee methods
by smart pointer methods, during the picking process. If a by-value pick
is going to be returned, an extra search is performed for a `&T` or
`&mut T` method. Similarly, if a `&T` method is to be returned, an extra
search is performed for `&mut T` methods. These extra searches consider
only inherent methods, where `T` is identical, but the method is
found from a step further along the `Receiver` chain. If any such method
is found, an ambiguity error is emitted.
* An error is emitted if the recursion limit is reached.
* The picking process emits some adjustments which must be made to the
receiver type in order to get to the correct `self` type. This includes
a number of dereferences, a possible autoreferencing, a conversion from
a mutable pointer to a constant pointer, or a pin reborrow.
* Extra lists are maintained for diagnostic purposes:
unstable candidates, unsatisfied predicates, and static candidates.
* For diagnostic purposes, the search may be performed slightly differently,
for instance searching all traits not just those in scope, or also noting
inaccessible candidates.
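
The receiver adjustments can be made visible by writing a method call next to a roughly equivalent path call with the dereferences and the autoref spelled out. This is only a sketch of the surface-level correspondence, not of the compiler's internal representation, and the function name `adjustments_demo` is illustrative.

```rust
fn adjustments_demo() -> (usize, usize, Vec<i32>, Vec<i32>) {
    let owned = String::from("abc");
    let inner: &String = &owned;
    let s: &&String = &inner;

    // `String::len` takes `&self`, so the receiver is dereferenced twice
    // (`&&String` to `&String` to `String`) and then borrowed again.
    let implicit_len = s.len();
    let explicit_len = String::len(&**s);

    let mut a = vec![3, 1, 2];
    let mut b = a.clone();
    // `sort` is a slice method taking `&mut self`: one overloaded
    // dereference (`Vec<i32>` to `[i32]`), then a `&mut` autoref.
    a.sort();
    <[i32]>::sort(&mut *b);

    (implicit_len, explicit_len, a, b)
}
```

Both forms give the same results: `adjustments_demo()` returns `(3, 3, vec![1, 2, 3], vec![1, 2, 3])`.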

## Net results

> The lookup is done for each type in order, which can occasionally lead to surprising results.
> The code below will print "In trait impl!" because `&self` methods are looked up first, so the trait method is found before the struct's `&mut self` method.
>
> ```rust
@@ -58,13 +201,30 @@ Then, for each candidate type `T`, search for a [visible] method with a receiver
> }
> ```

If this results in multiple possible candidates, then it is an error, and the receiver must be [converted][disambiguate call] to an appropriate receiver type to make the method call.
The types and number of parameters in the method call expression aren't taken
into account in method resolution. So the following won't compile:

This process does not take into account the mutability or lifetime of the receiver, or whether a method is `unsafe`.
Once a method is looked up, if it can't be called for one (or more) of those reasons, the result is a compiler error.
```rust,compile_fail
trait NoParameter {
fn method(self);
}

If a step is reached where there is more than one possible method, such as where generic methods or traits are considered the same, then it is a compiler error.
These cases require a [disambiguating function call syntax] for method and function invocation.
trait OneParameter {
fn method(&self, jj: i32);
}

impl NoParameter for char {
fn method(self) {} // found first and picked, but doesn't work
}

impl OneParameter for char {
fn method(&self, jj: i32) {} // found second, thus ignored
}

fn f() {
'x'.method(123);
}
```

> **Edition differences**: Before the 2021 edition, during the search for visible methods, if the candidate receiver type is an [array type], methods provided by the standard library [`IntoIterator`] trait are ignored.
>
@@ -91,3 +251,6 @@ These cases require a [disambiguating function call syntax] for method and funct
[methods]: ../items/associated-items.md#methods
[unsized coercion]: ../type-coercions.md#unsized-coercions
[`IntoIterator`]: std::iter::IntoIterator
[inherent]: ../items/implementations.md#inherent-implementations
[methods on traits]: ../items/implementations.md#trait-implementations
[incoherent]: ../items/implementations.md#trait-implementation-coherence