Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Method call reference: major rewrite #1725

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

adetaylor
Copy link
Contributor

@adetaylor adetaylor commented Jan 30, 2025

This section of the reference has been oversimplistic for some time (#1018 and #1534) and various rewrites have been attempted (e.g. #1394, #1432). Here's another attempt!

My approach here is:

  • Stop trying to keep it short and concise
  • Document what actually happens in the code, step by step

This does result in a long explanation, because we're trying to document nearly 2400 lines of code in probe.rs, but doing otherwise feels as though we'll continue to run into criticisms of oversimplification.

This rewrite documents the post-arbitrary-self-types v2 situation, i.e. it assumes rust-lang/rust#135881 has landed. We should not merge this until or unless that lands.

This PR was inspired by discussion in #1699. If we go ahead with this approach, #1699 becomes irrelevant. There was also discussion at rust-lang/cargo#15117 (comment)

Tracking:

This section of the reference has been oversimplistic for some time (rust-lang#1018
and rust-lang#1534) and various rewrites have been attempted (e.g. rust-lang#1394, rust-lang#1432).
Here's another attempt!

My approach here is:
* Stop trying to keep it short and concise
* Document what actually happens in the code, step by step

This does result in a long explanation, because we're trying to document
nearly 2400 lines of code in `probe.rs`, but doing otherwise feels
as though we'll continue to run into criticisms of oversimplification.

This rewrite documents the post-arbitrary-self-types v2 situation,
i.e. it assumes rust-lang/rust#135881 has
landed. We should not merge this until or unless that lands.

This PR was inspired by discussion in rust-lang#1699. If we go ahead with this
approach, rust-lang#1699 becomes irrelevant. There was also discussion at
rust-lang/cargo#15117 (comment)
@rustbot rustbot added the S-waiting-on-review Status: The marked PR is awaiting review from a maintainer label Jan 30, 2025
@nikomatsakis nikomatsakis self-assigned this Jan 30, 2025
@adetaylor adetaylor marked this pull request as ready for review January 30, 2025 18:58
Copy link
Contributor

@madsmtm madsmtm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully agree that it's better to list this out in detail than trying to condense it down to unreadable, and what you've written here is phenomenal!

Comment on lines +108 to +111
* And finally, a method with a `Pin` that's reborrowed, if the `pin_ergonomics`
feature is enabled.
* First for inherent methods
* Then for extension methods
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't generally document unstable features, but this is the kind of case that makes me pretty sad about taking it out, as the comprehensiveness here is nice. Wish we had a way to conditionalize this somehow.

cc @nikomatsakis as an example for us to think about.

@traviscross
Copy link
Contributor

Overall, this is fantastic. Very much an improvement. Thanks @adetaylor.

@traviscross traviscross added the S-waiting-on-stabilization Waiting for a stabilization PR to be merged in the main Rust repository label Jan 31, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jan 31, 2025

☔ The latest upstream changes (possibly 4249fb4) made this pull request unmergeable. Please resolve the merge conflicts.

@mattheww
Copy link
Contributor

mattheww commented Feb 3, 2025

Note there's another description of the method lookup process in the
rustc-dev-guide.

@mattheww
Copy link
Contributor

mattheww commented Feb 3, 2025

One thing that makes this business difficult to document is that method lookup happens when type inference is not yet complete (and the results of method lookup participate in type inference).

All the writeups I've seen gloss over this by writing "the receiver type" and then proceeding to describe what happens as if that type was fully determined.

I'm not sure how much detail it makes sense to go into on this page as long as the Reference has very little to say in general about type inference. But I think it might be worth mentioning that:

  • At this point the receiver type might be something like Vec<?>, which I think explains how the "If there are multiple candidates from traits, they may in fact be identical" case can happen.

  • If the type isn't constrained at all by the code coming before the method call, method resolution will just give up (even if there's only one possible type for which the method call could work).

@mattheww
Copy link
Contributor

mattheww commented Feb 3, 2025

It might also be worth noting explicitly that the types of parameters in the method call expression aren't taken into account in method resolution. Even the number of parameters is ignored. So the following won't compile, because the method from NoParameter is found earlier in the picking process even though it's "obviously" the wrong one.

trait NoParameter {
    fn method(self);
}

trait OneParameter {
    fn method(&self, jj: i32);
}

impl NoParameter for char {
    fn method(self) {}
}

impl OneParameter for char {
    fn method(&self, jj: i32) {}
}

fn f() {
    'x'.method(123);
}

@obi1kenobi
Copy link
Member

Wow, this example is very surprising to me!

With my "cargo-semver-checks maintainer" hat on, it's extremely useful to have someplace where these details are written up with examples like this. The better I can understand these nuances, the better our rules for discovering breakage will be.

I will leave to you all the question of whether the reference or some other location is the best place to document these behaviors.

@adetaylor
Copy link
Contributor Author

It might also be worth noting explicitly that the types of parameters in the method call expression aren't taken into account in method resolution.

Agreed - I added your example in 687a52b.

@adetaylor
Copy link
Contributor Author

@mattheww regarding

One thing that makes this business difficult to document is that method lookup happens when type inference is not yet complete (and the results of method lookup participate in type inference).

I agree that it would be good to explain that here, but I'd prefer it be done in a separate PR - this one is already ambitious enough, and I don't think it makes that situation worse.

@mattheww
Copy link
Contributor

mattheww commented Feb 8, 2025

Yes, it would be better to get this merged and think about improving it later.

I think with the Reference in its current state there's an argument that this part should just be omitted (if it can't happen in a model where the full type of the receiver is known):

If there are multiple candidates from traits, they may in fact be identical, and the picking operation collapses them to a single pick to avoid reporting conflicts

But I'd certainly have no objection to leaving that question for another time.

@adetaylor
Copy link
Contributor Author

I'm not sure whether that's the only case where the candidates may turn out to be identical - I'll try to work it out.

Comment on lines +79 to +82
["Inherent"][inherent] means a candidate method from a block directly
corresponding to the type in the signature. For example, the `impl` blocks
corresponding to a struct or a trait. "Extension" means a candidate gathered
by considering [methods on traits] in scope.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a fun set of differential examples about what constitutes an "inherent candidate" or not:

fn g1(x: u8) -> u8 where u8: m::A {
    x.f() // By extension candidate.
}

fn g2<T: Id<Ty = u8> + Id<Ty: m::A>>(x: T::Ty) -> u8 {
    let _: u8 = x;
    x.f() // By extension candidate.
}

fn g3<T: Id<Ty: m::A>>(x: T::Ty) -> u8 {
    x.f() // By extension candidate.
}

fn g4<T: Id<Ty = U>, U: m::A>(x: T::Ty) -> u8 {
    x.f() // By inherent candidate.
}

fn g5<T: Id<Ty = U>, U>(x: T::Ty) -> u8 where <T as Id>::Ty: m::A {
    x.f() // By inherent candidate.
}

Playground link

Seemingly, it only counts methods from trait impls in bounds as inherent candidates if the type has unified with a generic parameter. The bound doesn't need to be directly on that generic parameter though. This is an interesting way that adding a seemingly-redundant generic parameter can actually affect behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are indeed interesting and quite surprising.

If it helps, this is the code which is presumably having this effect - I'm afraid most of the stuff around relating, normalizing, and transforming types is beyond me, but perhaps it helps you figure out what's up.

Might I propose that I don't alter this PR to account for this complexity, and that as our understanding here is currently low and hopefully will increase, it might be appopriate for a follow-up change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I referenced probe.rs in crafting the examples. I think it's pretty clear at this point what it's doing. We just need to tweak the wording a bit to capture it. Methods from inherent impls are always inherent candidates. For dyn Trait types, this includes the inherent impls on the dyn Trait and the "built-in impl" for vtable dispatch. If the receiver type unifies with a generic parameter, and there's a trait bound on that type, then the methods from that trait are inherent candidates.

Methods from in-scope traits are extension candidates, including when the trait is in scope due to being the trait being implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'd like to take your wording as-is but first I need to work out whether the call to self.n() here is inherent or extension. If it's inherent (I think so) then it's a sufficiently common case we should probably explicitly mention it, even though I suspect that behind the scenes it's handled by "if the receiver type unifies with a generic parameter".

Copy link
Contributor

@traviscross traviscross Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good example. Yes, it's inherent. The proof is this:

#[allow(unused)]
trait Tr {
    fn g(&self) -> u8 { 2 }
}
impl<T: ?Sized> Tr for T {}

trait A {
    fn f(&self) -> u8 {
        self.g()
    }
    fn g(&self) -> u8 { 1 }
}
impl A for () {}

fn main() {
    assert_eq!(().f(), 1);
}

Playground link

It is probably worth mentioning this one too.

Here's that example desugared with specialization:

Playground link

As annotated there, we can see there must be a default Self: A bound on the specialization impl, which agreeing with your theory, would explain why this is treated as an inherent candidate.

@adetaylor
Copy link
Contributor Author

@mattheww says:

Yes, it would be better to get this merged and think about improving it later.

I think with the Reference in its current state there's an argument that this part should just be omitted (if it can't happen in a model where the full type of the receiver is known):

If there are multiple candidates from traits, they may in fact be identical, and the picking operation collapses them to a single pick to avoid reporting conflicts

But I'd certainly have no objection to leaving that question for another time.

I didn't want to decide to keep or remove this bullet until I was certain whether this "pick collapsing" effect only occurs in situations where the full type of the receiver isn't known. So, I tried disabling this bit of code to find out what broke. It is used in other circumstances including some not even involving a receiver, e.g. the I::From call in this example seems to hit it. In cases with a receiver I'm still not sure if the "full type of receiver isn't known" case is the only one. But in any case, I would definitely like to avoid making things more complex and leave any additional tweaks here for a future change. I personally don't feel this bullet does any harm, since it's buried in a list of "here be dragons" further nuances, so I'm going to keep it unless you strongly want it removed. Follow up edit PRs would be great once this lands, of course.

Comment on lines +96 to +97
Once again, the candidate types are iterated. This time, only those types
are iterated which can be reached via the `Deref` trait or built-in derefs;
Copy link
Contributor

@traviscross traviscross Mar 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, here's a question. Above, we create a list of candidate types. We then look for candidate methods in the last step. But notably, we collect candidate methods from extension candidates without regard to the candidate types:

After these occur, there's a further search for extension candidates from all traits in scope (irrespective of whether the trait is remotely relevant to the self type - that's considered later).

Then we get to here, in the picking process. But this text now says that we're only considering those candidate types that can be reached via the Deref chain.

So as written, this would seem to imply that the Receiver chain extending beyond the Deref chain isn't considered at all with regard to selecting or picking methods from in scope traits, but that doesn't seem right.

What's the right answer here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a perspective I hadn't thought of before: the Receiver trait makes no difference to method resolution for extension methods. I think that's the logically correct conclusion.

However, the Receiver trait is considered also in wfcheck.rs which rejects examples like this if you remove the impl Receiver before we even get as far as resolving any methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wild. OK, now this is snapping into focus.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, that then gives us something to explain in the text. We say:

For each of those searches, if exactly one candidate is identified, it's picked, and the search stops. If this results in multiple possible candidates, then it is an error...

There must, I'm guessing, be some interaction with the well-formedness checking at this point. I.e., maybe it's that if exactly one candidate is identified that is well-formed (where the well-formedness checking considers the Receiver chain), then it is picked. Because we don't report ambiguity errors just because two in-scope traits have a method of the same name that could apply. I'm thinking of examples like this:

Playground link


* For a struct, enum, foreign type, or various simpler types (listed below)
there is a search for inherent impl candidates for the type.
* For a type param, there's a search for inherent candidates on the param.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is currently confusing to me, it's also confusing in the source :3

what's happening here is:

  • assemble_inherent_candidates for params looks for where-bounds with that param as the self type. Giving us a list of traits methods
  • all of these trait methods should also be found by assemble_extension_candidates_for_all_traits

However, we manually look for methods from T: Trait bounds to give them precedence over methods from other traits:

trait Trait {
    fn method(&self) {}
}
impl<T> Trait for T {}
trait OtherTrait {
    fn method(&self) {}
}
impl<T> OtherTrait for T {}

struct Wrapper<T>(T);
fn foo<T>(x: T, y: Wrapper<T>)
where
    T: Trait,
    Wrapper<T>: Trait,
{
    x.method(); // ok
    y.method(); // error
}

The same in the "trait object" section.

I personally would prefer to only talk about 'inherent' and 'trait' candidates and then mention at the end that some trait candidates are given the same precedence as inherent ones. But that feels like a fairly big change (and diverges from the current terminology used in rustc)

there's already a convo with @traviscross about this, i think it's fine and idk how to meaningfully change this to be clearer for me 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: The marked PR is awaiting review from a maintainer S-waiting-on-stabilization Waiting for a stabilization PR to be merged in the main Rust repository
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants