Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move from {{closure}}#0 syntax to {closure#0} for (def) path components. #70334

Open
eddyb opened this issue Mar 23, 2020 · 5 comments
Open

Move from {{closure}}#0 syntax to {closure#0} for (def) path components. #70334

eddyb opened this issue Mar 23, 2020 · 5 comments
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@eddyb
Copy link
Member

eddyb commented Mar 23, 2020

Today if a path refers to unnamed defs such as closures, it's printed using double braces e.g. for the following example, rustc refers to Foo as main::{{closure}}#0::Foo:

fn main() {
    || {
        struct Foo;
        let () = Foo;
        //  ^^ expected struct `main::{{closure}}#0::Foo`, found `()`
    }
}

However, the new Rust Symbol Mangling (rust-lang/rfcs#2603 / #60705), uses {closure#0} instead.

I forget if I wrote about this before, but I prefer the newer form and it would be nice if we could switch everything uniformly to it.

As an aside, I've long believed that the double braces (e.g. {{closure}}, {{impl}}) were meant to be single braces (i.e. {closure}, {impl}), escaped for format!, and that somehow that got lost.

@oli-obk oli-obk added A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-cleanup Category: PRs that clean code up or issues documenting cleanup. labels Mar 23, 2020
@Centril Centril added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 23, 2020
@bjorn3
Copy link
Member

bjorn3 commented Mar 24, 2020

The {{closure}} name seems to be introduced in 862911d, which uses InternedString::new("{{closure}}"). Before that commit it was InternedString::new("<closure>"). This doesn't seem like being escaped for format!.

@marmeladema
Copy link
Contributor

Hello @eddyb and @bjorn3

I am trying to give this issue a shot. From what I understand, the {{closure}} name is coming from the symbols table: https://github.com/rust-lang/rust/blob/master/src/librustc_span/symbol.rs#L292

and is accessed through DefPathData::as_symbol: https://github.com/rust-lang/rust/blob/master/src/librustc_hir/definitions.rs#L446

In order to generate a proper name complying with the Rust Symbol Mangling: {closure#0}, I would need to replace this as_symbol method either with a new function that returns a String or a function that writes into some std::fmt::Write because the string is dynamic depending on the disambiguator. I tried the latter but as_symbol is actually used in various places, up to the old symbol mangler.

I have a few questions:

  • Should i change the whole compiler to use the new mangling scheme or should some part keep the old {{closure}} behavior? If so which part should be modified and which not?
  • Some places are ignoring the disambiguator if its 0, according to the Rust Mangling Scheme, is this OK? Or should the disambiguator always be specified? The RFC is not clear about this.
  • Some places also generate a different version using a [N] suffix instead of #N. Should those places be updated as well?
  • Does it applies to other symbol like {{impl}} => {impl#N}, or {{opaque}} => {opaque#0} etc?
  • Finally, the RFC at https://github.com/rust-lang/rfcs/blob/master/text/2603-rust-symbol-name-mangling-v0.md specifies this for closure:

The suggested demangling for closures is thus {closure}[<index>]

Where is the {closure#<index>} syntax defined / described? Should the RFC be updated? My personal opinion is that the version described in RFC is easier to deal with since the disambiguator is treated separately from the symbol it is associated with whereas {closure#N} forces us to handle some prefix { and suffix }

@eddyb
Copy link
Member Author

eddyb commented Aug 28, 2020

@marmeladema Apologies for the delay! Please PM me on Zulip or elsewhere if I don't see something on GitHub (I'm still months behind on most notifications).

I would need to replace this as_symbol method either with a new function that returns a String or a function that writes into some std::fmt::Write because the string is dynamic depending on the disambiguator

I think this is overkill, we can "just" return a two-variant enum and force the callers to handle the two cases separately.
I'm imagining something like this (renaming as_symbol, for clarity, though not extremely happy with these names):

enum DefPathDataName {
    Named(Symbol),
    Anon { namespace: Symbol },
}
impl DefPathData {
    fn name(&self) -> DefPathDataName {
        match self {
            DefPathData::ValueNs(name) | ... => DefPathDataName::Name(name),
            DefPathData::ClosureExpr => DefPathDataName::Anon { namespace: sym::closure },
            ...
        }
    }
}

Used like this:

match data.name() {
    DefPathDataName::Named(name) => write!(f, "{}", name),
    DefPathDataName::Anon { namespace } => write!(f, "{{{}#{}}}", namespace, disambiguator)
}

Should i change the whole compiler to use the new mangling scheme

No, this is gated on upstreaming tool support, see #60705. The legacy mangling scheme can just do whatever, we don't guarantee what it looks like. But we can hardcode the old behavior if e.g. tools need it (so we'd want to land a behavioral change to the legacy mangling separately from the rest, to manage its impact better).

Some places also generate a different version using a [N] suffix instead of #N. Should those places be updated as well?

Probably, yeah. If it's a crate hash then it can stay [...] (maybe it shouldn't?), but IMO other path components benefit from #.

Does it applies to other symbol like {{impl}} => {impl#N}, or {{opaque}} => {opaque#0} etc?

Yeah, I'm referring about the general pattern of {{namespace}} => {namespace#N}, not just closures.

Where is the {closure#<index>} syntax defined / described? Should the RFC be updated?

Weird, I thought we put it into the RFC. This is the reference demangler impl, for the record. IIRC, we ended up with this syntax to mimic C++'s foo::{lambda#0} for its closures, as it's not too strange for Rust, and improves the chances of tools interacting well with it (since they already have to deal with the C++ single braces).
This also answers the #0 question, yes we always print that (but that's the same for {{closure}} - see this playground example).

For the user-facing stuff we might want to (but not for this issue, it should probably be separate) replace the #N disambiguator with @ file.rs:line:column instead (which would neatly fit in the {...}) - this could allow us to replace the odd [closure ...] type syntax with "just" a path (IIRC, @arora-aman has been looking at this for a different reason).

whereas {closure#N} forces us to handle some prefix { and suffix }

This can probably be done cleanly (without matching on strings or anything weird), as per the DefPathDataName suggestion above.

@arora-aman
Copy link
Member

arora-aman commented Aug 29, 2020

  • Some places are ignoring the disambiguator if its 0, according to the Rust Mangling Scheme, is this OK? Or should the disambiguator always be specified? The RFC is not clear about this.

That is probably because of the dis !=0 check, here:

// FIXME(eddyb) this will print e.g. `{{closure}}#3`, but it
// might be nicer to use something else, e.g. `{closure#3}`.
let dis = disambiguated_data.disambiguator;
let print_dis = disambiguated_data.data.get_opt_name().is_none()
|| dis != 0 && self.tcx.sess.verbose();
if print_dis {
write!(self, "#{}", dis)?;
}

@eddyb
Copy link
Member Author

eddyb commented Aug 30, 2020

@arora-aman Hmm, that doesn't apply to the "anonymous" DefPathData cases, like closures, because print_dis is always true when .get_opt_name().is_none().
Also, dis != 0 only turns on printing disambiguators when -Zverbose is active.

So even for this example, we'd only show main::f#1 with -Zverbose (maybe we should always show it?):

fn main() {
    let () = { fn f() {} f };
    let () = { fn f() {} f };
}

You can see the difference here: https://godbolt.org/z/bjjb9r

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 26, 2020
…, r=eddyb

Move from {{closure}}#0 syntax to {closure#0} for (def) path components

Part of rust-lang#70334

I followed the approach described by `@eddyb` and introduced a `DefPathDataName` enum.
To preserve compatibility, in various places, I had to rely on formatting manually by calling `format!("{{{{{}}}}}", namespace)`.

My questions are:
* Do we want to convert for places to use the new naming scheme? Or shall I re-add `DefPathData::as_symbol` but renamed as `DefPathData::as_legacy_symbol` to avoid manually allocating the legacy symbols?
* Do we want to `impl Display for DisambiguatedDefPathData` to avoid manually calling `write!(s, "{{{}#{}}}", namespace, component.disambiguator)`?
* We might also want to improve naming for `DefPathDataName` and `DefPathData::get_name`

r? `@eddyb`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-MIR Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html C-cleanup Category: PRs that clean code up or issues documenting cleanup. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants