Create RFC for "return position enum impl trait"

yoshuawuyts · yoshuawuyts · commit 2ddaa305b295 · 2023-01-05T18:51:33.000+01:00
diff --git a/text/0000-multi-type-return-position-impl-trait.md b/text/0000-multi-type-return-position-impl-trait.md
@@ -0,0 +1,356 @@
+- Feature Name: `multi_type_return_position_impl_trait`
+- Start Date: 2023-01-05
+- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
+- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
+
+# Summary
+[summary]: #summary
+
+This RFC enables [Return Position Impl Trait (RPIT)][RPIT] to work in functions
+which return more than one type. This is achieved by desugaring the return type
+into an enum with members containing each of the returned types, and
+implementing traits which delegate to those members:
+
+[RPIT]: https://doc.rust-lang.org/stable/rust-by-example/trait/impl_trait.html#as-a-return-type
+
+```rust
+// Possible already
+fn single_iter() -> impl Iterator<Item = i32> {
+    1..10 // `std::ops::Range<i32>`
+}
+
+// Enabled by this RFC
+fn multi_iter(x: i32) -> impl Iterator<Item = i32> {
+    match x {
+        0 => 1..10,                   // `std::ops::Range<i32>`
+        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
+    }
+}
+```
+
+# Motivation
+[motivation]: #motivation
+
+[Return Position Impl Trait (RPIT)][RPIT] is used when you want to return a value, but
+don't want to specify the type. In today's Rust (1.66.0 at the time of writing)
+it's only possible to use this when you're returning a single type from the
+function. The moment multiple types are returned from the function, the compiler
+will error. This can be frustrating, because it means you're likely to either
+resort to using `Box<dyn Trait>` or manually construct an enum to map the
+branches to. It's not always desirable or possible to use `Box<dyn Trait>`, and
+constructing an enum manually can be time-intensive and complicated, and can
+obfuscate the intent of the code.
+
+What we're proposing here is not so much a new feature, as an expansion of the
+cases in which `impl Trait` can be used. We've seen previous efforts for this,
+in particular [RFC 1951: Expand Impl Trait][rfc1951] and more recently in [RFC
+2515: Type Alias Impl Trait (TAIT)][TAIT]. This continues that expansion by
+enabling more code to make use of RPIT.
+
+[rfc1951]: https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md
+[TAIT]: https://rust-lang.github.io/rfcs/2515-type_alias_impl_trait.html
+
+A motivating example for this is error handling: it's not uncommon to have a
+function return more than one error type, but you may not necessarily care
+about the exact errors returned. You may choose to return a `Box<dyn Error +
+'static>`, which has the downside that [it itself does not implement
+`Error`][no-error]. Or you may choose to define your own enum of errors, which
+can be a lot of work and may obfuscate the actual intent of the code. It may
+sometimes be preferable to return an `impl Trait` instead:
+
+[no-error]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=97894fc907fa2d292cbe909467d4db4b
+
+```rust
+use std::error::Error;
+use std::fs;
+
+// ❌ Multi-type RPIT does not yet compile (Rust 1.66.0)
+// error[E0282]: type annotations needed
+fn main() -> Result<(), impl Error> {
+    let num = i8::from_str_radix("A", 16)?;       // `Result<_, std::num::ParseIntError>`
+    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>`
+    // ... use values here
+    Ok(())
+}
+```
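+
+For comparison, here is a minimal sketch of the hand-written enum alternative
+mentioned above, as it can be written today. The names `ManualError` and `run`
+are made up for illustration, and the function takes its inputs as parameters
+instead of hard-coding them:
+
+```rust
+use std::{error::Error, fmt, fs, io, num};
+
+// Hypothetical hand-written error enum; `ManualError` is a made-up name.
+#[derive(Debug)]
+enum ManualError {
+    Parse(num::ParseIntError),
+    Io(io::Error),
+}
+
+// `Error` requires `Display`, so we forward it to the wrapped error.
+impl fmt::Display for ManualError {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        match self {
+            ManualError::Parse(err) => fmt::Display::fmt(err, f),
+            ManualError::Io(err) => fmt::Display::fmt(err, f),
+        }
+    }
+}
+
+impl Error for ManualError {
+    fn source(&self) -> Option<&(dyn Error + 'static)> {
+        match self {
+            ManualError::Parse(err) => err.source(),
+            ManualError::Io(err) => err.source(),
+        }
+    }
+}
+
+// The `?` operator relies on these conversions.
+impl From<num::ParseIntError> for ManualError {
+    fn from(err: num::ParseIntError) -> Self {
+        ManualError::Parse(err)
+    }
+}
+
+impl From<io::Error> for ManualError {
+    fn from(err: io::Error) -> Self {
+        ManualError::Io(err)
+    }
+}
+
+// Hypothetical helper mirroring the body of `main` above.
+fn run(digits: &str, path: &str) -> Result<(), ManualError> {
+    let _num = i8::from_str_radix(digits, 16)?; // may fail with `ParseIntError`
+    let _file = fs::read_to_string(path)?;      // may fail with `io::Error`
+    Ok(())
+}
+```
+
+Two error types already take around forty lines of forwarding code, which is
+the boilerplate this RFC aims to have the compiler generate.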
+
+# Desugaring
+[reference-level-explanation]: #reference-level-explanation
+
+## Overview
+
+Let's take another look at the code from the summary section. This function
+has two branches which each return a different type which implements the
+[`Iterator` trait][`Iterator`]:
+
+[`Iterator`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html
+
+```rust
+fn multi_iter(x: i32) -> impl Iterator<Item = i32> {
+    match x {
+        0 => 1..10,                   // `std::ops::Range<i32>`
+        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
+    }
+}
+```
+
+This code should be desugared by the compiler into something resembling the following
+([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=af4c0e61df25acaada168449df9838d3)):
+
+```rust
+// anonymous enum generated by the compiler
+enum Enum {
+    A(std::ops::Range<i32>),
+    B(std::vec::IntoIter<i32>),
+}
+
+// trait implementation generated by the compiler,
+// delegates to underlying enum member's values
+impl Iterator for Enum {
+    type Item = i32;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        match self {
+            Enum::A(iter) => iter.next(),
+            Enum::B(iter) => iter.next(),
+        }
+    }
+
+    // ..repeat for the remaining 74 `Iterator` trait methods
+}
+
+// the desugared function now returns the generated enum
+fn multi_iter(x: i32) -> Enum {
+    match x {
+        0 => Enum::A(1..10),
+        _ => Enum::B(vec![5, 10].into_iter()),
+    }
+}
+```
+
+## Step-by-step guide
+
+This desugaring can be implemented using the following steps:
+
+1. Find all return expressions in the function, including the tail expression
+2. Define a new enum with a member for each of the function's return types
+3. Implement the traits declared in the `-> impl Trait` bound for the new enum,
+   matching on `self` and delegating to the enum's members
+4. Substitute the `-> impl Trait` signature with the concrete enum
+5. Wrap each of the function's return expressions in the appropriate enum member
+
+The hardest part of implementing this RFC will likely be the actual trait
+implementation on the enum, as each of the trait methods will need to be
+delegated to the underlying types.
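+
+To illustrate why every method needs delegating and not only the required
+ones, here is a sketch which extends the hand-written desugaring with
+`size_hint`. This is an assumption about what the generated code could look
+like, not a specification. If `size_hint` were not forwarded, the enum would
+fall back to the provided default, which returns `(0, None)`:
+
+```rust
+// Hypothetical stand-in for the compiler-generated enum.
+enum Enum {
+    A(std::ops::Range<i32>),
+    B(std::vec::IntoIter<i32>),
+}
+
+impl Iterator for Enum {
+    type Item = i32;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        match self {
+            Enum::A(iter) => iter.next(),
+            Enum::B(iter) => iter.next(),
+        }
+    }
+
+    // Forward the provided method too, so the exact length known by the
+    // underlying iterators is preserved.
+    fn size_hint(&self) -> (usize, Option<usize>) {
+        match self {
+            Enum::A(iter) => iter.size_hint(),
+            Enum::B(iter) => iter.size_hint(),
+        }
+    }
+}
+
+fn multi_iter(x: i32) -> Enum {
+    match x {
+        0 => Enum::A(1..10),
+        _ => Enum::B(vec![5, 10].into_iter()),
+    }
+}
+```
+
+With this in place `multi_iter(0).size_hint()` reports the range's exact
+length, which callers such as `collect` can use to pre-allocate.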
+
+# Interaction with lifetimes
+
+`dyn Trait` already supports multi-type _dynamic_ dispatch. The lifetime rules
+we're proposing for multi-type _static_ dispatch using `impl Trait` should
+mirror the existing rules we apply to `dyn Trait`:
+
+```rust
+fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> impl Iterator<Item = i32> + 'a {
+    match x {
+        0 => iter_a,                  // `&'a mut std::ops::Range<i32>`
+        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>`
+    }
+}
+```
+
+This code should be desugared by the compiler into something resembling the following
+([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=60ddacbb20c4068a0fff44a5481a7136)):
+
+```rust
+enum Enum<'a> {
+    A(&'a mut std::ops::Range<i32>),
+    B(std::vec::IntoIter<i32>),
+}
+
+impl<'a> Iterator for Enum<'a> {
+    type Item = i32;
+
+    fn next(&mut self) -> Option<Self::Item> {
+        match self {
+            Enum::A(iter) => iter.next(),
+            Enum::B(iter) => iter.next(),
+        }
+    }
+
+    // ..repeat for the remaining 74 `Iterator` trait methods
+}
+
+fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> Enum<'a> {
+    match x {
+        0 => Enum::A(iter_a),
+        _ => Enum::B(vec![5, 10].into_iter()),
+    }
+}
+```
+
+It should be fine if multiple iterators use the same lifetime. But only a single
+lifetime should be permitted on the return type, as is the case today when
+using `dyn Trait`:
+
+```rust
+// ❌ Fails to compile (Rust 1.66.0)
+// error[E0226]: only a single explicit lifetime bound is permitted
+fn fails<'a, 'b>() -> Box<dyn Iterator + 'a + 'b> {
+    ...
+}
+```
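+
+As a point of reference, the `dyn Trait` counterpart of the borrowed-plus-owned
+example compiles today. This sketch (the name `multi_iter_dyn` is made up) uses
+a single `'a` bound on the trait object, which is the behavior the multi-type
+`impl Trait` rules are meant to mirror:
+
+```rust
+// Dynamic-dispatch version: both branches are boxed behind one trait object
+// whose single lifetime bound `'a` covers the borrowed range.
+fn multi_iter_dyn<'a>(
+    x: i32,
+    iter_a: &'a mut std::ops::Range<i32>,
+) -> Box<dyn Iterator<Item = i32> + 'a> {
+    match x {
+        0 => Box::new(iter_a),                  // `&'a mut std::ops::Range<i32>`
+        _ => Box::new(vec![5, 10].into_iter()), // `std::vec::IntoIter<i32>`
+    }
+}
+```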
+
+# Prior art
+[prior-art]: #prior-art
+
+## auto-enums crate
+
+The [`auto-enums` crate][auto-enums] implements a limited variation of what is
+proposed in this RFC using procedural macros. It's limited to a predefined set
+of traits only, whereas this RFC enables multi-type RPIT to work for _all_
+traits. This limitation exists in the proc macro because it doesn't have access
+to the same type information as the compiler does, so the trait delegations
+have to be authored by hand. Here's an example of the crate being used to
+generate an `impl Iterator`:
+
+[auto-enums]: https://docs.rs/auto_enums/latest/auto_enums/
+
+```rust
+use auto_enums::auto_enum;
+
+#[auto_enum(Iterator)]
+fn foo(x: i32) -> impl Iterator<Item = i32> {
+    match x {
+        0 => 1..10,
+        _ => vec![5, 10].into_iter(),
+    }
+}
+```
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+## Anonymous enums
+
+Rust provides a way to declare anonymous structs using tuples. But we don't yet
+have a way to declare anonymous enums. A different way of interpreting the
+current RFC is as a way to declare anonymous type-erased enums, by expanding what
+RPIT can be used for. It stands to reason that there will be cases where people
+may want anonymous _non-type-erased_ enums too.
+
+Take for example the iterator code we've been using throughout this RFC, but
+instead of the `Iterator` yielding only `i32`, let's make it yield either `i32`
+or `&'static str`:
+
+```rust
+fn multi_iter(x: i32) -> impl Iterator<Item = /* which type? */> {
+    match x {
+        0 => 1..10,                              // yields `i32`
+        _ => vec!["hello", "world"].into_iter(), // yields `&'static str`
+    }
+}
+```
+
+One solution to make it compile would be to first map it to a type which can
+hold *either* `i32` or `&'static str`. The obvious answer would be to use an
+enum for this:
+
+```rust
+enum Enum {
+    A(i32),
+    B(&'static str),
+}
+
+fn multi_iter(x: i32) -> impl Iterator<Item = Enum> {
+    match x {
+        0 => (1..10).map(Enum::A),
+        _ => vec!["hello", "world"].into_iter().map(Enum::B),
+    }
+}
+```
+
+This code resembles the desugaring for multi-type RPIT we're proposing in this
+RFC. In fact, it may very well be that a lot of the internal compiler machinery
+used for multi-type RPIT could be reused for anonymous enums.
+
+The similarities might become even closer if we consider how "anonymous enums"
+could be used for error handling. Sometimes it can be useful to know which error
+was returned, so you can decide how to handle it. For this, RPIT isn't enough: we
+actually want to retain the underlying types so we can match on them. We might
+imagine the earlier error example could instead be written like this:
+
+```rust
+use std::{fs, io, num};
+
+// The earlier multi-type RPIT version returned `-> Result<(), impl Error>`.
+// This example declares an anonymous enum instead, using made-up syntax
+fn main() -> Result<(), num::ParseIntError | io::Error> {
+    let num = i8::from_str_radix("A", 16)?;       // `Result<_, std::num::ParseIntError>`
+    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>`
+    // ... use values here
+    Ok(())
+}
+```
+
+There are a lot of questions to be answered here. Which traits should
+this implement? What should the declaration syntax be? How could we match on
+values? These are enough to warrant a separate exploration and possibly an RFC
+in the future.
+
+## Language-level support for delegation/proxies
+
+One of the trickiest parts of implementing this RFC will be to delegate from the
+generated enum to the enum's individual members. If we implement this
+functionality in the compiler, it may be beneficial to generalize it and create
+syntax for it. We've already seen [limited support for
+delegation codegen][support] in Rust-Analyzer as a source action [^disclaimer], and [various crates]
+implementing delegation exist on Crates.io.
+
+[support]: https://github.com/rust-lang/rust-analyzer/issues/5944
+[various crates]: https://crates.io/search?q=delegate
+
+[^disclaimer]: I (Yosh) filed the issue and authored the extension to Rust-Analyzer
+for this, which itself was based on prior art found in the VS Code Java extension.
+
+To provide some sense of what this might look like, say we were authoring a
+[newtype] which wraps an iterator. We could imagine we'd write that in Rust
+by hand today like this:
+
+[newtype]: https://doc.rust-lang.org/rust-by-example/generics/new_types.html
+
+```rust
+struct NewIterator<T, const N: usize>(std::array::IntoIter<T, N>);
+
+impl<T, const N: usize> Iterator for NewIterator<T, N> {
+    type Item = T;
+
+    #[inline]
+    fn next(&mut self) -> Option<Self::Item> {
+        self.0.next()
+    }
+
+    // ..repeat for the remaining 74 `Iterator` trait methods
+}
+```
+
+Forwarding a single trait with a single method is doable. But we can imagine
+that repeating this for multiple traits and methods quickly becomes a hassle,
+and can obfuscate the _intent_ of the code. Instead, we could declare that
+`NewIterator` should _delegate_ its `Iterator` implementation to the iterator
+contained within. If we adopted a [Kotlin-like syntax], it could look like
+this:
+
+[Kotlin-like syntax]: https://kotlinlang.org/docs/delegation.html#overriding-a-member-of-an-interface-implemented-by-delegation
+
+```rust
+struct NewIterator<T, const N: usize>(std::array::IntoIter<T, N>);
+
+// Made-up syntax: use `Self.0` as the `Iterator` impl
+impl<T, const N: usize> Iterator for NewIterator<T, N> by Self.0;
+```
+
+There are many open questions here regarding semantics, syntax, and expanding it
+to other features such as method delegation. But given that the codegen for
+multi-type RPIT and delegation will share similarities, it may be worth
+exploring further in the future.