|
| 1 | +- Feature Name: (fill me in with a unique ident, `multi_type_return_position_impl_trait`) |
| 2 | +- Start Date: (fill me in with today's date, 2023-01-05) |
| 3 | +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) |
| 4 | +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +This RFC enables [Return Position Impl Trait (RPIT)][RPIT] to work in functions |
| 10 | +which return more than one type. This is achieved by desugaring the return type |
| 11 | +into an enum with members containing each of the returned types, and |
| 12 | +implementing traits which delegate to those members: |
| 13 | + |
| 14 | +[RPIT]: https://doc.rust-lang.org/stable/rust-by-example/trait/impl_trait.html#as-a-return-type |
| 15 | + |
| 16 | +```rust |
| 17 | +// Possible already |
| 18 | +fn single_iter() -> impl Iterator<Item = i32> { |
| 19 | +    1..10 // `std::ops::Range<i32>` |
| 20 | +} |
| 21 | + |
| 22 | +// Enabled by this RFC |
| 23 | +fn multi_iter(x: i32) -> impl Iterator<Item = i32> { |
| 24 | +    match x { |
| 25 | +        0 => 1..10, // `std::ops::Range<i32>` |
| 26 | +        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>` |
| 27 | +    } |
| 28 | +} |
| 29 | +``` |
| 30 | + |
| 31 | +# Motivation |
| 32 | +[motivation]: #motivation |
| 33 | + |
| 34 | +[Return Position Impl Trait (RPIT)][RPIT] is used when you want to return a value, but |
| 35 | +don't want to specify the type. In today's Rust (1.66.0 at the time of writing) |
| 36 | +it's only possible to use this when you're returning a single type from the |
| 37 | +function. The moment multiple types are returned from the function, the compiler |
| 38 | +will error. This can be frustrating, because it means you're likely to either |
| 39 | +resort to using `Box<dyn Trait>` or manually construct an enum to map the |
| 40 | +branches to. It's not always desirable or possible to use |
| 41 | +`Box<dyn Trait>`, and constructing an enum manually can be |
| 42 | +time-intensive and complicated, and it can obfuscate the intent of the |
| 43 | +code. |
| 44 | + |
| 45 | +What we're proposing here is not so much a new feature, as an expansion of the |
| 46 | +cases in which `impl Trait` can be used. We've seen previous efforts for this, |
| 47 | +in particular [RFC 1951: Expand Impl Trait][rfc1951] and more recently in [RFC |
| 48 | +2515: Type Alias Impl Trait (TAIT)][TAIT]. This continues that expansion by |
| 49 | +enabling more code to make use of RPIT. |
| 50 | + |
| 51 | +[rfc1951]: https://github.com/rust-lang/rfcs/blob/master/text/1951-expand-impl-trait.md |
| 52 | +[TAIT]: https://rust-lang.github.io/rfcs/2515-type_alias_impl_trait.html |
| 53 | + |
| 54 | +A motivating example for this is error handling: it's not uncommon to |
| 55 | +have a function return more than one error type, but you may not necessarily |
| 56 | +care about the exact errors returned. You may choose to return a `Box<dyn |
| 57 | +Error + 'static>`, which has the downside that [it does not itself implement |
| 58 | +`Error`][no-error]. Or you may choose to define your own enum of errors, which |
| 59 | +can be a lot of work and may obfuscate the actual intent of the code. It may |
| 60 | +sometimes be preferable to return an `impl Trait` instead: |
| 61 | + |
| 62 | +[no-error]: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=97894fc907fa2d292cbe909467d4db4b |
| 63 | + |
| 64 | +```rust |
| 65 | +use std::error::Error; |
| 66 | +use std::fs; |
| 67 | + |
| 68 | +// ❌ Multi-type RPIT does not yet compile (Rust 1.66.0) |
| 69 | +// error[E0282]: type annotations needed |
| 70 | +fn main() -> Result<(), impl Error> { |
| 71 | +    let num = i8::from_str_radix("A", 16)?; // `Result<_, std::num::ParseIntError>` |
| 72 | +    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>` |
| 73 | +    // ... use values here |
| 74 | +    Ok(()) |
| 75 | +} |
| 76 | +``` |
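For comparison, the hand-written enum alternative described above might look roughly like the sketch below on stable Rust. The names `AppError` and `load` are hypothetical and chosen only for illustration; the point is how much boilerplate (`Display`, `Error`, and two `From` impls) is needed before `?` can be used with both error types.

```rust
use std::error::Error;
use std::fmt;
use std::fs;
use std::io;
use std::num::ParseIntError;

// Hypothetical hand-written error enum: one variant per underlying error type.
#[derive(Debug)]
enum AppError {
    Parse(ParseIntError),
    Io(io::Error),
}

// `Error` requires `Display`, so each variant has to be formatted by hand.
impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Parse(err) => write!(f, "parse error: {err}"),
            AppError::Io(err) => write!(f, "io error: {err}"),
        }
    }
}

// Delegate `source` to the wrapped error so the error chain stays intact.
impl Error for AppError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            AppError::Parse(err) => Some(err),
            AppError::Io(err) => Some(err),
        }
    }
}

// The `From` impls are what allow `?` to convert into `AppError`.
impl From<ParseIntError> for AppError {
    fn from(err: ParseIntError) -> Self {
        AppError::Parse(err)
    }
}

impl From<io::Error> for AppError {
    fn from(err: io::Error) -> Self {
        AppError::Io(err)
    }
}

// Hypothetical function mirroring the body of the `main` example above.
fn load(hex: &str, path: &str) -> Result<(i8, String), AppError> {
    let num = i8::from_str_radix(hex, 16)?;
    let file = fs::read_to_string(path)?;
    Ok((num, file))
}
```

Every additional error type means another variant, another `From` impl, and another arm in each `match`, which is the bookkeeping multi-type RPIT would hand to the compiler.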
| 77 | + |
| 78 | +# Desugaring |
| 79 | +[reference-level-explanation]: #reference-level-explanation |
| 80 | + |
| 81 | +## Overview |
| 82 | + |
| 83 | +Let's take another look at the code from the summary. This function |
| 84 | +has two branches, each returning a different type which implements the |
| 85 | +[`Iterator` trait][`Iterator`]: |
| 86 | + |
| 87 | +[`Iterator`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html |
| 88 | + |
| 89 | +```rust |
| 90 | +fn multi_iter(x: i32) -> impl Iterator<Item = i32> { |
| 91 | +    match x { |
| 92 | +        0 => 1..10, // `std::ops::Range<i32>` |
| 93 | +        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>` |
| 94 | +    } |
| 95 | +} |
| 96 | +``` |
| 97 | + |
| 98 | +This code should be desugared by the compiler into something resembling the following |
| 99 | +([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=af4c0e61df25acaada168449df9838d3)): |
| 100 | + |
| 101 | +```rust |
| 102 | +// anonymous enum generated by the compiler |
| 103 | +enum Enum { |
| 104 | +    A(std::ops::Range<i32>), |
| 105 | +    B(std::vec::IntoIter<i32>), |
| 106 | +} |
| 107 | + |
| 108 | +// trait implementation generated by the compiler, |
| 109 | +// delegates to underlying enum member's values |
| 110 | +impl Iterator for Enum { |
| 111 | +    type Item = i32; |
| 112 | + |
| 113 | +    fn next(&mut self) -> Option<Self::Item> { |
| 114 | +        match self { |
| 115 | +            Enum::A(iter) => iter.next(), |
| 116 | +            Enum::B(iter) => iter.next(), |
| 117 | +        } |
| 118 | +    } |
| 119 | + |
| 120 | +    // ..repeat for the remaining 74 `Iterator` trait methods |
| 121 | +} |
| 122 | + |
| 123 | +// the desugared function now returns the generated enum |
| 124 | +fn multi_iter(x: i32) -> Enum { |
| 125 | +    match x { |
| 126 | +        0 => Enum::A(1..10), |
| 127 | +        _ => Enum::B(vec![5, 10].into_iter()), |
| 128 | +    } |
| 129 | +} |
| 130 | +``` |
| 131 | + |
| 132 | +## Step-by-step guide |
| 133 | + |
| 134 | +This desugaring can be implemented using the following steps: |
| 135 | + |
| 136 | +1. Find all return points in the function (`return` expressions and the tail expression) |
| 137 | +2. Define a new enum with a member for each of the function's return types |
| 138 | +3. Implement the traits declared in the `-> impl Trait` bound for the new enum, |
| 139 | +   matching on `self` and delegating to the enum's members |
| 140 | +4. Substitute the `-> impl Trait` signature with the concrete enum |
| 141 | +5. Wrap each of the function's returned values in the appropriate enum member |
| 142 | + |
| 143 | +The hardest part of implementing this RFC will likely be the actual trait |
| 144 | +implementation on the enum, as each of the trait methods will need to be |
| 145 | +delegated to the underlying types. |
| 146 | + |
| 147 | +# Interaction with lifetimes |
| 148 | + |
| 149 | +`dyn Trait` already supports multi-type _dynamic_ dispatch. The rules we're |
| 150 | +proposing for multi-type _static_ dispatch using `impl Trait` should mirror the |
| 151 | +existing rules we apply to `dyn Trait`. In particular, multi-type `impl Trait` |
| 152 | +should follow the same lifetime rules as `dyn Trait`: |
| 153 | + |
| 154 | +```rust |
| 155 | +fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> impl Iterator<Item = i32> + 'a { |
| 156 | +    match x { |
| 157 | +        0 => iter_a, // `&'a mut std::ops::Range<i32>` |
| 158 | +        _ => vec![5, 10].into_iter(), // `std::vec::IntoIter<i32>` |
| 159 | +    } |
| 160 | +} |
| 161 | +``` |
| 162 | + |
| 163 | +This code should be desugared by the compiler into something resembling the following |
| 164 | +([playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=60ddacbb20c4068a0fff44a5481a7136)): |
| 165 | + |
| 166 | +```rust |
| 167 | +enum Enum<'a> { |
| 168 | +    A(&'a mut std::ops::Range<i32>), |
| 169 | +    B(std::vec::IntoIter<i32>), |
| 170 | +} |
| 171 | + |
| 172 | +impl<'a> Iterator for Enum<'a> { |
| 173 | +    type Item = i32; |
| 174 | + |
| 175 | +    fn next(&mut self) -> Option<Self::Item> { |
| 176 | +        match self { |
| 177 | +            Enum::A(iter) => iter.next(), |
| 178 | +            Enum::B(iter) => iter.next(), |
| 179 | +        } |
| 180 | +    } |
| 181 | + |
| 182 | +    // ..repeat for the remaining 74 `Iterator` trait methods |
| 183 | +} |
| 184 | + |
| 185 | +fn multi_iter<'a>(x: i32, iter_a: &'a mut std::ops::Range<i32>) -> Enum<'a> { |
| 186 | +    match x { |
| 187 | +        0 => Enum::A(iter_a), |
| 188 | +        _ => Enum::B(vec![5, 10].into_iter()), |
| 189 | +    } |
| 190 | +} |
| 191 | +``` |
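For comparison, a `dyn Trait` version of the same function already compiles on stable Rust today, using the same single `'a` bound on the return type. This is a minimal sketch; `multi_iter_dyn` is a hypothetical name used only for this comparison and is not part of the RFC.

```rust
use std::ops::Range;

// The `dyn Trait` counterpart of `multi_iter`: both branches are boxed and
// coerced to the same trait object type, bounded by the single lifetime `'a`.
fn multi_iter_dyn<'a>(
    x: i32,
    iter_a: &'a mut Range<i32>,
) -> Box<dyn Iterator<Item = i32> + 'a> {
    match x {
        // `&'a mut Range<i32>` is itself an `Iterator`, so it can be boxed directly.
        0 => Box::new(iter_a),
        // `std::vec::IntoIter<i32>` borrows nothing, so it satisfies `'a` trivially.
        _ => Box::new(vec![5, 10].into_iter()),
    }
}
```

The multi-type `impl Trait` version is expected to accept the same lifetime bound, with the heap allocation and dynamic dispatch replaced by the generated enum.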
| 192 | + |
| 193 | +It should be fine if multiple iterators use the same lifetime. But only a single |
| 194 | +lifetime should be permitted on the return type, as is the case today when |
| 195 | +using `dyn Trait`: |
| 196 | + |
| 197 | +```rust |
| 198 | +// ❌ Fails to compile (Rust 1.66.0) |
| 199 | +// error[E0226]: only a single explicit lifetime bound is permitted |
| 200 | +fn fails<'a, 'b>() -> Box<dyn Iterator<Item = i32> + 'a + 'b> { |
| 201 | +    ... |
| 202 | +} |
| 203 | +``` |
| 204 | + |
| 205 | +# Prior art |
| 206 | +[prior-art]: #prior-art |
| 207 | + |
| 208 | +## auto-enums crate |
| 209 | + |
| 210 | +The [`auto-enums` crate][auto-enums] implements a limited variation of what is |
| 211 | +proposed in this RFC using procedural macros. It's limited to a predefined set |
| 212 | +of traits only, whereas this RFC enables multi-type RPIT to work for _all_ |
| 213 | +traits. This limitation exists in the proc macro because it doesn't have access |
| 214 | +to the same type information as the compiler does, so the trait delegations |
| 215 | +have to be authored by hand. Here's an example of the crate being used to |
| 216 | +generate an `impl Iterator`: |
| 217 | + |
| 218 | +[auto-enums]: https://docs.rs/auto_enums/latest/auto_enums/ |
| 219 | + |
| 220 | +```rust |
| 221 | +use auto_enums::auto_enum; |
| 222 | + |
| 223 | +#[auto_enum(Iterator)] |
| 224 | +fn foo(x: i32) -> impl Iterator<Item = i32> { |
| 225 | +    match x { |
| 226 | +        0 => 1..10, |
| 227 | +        _ => vec![5, 10].into_iter(), |
| 228 | +    } |
| 229 | +} |
| 230 | +``` |
| 231 | + |
| 232 | +# Future possibilities |
| 233 | +[future-possibilities]: #future-possibilities |
| 234 | + |
| 235 | +## Anonymous enums |
| 236 | + |
| 237 | +Rust provides a way to declare anonymous structs using tuples. But we don't yet |
| 238 | +have a way to declare anonymous enums. A different way of interpreting the |
| 239 | +current RFC is as a way to declare anonymous type-erased enums, by expanding what |
| 240 | +RPIT can be used for. It stands to reason that there will be cases where people |
| 241 | +may want anonymous _non-type-erased_ enums too. |
| 242 | + |
| 243 | +Take for example the iterator code we've been using throughout this RFC. But |
| 244 | +instead of `Iterator` yielding `i32`, let's make it yield `i32` or `&'static |
| 245 | +str`: |
| 246 | + |
| 247 | +```rust |
| 248 | +fn multi_iter(x: i32) -> impl Iterator<Item = /* which type? */> { |
| 249 | +    match x { |
| 250 | +        0 => 1..10, // yields `i32` |
| 251 | +        _ => vec!["hello", "world"].into_iter(), // yields `&'static str` |
| 252 | +    } |
| 253 | +} |
| 254 | +``` |
| 255 | + |
| 256 | +One solution to make it compile would be to first map the items to a type which can |
| 257 | +hold *either* `i32` or `&'static str`. The obvious answer would be to use an enum for |
| 258 | +this: |
| 259 | + |
| 260 | +```rust |
| 261 | +enum Enum { |
| 262 | +    A(i32), |
| 263 | +    B(&'static str), |
| 264 | +} |
| 265 | + |
| 266 | +fn multi_iter(x: i32) -> impl Iterator<Item = Enum> { |
| 267 | +    match x { |
| 268 | +        0 => (1..10).map(Enum::A), // parentheses needed: `1..10.map(..)` parses as `1..(10.map(..))` |
| 269 | +        _ => vec!["hello", "world"].into_iter().map(Enum::B), |
| 270 | +    } |
| 271 | +} |
| 272 | +``` |
| 273 | + |
| 274 | +This code resembles the desugaring for multi-type RPIT we're proposing in this |
| 275 | +RFC. In fact, it may very well be that a lot of the internal compiler machinery |
| 276 | +used for multi-type RPIT could be reused for anonymous enums. |
| 277 | + |
| 278 | +The similarities might become even closer if we consider how "anonymous enums" |
| 279 | +could be used for error handling. Sometimes it can be useful to know which error |
| 280 | +was returned, so you can decide how to handle it. For this, RPIT isn't enough: we |
| 281 | +actually want to retain the underlying types so we can match on them. We might |
| 282 | +imagine the earlier error example could instead be written like this: |
| 283 | + |
| 284 | +```rust |
| 285 | +use std::{fs, io, num}; |
| 286 | + |
| 287 | +// The earlier multi-type RPIT version returned `-> Result<(), impl Error>`. |
| 288 | +// This example declares an anonymous enum instead, using made-up syntax |
| 289 | +fn main() -> Result<(), num::ParseIntError | io::Error> { |
| 290 | +    let num = i8::from_str_radix("A", 16)?; // `Result<_, std::num::ParseIntError>` |
| 291 | +    let file = fs::read_to_string("./file.csv")?; // `Result<_, std::io::Error>` |
| 292 | +    // ... use values here |
| 293 | +    Ok(()) |
| 294 | +} |
| 295 | +``` |
| 296 | + |
| 297 | +There are a lot of questions to be answered here. Which traits should |
| 298 | +this implement? What should the declaration syntax be? How could we match on |
| 299 | +values? Together, these are enough to warrant a separate exploration and |
| 300 | +possibly an RFC in the future. |
| 301 | + |
| 302 | +## Language-level support for delegation/proxies |
| 303 | + |
| 304 | +One of the trickiest parts of implementing this RFC will be to delegate from the |
| 305 | +generated enum to the enum's individual members. If we implement this |
| 306 | +functionality in the compiler, it may be beneficial to generalize it |
| 307 | +and create syntax for it. We've already seen [limited support for |
| 308 | +delegation codegen][support] in Rust-Analyzer as a source action [^disclaimer], and [various crates] |
| 309 | +implementing delegation exist on Crates.io. |
| 310 | + |
| 311 | +[support]: https://github.com/rust-lang/rust-analyzer/issues/5944 |
| 312 | +[various crates]: https://crates.io/search?q=delegate |
| 313 | + |
| 314 | +[^disclaimer]: I (Yosh) filed the issue and authored the extension to Rust-Analyzer |
| 315 | +for this, which itself was based on prior art found in the VS Code Java extension. |
| 316 | + |
| 317 | +To give a sense of what this might look like, say we were authoring some |
| 318 | +[newtype] which wraps an iterator. We could imagine we'd write that in Rust |
| 319 | +by hand today like this: |
| 320 | + |
| 321 | +[newtype]: https://doc.rust-lang.org/rust-by-example/generics/new_types.html |
| 322 | + |
| 323 | +```rust |
| 324 | +struct NewIterator<T, const N: usize>(std::array::IntoIter<T, N>); |
| 325 | + |
| 326 | +impl<T, const N: usize> Iterator for NewIterator<T, N> { |
| 327 | +    type Item = T; |
| 328 | + |
| 329 | +    #[inline] |
| 330 | +    fn next(&mut self) -> Option<Self::Item> { |
| 331 | +        self.0.next() |
| 332 | +    } |
| 333 | + |
| 334 | +    // ..repeat for the remaining 74 `Iterator` trait methods |
| 335 | +} |
| 336 | +``` |
| 337 | + |
| 338 | +Forwarding a single trait with a single method is doable. But we can imagine |
| 339 | +that repeating this for multiple traits and methods quickly becomes a hassle, |
| 340 | +and can obfuscate the _intent_ of the code. Instead, we could declare that |
| 341 | +`NewIterator` should _delegate_ its `Iterator` implementation to the iterator |
| 342 | +contained within. If we adopted a [Kotlin-like syntax], we could imagine it |
| 343 | +looking like this: |
| 344 | + |
| 345 | +[Kotlin-like syntax]: https://kotlinlang.org/docs/delegation.html#overriding-a-member-of-an-interface-implemented-by-delegation |
| 346 | + |
| 347 | +```rust |
| 348 | +struct NewIterator<T, const N: usize>(std::array::IntoIter<T, N>); |
| 349 | + |
| 350 | +impl<T, const N: usize> Iterator for NewIterator<T, N> by Self.0; // Use `Self.0` as the `Iterator` impl |
| 351 | +``` |
| 352 | + |
| 353 | +There are many open questions here regarding semantics, syntax, and expanding it |
| 354 | +to other features such as method delegation. But given the codegen for both |
| 355 | +multi-type RPIT and delegation will share similarities, it may be worth |
| 356 | +exploring further in the future. |