diff --git a/text/3573-is.md b/text/3573-is.md new file mode 100644 index 00000000000..f355b03b91d --- /dev/null +++ b/text/3573-is.md @@ -0,0 +1,299 @@ +- Feature Name: `is` +- Start Date: 2024-02-16 +- RFC PR: [rust-lang/rfcs#3573](https://github.com/rust-lang/rfcs/pull/3573) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary + +Introduce an `is` operator in Rust 2024, to test if an expression matches a +pattern and bind the variables in the pattern. + +# Motivation + +This RFC introduces an `is` operator that tests if an expression matches a +pattern, and if so, binds the variables bound by the pattern and evaluates to +true. This operator can be used as part of any boolean expression, and combined +with boolean operators. + +Previous discussions around `let`-chains have treated the `is` operator as an +alternative on the basis that they serve similar functions, rather than +proposing that they can and should coexist. This RFC proposes that we allow +`let`-chaining *and* add the `is` operator. + +`if`-`let` provides a natural extension of `let` for fallible bindings, and +highlights the binding by putting it on the left, like a `let` statement. +Allowing developers to chain multiple `let` operations and other expressions in +the `if` condition provides a natural extension that simplifies what would +otherwise require complex nested conditionals. As the `let`-chains RFC notes, +this is a feature people already expect to work. + +The `is` operator similarly allows developers to chain multiple match-and-bind +operations and simplify what would otherwise require complex nested +conditionals. However, the `is` operator allows writing and reading a pattern +match from left-to-right, which reads more naturally in many circumstances. For +instance, consider an expression like `x is Some(y) && y > 5`; that boolean +expression reads more naturally from left-to-right than +`let Some(y) = x && y > 5`. + +This is even more true at the end of a longer expression chain, such as +`x.method()?.another_method().await? is Some(y)`. Rust method chaining and `?` +and `.await` all encourage writing code that reads in operation order from left +to right, and `is` fits naturally at the end of such a sequence. + +Having an `is` operator would also help to reduce the proliferation of methods +on types such as `Option` and `Result`, by allowing prospective users of those +methods to write a condition using `is` instead. While any such condition could +equivalently be expressed using `let`-chains, the binding would then move +further away from the condition expression referencing the binding, which would +result in a less natural reading order for the expression. + +Consider the following examples: + +```rust +if expr_producing_option().is_some_and(|v| condition(v)) + +if let Some(v) = expr_producing_option() && condition(v) + +if expr_producing_option() is Some(v) && condition(v) +``` + +The condition using `is` is a natural translation from the `is_some_and` +method, whereas the if-let construction requires reversing the binding of `v` +and the expression producing the option. This seems sufficiently cumbersome in +some cases that the absence of `is` would motivate continued use and +development of helper methods. + +# Guide-level explanation + +Rust provides an `is` operator, which can be used in any expression: +`EXPR is PATTERN` + +This operator tests if the value of `EXPR` matches the specified `PATTERN`; see + for details on patterns. + +If the `EXPR` matches the `PATTERN`, the `is` expression evaluates to `true`, +and additionally binds any bindings specified in `PATTERN` in the current scope +for code subsequently executed along the path where the `is` expression +is known to be `true`. + +For example: + +```rust +if an_option is Some(x) && x > 3 { + println!("{x}"); +} +``` + +The bindings in the pattern are not bound along any code path potentially +reachable where the expression did not match: + +```rust +if (an_option is Some(x) && x > 3) || (more_conditions /* x is not bound here*/) { + // x is not bound here +} else { + // x is not bound here +} +// x is not bound here +``` + +The pattern may use alternation (within parentheses), but must have the same +bindings in every alternative: + +```rust +if color is (RGB(r, g, b) | RGBA(r, g, b, _)) && r == b && g < 10 { + println!("condition met") +} + +// ERROR: `a` is not bound in all alternatives of the pattern +if color is (RGB(r, g, b) | RGBA(r, g, b, a)) && r == b && g < 10 { + println!("condition met") +} +``` + +`is` may appear anywhere a boolean expression is accepted: + +```rust +func(x is Some(y) && y > 3); +``` + +The `is` operator may not appear as a statement, because the bindings won't be +usable after the end of the statement, and because the boolean return value is +ignored. The compiler will issue a deny-by-default lint in that case, and +suggest using `let` if the user wants to bind a pattern for the rest of the +scope. + +```rust +// deny-by-default lint: the binding `x` isn't usable after the statement, and the value of `is` is ignored. +// Suggestion: use `let` to introduce a binding in this scope: `let x = an_expression();`. +an_expression() is x; +``` + +# Reference-level explanation + +Add a new [operator +expression](https://doc.rust-lang.org/reference/expressions/operator-expr.html), +`IsExpression`: + +> **Syntax**\ +> _IsExpression_ :\ +>    _Expression_ `is` _PatternNoTopAlt_ + +Add `is` to the [operator +precedence](https://doc.rust-lang.org/reference/expressions.html#expression-precedence) +table, at the same precedence level as `==`, and likewise non-associative +(requiring parentheses). + +Detect `is` appearing as a top-level statement and produce an error, with a +rustfix suggestion to use `let` instead. + +# Drawbacks + +Introducing both the `is` operator and `let`-chains would provide two different +ways to do a pattern match as part of a condition. Having more than one way to +do something could lead people to wonder if there's a difference; we would need +to clearly communicate that they serve similar purposes. + +An `is` operator will produce a name conflict with [the `is` method on +`dyn Any`](https://doc.rust-lang.org/std/any/trait.Any.html#method.is) in the +standard library, and with the (relatively few) methods named `is` in the +ecosystem. This will not break any existing Rust code, as the operator will +only exist in the Rust 2024 edition and newer. The Rust standard library and +any other library that wants to avoid requiring the use of `r#is` in Rust 2024 +and newer could provide aliases of these methods under a new name; for +instance, the standard library could additionally provide `Any::is` under a new +name `is_type`. + +# Rationale and alternatives + +As noted in the [motivation](#motivation) section, adding the `is` operator +allows writing pattern matches from left-to-right, which reads more naturally +in some conditionals and fits well with method chains and similar. As noted +under [prior art](#prior-art), other languages such as C# already have this +exact operator for this exact purpose. + +We could choose not to add this operator, and have *only* `let`-chains. This +would provide equivalent functionality, semantically; however, it would force +pattern-matches to be written with the pattern on the left, which won't read as +naturally in some expressions. Notably, this seems unlikely to do as effective +a job of reducing the desire for `is_variant()` methods and helpers like +`is_some_and(...)`. + +We could choose to add the `is` operator and *not* add `let`-chains. However, +many people *already* expect `let`-chains to work as an obvious extrapolation +from seeing `if let`/`while let` syntax. + +We could add this operator using punctuation instead (e.g. `~`). However, there +is no "natural" operator that conveys "pattern match" to people (the way that +`+` is well-known as addition). Using punctuation also seems likely to make the +language more punctuation-heavy, less obvious, and less readable. + +We could add this operator using a different name. Most other names, however, +seem likely to be longer and to fit people's expectations less well, given the +widespread precedent of `is_xyz()` methods. The most obvious choice would be +something like `matches`. + +We could permit top-level alternation in the pattern. However, this seems +likely to produce visual and semantic ambiguity. This is technically a one-way +door, in that `x is true|false` would parse differently depending on our +decision here; however, the use of a pattern-match for a boolean here seems +unlikely, redundant, and in poor style. In any case, the compiler could easily +detect most attempts at top-level alternation and suggest adding parentheses. + +# Prior art + +`let`-chains provide prior art for having this functionality in the language. + +The `matches!` macro similarly provides precedent for having pattern matches in +boolean expressions. `is` would likely be a natural replacement for most uses +of `matches!`. + +Many Rust enums provide `is_variant()` functions: +- `is_some()` and `is_none()` for `Option` +- `is_ok()` and `is_err()` for `Result` +- `is_eq()` and `is_lt()` and `is_gt()` for `Ordering` +- `is_ipv4()` and `is_ipv6()` for `SocketAddr` +- `is_break()` and `is_continue()` for `ControlFlow` +- `is_borrowed()` and `is_owned()` for `Cow` +- `is_pending()` and `is_ready()` for `Poll` + +These functions serve as precedent for using the word `is` for this purpose. +(Note that this RFC does *not* propose to encourage people to switch away from +these methods. Among other things, users may wish to use e.g. `Option::is_some` +as a function pointer rather than having to write a closure using `is`.) + +There's extensive prior art in Rust for having more than one way to accomplish +the same thing. You can write a `for` loop or you can write iterator code. You +can use combinators or write a `match` or write an `if let`. You can write +`let`-`else` or use a `match`. You can write `x > 3` or `3 < x`. You can write +`x + 3` or `3 + x`. Rust does not normatively require one alternative. We do, +in general, avoid adding constructs that are entirely redundant with each +other. However, this RFC proposes that the constructs are *not* redundant: some +code will be more readable with `let`-chains, and some code will be more +readable with `is`. + +[Kotlin has a similar `is` +operator](https://kotlinlang.org/docs/typecasts.html#smart-casts) for casts to +a type, which are similarly flow-sensitive: in the code path where the `is` +test has succeeded, subsequent code can use the tested value as that type. + +[C# has an `is` +operator](https://learn.microsoft.com/en-US/dotnet/csharp/language-reference/operators/is) +for type-matching and pattern matching, which supports the same style of +chaining as the proposed `is` operator for Rust. For instance, the following +are valid C# code: + +```csharp +if (expr is int x && other_expr is int y) +{ + func(x - y); +} + +if (bounding_box is { P1.X: 0 } or { P2.Y: 0 }) +{ + check(bounding_box); +} + +if (GetData() is var data + && data.Field == value + && data.OtherField is [2, 4, 6]) +{ + show(data); +} +``` + +# Unresolved questions + +Can we make `x is 10..=20` work without requiring the user to parenthesize the +pattern, or would that not be possible with our precedence? We could +potentially make this work over an edition boundary, but would it be worth the +churn? + +Pattern types propose using `is` for a different purpose, in types rather than +in expressions: `u32 is 1..` would be a `u32` that can never be `0`, and +`Result is Err(_)` would be a `Result` that can never be the `Ok` +variant. Can we introduce the `is` operator in expressions without conflicting +with its potential use in pattern types? We could require the use of +parentheses in `v as u32 is 1..`, to force it to be parsed as either `(v as +u32) is 1..` or `v as (u32 is 1..)` (assuming that pattern types can be used in +`as` in the first place). + +What new method name should we add to `dyn Any` as an alias for `is`? (This +does not need to be settled in this RFC, but should be handled +contemporaneously with the implementation, and tracked in the tracking issue.) + +Should we add a rustc lint or clippy lint (with rustfix suggestion) to turn +`matches!` into `is`? + +# Future possibilities + +As with `let`-chains, we *could* potentially allow cases involving `||` to bind +the same patterns, such as `expr1 is V1(value) || expr2 is V2(value)`. This RFC +does *not* propose allowing that syntax to bind variables, to avoid confusing +code. + +*If* in a future edition we decide to allow this for `let` chains, we should +similarly allow it for `is`. This RFC does not make or recommend such a future +proposal. + +We could choose to offer a clippy lint or even a rustc lint, with a rustfix +suggestion, to turn `matches!` into `is`.