|
| 1 | +- Feature Name: `slice_patterns` |
| 2 | +- Start Date: 2018-03-08 |
| 3 | +- RFC PR: [rust-lang/rfcs#2359](https://github.com/rust-lang/rfcs/pull/2359) |
| 4 | +- Rust Issue: [rust-lang/rust#62254](https://github.com/rust-lang/rust/issues/62254) |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Permit matching sub-slices and sub-arrays with the syntax `..`. |
| 10 | +Binding a variable to the expression matched by a subslice pattern can be done |
| 11 | +using syntax `<IDENT> @ ..` similar to the existing `<IDENT> @ <PAT>` syntax, for example: |
| 12 | + |
| 13 | +```rust |
| 14 | +// Binding a sub-array: |
| 15 | +let [x, y @ .., z] = [1, 2, 3, 4]; // `y: [i32; 2] = [2, 3]`
| 16 | + |
| 17 | +// Binding a sub-slice: |
| 18 | +let [x, y @ .., z]: &[u8] = &[1, 2, 3, 4]; // `y: &[u8] = &[2, 3]`
| 19 | +``` |
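The two bindings above can be sketched as small functions. This is a minimal illustration, not part of the RFC text: the names `split_ends` and `middle_of_four` are hypothetical, and the slice case uses `match` on the assumption that a slice pattern with a fixed minimum length is refutable and needs a fallback arm.

```rust
/// Hypothetical helper: splits a slice into its first element, the middle
/// sub-slice, and its last element. Slices shorter than two elements do not
/// match the pattern, so they fall through to `None`.
fn split_ends(s: &[u8]) -> Option<(u8, &[u8], u8)> {
    match s {
        // `middle` binds the sub-slice and has type `&[u8]`.
        [first, middle @ .., last] => Some((*first, middle, *last)),
        _ => None,
    }
}

/// Hypothetical helper: the array counterpart. The length is known
/// statically, so the pattern is irrefutable and `let` is enough.
fn middle_of_four(a: [i32; 4]) -> [i32; 2] {
    // `middle` binds the sub-array and has type `[i32; 2]`.
    let [_first, middle @ .., _last] = a;
    middle
}
```

Calling `middle_of_four([1, 2, 3, 4])` yields the sub-array `[2, 3]`, while `split_ends` returns `None` for a one-element slice.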
| 20 | + |
| 21 | +# Motivation |
| 22 | +[motivation]: #motivation |
| 23 | + |
| 24 | +## General motivation |
| 25 | +Stabilization of slice patterns with subslices is currently blocked on finalizing the syntax for
| 26 | +these subslices. |
| 27 | +This RFC proposes a syntax for stabilization. |
| 28 | + |
| 29 | +## Motivation for the specific syntax |
| 30 | + |
| 31 | +### The shortcut form: `..` |
| 32 | + |
| 33 | +This form is already used with the meaning "rest of the list" in struct patterns, tuple struct
| 34 | +patterns and tuple patterns, so it would be logical to use it for slice patterns as well.
| 35 | +And indeed, unstable Rust has used `..` with this meaning since long before 1.0.
| 36 | + |
| 37 | +# Guide-level explanation |
| 38 | +[guide-level-explanation]: #guide-level-explanation |
| 39 | + |
| 40 | +Sub-slices and sub-arrays can be matched using `..`, and `<IDENT> @ ..` can be used to bind
| 41 | +these sub-slices and sub-arrays to an identifier.
| 42 | + |
| 43 | +```rust |
| 44 | +// Matching slices using `ref` and `ref mut` patterns:
| 45 | +let mut v = vec![1, 2, 3]; |
| 46 | +match v[..] { |
| 47 | + [1, ref subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &[i32] |
| 48 | + [5, ref subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32] |
| 49 | + [ref subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32] |
| 50 | + [x, .., y] => assert!(v.len() >= 2), |
| 51 | + [..] => {} // Always matches |
| 52 | +} |
| 53 | +match v[..] { |
| 54 | + [1, ref mut subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &mut [i32] |
| 55 | + [5, ref mut subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32] |
| 56 | + [ref mut subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32] |
| 57 | + [x, .., y] => assert!(v.len() >= 2), |
| 58 | + [..] => {} // Always matches |
| 59 | +} |
| 60 | + |
| 61 | +// Matching slices using default-binding-modes: |
| 62 | +let mut v = vec![1, 2, 3]; |
| 63 | +match &v[..] { |
| 64 | + [1, subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &[i32] |
| 65 | + [5, subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32] |
| 66 | + [subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32] |
| 67 | + [x, .., y] => assert!(v.len() >= 2), |
| 68 | + [..] => {} // Always matches |
| 69 | +} |
| 70 | +match &mut v[..] { |
| 71 | + [1, subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &mut [i32] |
| 72 | + [5, subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32] |
| 73 | + [subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32] |
| 74 | + [x, .., y] => assert!(v.len() >= 2), |
| 75 | + [..] => {} // Always matches |
| 76 | +} |
| 77 | + |
| 78 | +// Matching slices by value (error): |
| 79 | +let mut v = vec![1, 2, 3]; |
| 80 | +match v[..] { |
| 81 | + [x @ ..] => {} // ERROR cannot move out of type `[i32]`, a non-copy slice |
| 82 | +} |
| 83 | + |
| 84 | +// Matching arrays by-value and by reference (explicitly or using default-binding-modes): |
| 85 | +let mut v = [1, 2, 3]; |
| 86 | +match v { |
| 87 | + [1, subarray @ .., 3] => assert_eq!(subarray, [2]), // typeof(subarray) == [i32; 1] |
| 88 | + [5, subarray @ ..] => has_type::<[i32; 2]>(subarray), // typeof(subarray) == [i32; 2] |
| 89 | + [subarray @ .., 6] => has_type::<[i32; 2]>(subarray), // typeof(subarray) == [i32; 2]
| 90 | + [x, .., y] => has_type::<i32>(x), |
| 91 | + [..] => {}, |
| 92 | +} |
| 93 | +match v { |
| 94 | + [1, ref subarray @ .., 3] => assert_eq!(*subarray, [2]), // typeof(subarray) == &[i32; 1]
| 95 | + [5, ref subarray @ ..] => has_type::<&[i32; 2]>(subarray), // typeof(subarray) == &[i32; 2]
| 96 | + [ref subarray @ .., 6] => has_type::<&[i32; 2]>(subarray), // typeof(subarray) == &[i32; 2]
| 97 | + [ref x, .., ref y] => has_type::<&i32>(x),
| 98 | + [..] => {}, |
| 99 | +} |
| 100 | +match &mut v { |
| 101 | + [1, subarray @ .., 3] => assert_eq!(*subarray, [2]), // typeof(subarray) == &mut [i32; 1]
| 102 | + [5, subarray @ ..] => has_type::<&mut [i32; 2]>(subarray), // typeof(subarray) == &mut [i32; 2]
| 103 | + [subarray @ .., 6] => has_type::<&mut [i32; 2]>(subarray), // typeof(subarray) == &mut [i32; 2]
| 104 | + [x, .., y] => has_type::<&mut i32>(x), |
| 105 | + [..] => {}, |
| 106 | +} |
| 107 | +``` |
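The array examples above call `has_type`, which the RFC text does not define. A minimal sketch, assuming `has_type` is just a generic function that type-checks its argument, shows the three binding modes for sub-arrays in one compilable unit; `array_binding_modes` is a hypothetical name chosen for this illustration.

```rust
/// Assumed definition: compiles only when the argument has exactly type `T`.
fn has_type<T>(_value: T) {}

/// Hypothetical walkthrough of by-value, `ref`, and default-binding-mode
/// sub-array bindings on a `[i32; 3]`.
fn array_binding_modes() -> [i32; 3] {
    let mut v = [1, 2, 3];

    // By value: the sub-array is copied out of `v`.
    match v {
        [1, subarray @ .., 3] => has_type::<[i32; 1]>(subarray),
        [x, .., _y] => has_type::<i32>(x),
    }

    // Explicit `ref`: the binding borrows from `v`.
    match v {
        [_, ref subarray @ ..] => has_type::<&[i32; 2]>(subarray),
    }

    // Default binding modes: matching on `&mut v` makes the binding `&mut`.
    match &mut v {
        [_, subarray @ ..] => {
            has_type::<&mut [i32; 2]>(&mut *subarray);
            subarray[0] = 20; // writes through to `v[1]`
        }
    }

    v
}
```

The write through the `&mut [i32; 2]` binding is visible in the returned array, which illustrates that the sub-array binding aliases the original storage rather than copying it.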
| 108 | + |
| 109 | +# Reference-level explanation |
| 110 | +[reference-level-explanation]: #reference-level-explanation |
| 111 | + |
| 112 | +`..` can be used as a pattern fragment for matching sub-slices and sub-arrays. |
| 113 | + |
| 114 | +The fragment's syntax is: |
| 115 | +``` |
| 116 | +SUBSLICE = .. | BINDING @ .. |
| 117 | +BINDING = ref? mut? IDENT |
| 118 | +``` |
| 119 | + |
| 120 | +The subslice fragment incorporates into the full slice pattern syntax in the same way as the `..`
| 121 | +fragment incorporates into the stable tuple pattern syntax (with regard to the allowed number of
| 122 | +subslices, trailing commas, etc).
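As a rough illustration of the rule above, the following sketch uses a single `..` per slice pattern together with a trailing comma, mirroring what tuple patterns allow. The function name `ends` is hypothetical and not part of the RFC.

```rust
/// Hypothetical helper: returns the first and last elements of a slice.
fn ends(s: &[i32]) -> Option<(i32, i32)> {
    match s {
        [] => None,
        [only] => Some((*only, *only)),
        // One `..` per slice pattern; a trailing comma is accepted.
        // A pattern with two rest fragments, such as `[.., x, ..]`,
        // is rejected by the compiler.
        [first, .., last,] => Some((*first, *last)),
    }
}
```

The three arms together are exhaustive: lengths zero and one are handled explicitly, and `[first, .., last,]` covers every slice of length two or more.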
| 123 | + |
| 124 | +`@` can be used to bind the result of `..` to an identifier. |
| 125 | + |
| 126 | +`..` is treated as a "non-reference-pattern" for the purpose of determining default-binding-modes, |
| 127 | +and so shifts the binding mode to by-`ref` or by-`ref mut` when used to match a subsection of a |
| 128 | +reference or mutable reference to a slice or array. |
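A small sketch of the default-binding-mode behavior described above: matching a `&mut [i32]` against a pattern containing `interior @ ..` gives `interior` the type `&mut [i32]` without writing `ref mut`. The function `zero_interior` is a hypothetical example, not something the RFC defines.

```rust
/// Hypothetical helper: zeroes every element except the first and last.
fn zero_interior(s: &mut [i32]) {
    // `s` is a mutable reference, so the binding mode shifts to
    // by-`ref mut` and `interior` has type `&mut [i32]`.
    if let [_, interior @ .., _] = s {
        for x in interior.iter_mut() {
            *x = 0;
        }
    }
    // Slices with fewer than two elements do not match and are left as-is.
}
```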
| 129 | + |
| 130 | +When used to match against a non-reference slice (`[u8]`), `x @ ..` would attempt to bind
| 131 | +by-value, which would fail due to a move from a non-copy type `[u8]`.
| 132 | + |
| 133 | +`..` is not a full pattern syntax, but rather a part of slice, tuple and tuple |
| 134 | +struct pattern syntaxes. In particular, `..` is not accepted by the `pat` macro matcher. |
| 135 | +`BINDING @ ..` is also not a full pattern syntax, but rather a part of slice pattern syntax, so |
| 136 | +it is not accepted by the `pat` macro matcher either. |
| 137 | + |
| 138 | +# Drawbacks |
| 139 | +[drawbacks]: #drawbacks |
| 140 | + |
| 141 | +None known. |
| 142 | + |
| 143 | +# Rationale and alternatives |
| 144 | +[alternatives]: #alternatives |
| 145 | + |
| 146 | +More complex syntaxes derived from `..` are possible; they use additional tokens to avoid the
| 147 | +ambiguity with ranges, for example |
| 148 | +[`..PAT..`](https://github.com/rust-lang/rust/issues/23121#issuecomment-301485132), or |
| 149 | +[`.. @ PAT`](https://github.com/rust-lang/rust/issues/23121#issuecomment-280920062) or |
| 150 | +[`PAT @ ..`](https://github.com/rust-lang/rust/issues/23121#issuecomment-280906823), or other |
| 151 | +similar alternatives. |
| 152 | +We reject these syntaxes because they only bring benefits in contrived cases using a |
| 153 | +feature that doesn't even exist yet, but normally they only add symbolic noise. |
| 154 | + |
| 155 | +More radical syntax changes do not keep consistency with `..`, for example |
| 156 | +[`[1, 2, 3, 4] ++ ref v`](https://github.com/rust-lang/rust/issues/23121#issuecomment-289220169). |
| 157 | + |
| 158 | +### `..PAT` or `PAT..` |
| 159 | + |
| 160 | +If `..` is used in the meaning "match the subslice (`>=0` elements) and ignore it", then it's |
| 161 | +reasonable to expect that syntax for "match the subslice to a pattern" should be some variation |
| 162 | +on `..`. |
| 163 | +The two simplest variations are `..PAT` and `PAT..`. |
| 164 | + |
| 165 | +#### Ambiguity |
| 166 | + |
| 167 | +The issue is that these syntaxes are ambiguous with half-bounded ranges `..END` and `BEGIN..`, |
| 168 | +and the full range `..`. |
| 169 | +To be precise, such ranges are not currently supported in patterns, but they may be supported in |
| 170 | +the future. |
| 171 | + |
| 172 | +Syntactic ambiguity is not inherently bad. We see it every day in expressions like |
| 173 | +`a + b * c`. What is important is to disambiguate it reasonably by default and have a way to |
| 174 | +group operands in the alternative way when default disambiguation turns out to be incorrect. |
| 175 | +In the case of slice patterns the subslice interpretation seems more likely, so we
| 176 | +can take it as a default.
| 177 | +There has been very little demand for implementing half-bounded ranges in patterns so far
| 178 | +(see https://github.com/rust-lang/rfcs/issues/947). If they
| 179 | +are implemented in the future, they can be used in slice patterns as well, but they
| 180 | +may require explicit grouping with the recently implemented
| 181 | +[parentheses in patterns](https://github.com/rust-lang/rust/pull/48500) (`[a, (..end)]`) or an
| 182 | +explicitly written start boundary (`[a, 0 .. end]`).
| 183 | +We can also make *some* disambiguation effort and, for example, interpret `..LITERAL` as a |
| 184 | +range because `LITERAL` can never match a subslice. Time will show if such an effort is necessary |
| 185 | +or not. |
| 186 | + |
| 187 | +If/when half-bounded ranges are supported in patterns, for better future compatibility we could |
| 188 | +decide to reserve `..PAT` as "rest of the list" in tuples and tuple structs as well, and avoid |
| 189 | +interpreting it as a range pattern in those positions. |
| 190 | + |
| 191 | +Note that ambiguity with unbounded ranges as they are used in expressions (`..`) already exists in |
| 192 | +variant `Variant(..)` and tuple `(a, b, ..)` patterns, but it's unlikely that the `..` syntax |
| 193 | +will ever be used in patterns in the range meaning because it duplicates functionality of the |
| 194 | +wildcard pattern `_`. |
| 195 | + |
| 196 | +#### `..PAT` vs `PAT..` |
| 197 | + |
| 198 | +Originally Rust used syntax `..PAT` for subslice patterns. |
| 199 | +In 2014 the syntax was changed to `PAT..` by [RFC 202](https://github.com/rust-lang/rfcs/pull/202). |
| 200 | +That RFC received almost no discussion before it got merged and its motivation is no longer |
| 201 | +relevant because arrays now use syntax `[T; N]` instead of `[T, ..N]` used in old Rust. |
| 202 | + |
| 203 | +This RFC originally proposed to switch back to `..PAT`. |
| 204 | +Some reasons to switch were: |
| 205 | +- Symmetry with expressions. |
| 206 | +One of the general ideas behind patterns is that destructuring with |
| 207 | +patterns has the same syntax as construction with expressions, if possible. |
| 208 | +In expressions we already have something with the meaning "rest of the list" - functional record |
| 209 | +update in struct expressions `S { field1, field2, ..remaining_fields }`. |
| 210 | +Right now we can use `S { field1, field2, .. }` in a pattern, but can't bind the remaining fields
| 211 | +as a whole (by creating a new struct type on the fly, for example). It's not inconceivable that
| 212 | +in Rust 2030 we will have such an ability, and it's reasonable to expect it to use the syntax `..remaining_fields`
| 213 | +symmetric to expressions. It would be good for slice patterns to be consistent with it. |
| 214 | +Without speculations, even if `..remaining_fields` in struct expressions and `..subslice` in slice |
| 215 | +patterns are not entirely the same thing, they are similar enough to keep them symmetric already. |
| 216 | +- Simple disambiguation. |
| 217 | +When we are parsing a slice pattern and see `..` we immediately know it's |
| 218 | +a subslice and can parse following tokens as a pattern (unless they are `,` or `]`, then it's just |
| 219 | +`..`, without an attached pattern). |
| 220 | +With `PAT..` we need to consume the pattern first, but that pattern may be a... `RANGE_BEGIN..` |
| 221 | +range pattern, then it means that we consumed too much and need to reinterpret the parsed tokens |
| 222 | +somehow. It's probably possible to make this work, but it's some headache that we would like to |
| 223 | +avoid if possible. |
| 224 | + |
| 225 | +This RFC no longer includes the addition of `..PAT` or `PAT..`. |
| 226 | +The currently-proposed change is a minimal addition to patterns (`..` for slices) which |
| 227 | +already exists in other forms (e.g. tuples) and generalizes well to pattern-matching out sub-tuples, |
| 228 | +e.g. `let (a, b @ .., c) = (1, 2, 3, 4);`. |
| 229 | + |
| 230 | +Additionally, `@` is more consistent with the types of patterns that would be allowable for matching |
| 231 | +slices (only identifiers), whereas `PAT..`/`..PAT` suggest the ability to write e.g. `..(1, x)` or |
| 232 | +`..SomeStruct { x }` sub-patterns, which wouldn't be possible since the resulting bound variables |
| 233 | +don't form a slice (since they're spread out in memory). |
| 234 | + |
| 235 | +# Prior art |
| 236 | +[prior-art]: #prior-art |
| 237 | + |
| 238 | +Some other languages like Haskell (`first_elem : rest_of_the_list`), |
| 239 | +Scala, or F# (`first_elem :: rest_of_the_list`) have list/array patterns, but their
| 240 | +syntactic choices are quite different from Rust's general style. |
| 241 | + |
| 242 | +"Rest of the list" in patterns was previously discussed in |
| 243 | +[RFC 1492](https://github.com/rust-lang/rfcs/pull/1492).
| 244 | + |
| 245 | +# Unresolved questions |
| 246 | +[unresolved]: #unresolved-questions |
| 247 | + |
| 248 | +None known. |
| 249 | + |
| 250 | +# Future possibilities |
| 251 | +[future-possibilities]: #future-possibilities |
| 252 | + |
| 253 | +Turn `..` into a full pattern syntactically accepted in any pattern position
| 254 | +(including `pat` matchers in macros), but rejected semantically outside of slice and tuple patterns. |