Commit 60b973a

Merge pull request #2359 from petrochenkov/subsl
RFC: Finalize syntax for slice patterns with subslices
2 parents b645f14 + a99445e commit 60b973a

1 file changed

+254
-0
lines changed

text/2359-subslice-pattern-syntax.md

+254
@@ -0,0 +1,254 @@
1+
- Feature Name: `slice_patterns`
2+
- Start Date: 2018-03-08
3+
- RFC PR: [rust-lang/rfcs#2359](https://github.com/rust-lang/rfcs/pull/2359)
4+
- Rust Issue: [rust-lang/rust#62254](https://github.com/rust-lang/rust/issues/62254)
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
Permit matching sub-slices and sub-arrays with the syntax `..`.
10+
Binding a variable to the expression matched by a subslice pattern can be done
11+
using syntax `<IDENT> @ ..` similar to the existing `<IDENT> @ <PAT>` syntax, for example:
12+
13+
```rust
14+
// Binding a sub-array:
15+
let [x, y @ .., z] = [1, 2, 3, 4]; // `y: [i32; 2] = [2, 3]`
16+
17+
// Binding a sub-slice:
18+
let [x, y @ .., z]: &[u8] = &[1, 2, 3, 4] else { unreachable!() }; // `y: &[u8] = &[2, 3]`; a slice pattern is refutable, hence the `else`
19+
```
20+
21+
# Motivation
22+
[motivation]: #motivation
23+
24+
## General motivation
25+
Stabilization of slice patterns with subslices is currently blocked on finalizing the syntax for
26+
these subslices.
27+
This RFC proposes a syntax for stabilization.
28+
29+
## Motivation for the specific syntax
30+
31+
### The shortcut form: `..`
32+
33+
This form is already used with the meaning "rest of the list" in struct patterns, tuple struct
34+
patterns and tuple patterns so it would be logical to use it for slice patterns as well.
35+
And indeed, unstable Rust has used `..` with this meaning since long before 1.0.
36+
37+
# Guide-level explanation
38+
[guide-level-explanation]: #guide-level-explanation
39+
40+
Sub-slices and sub-arrays can be matched using `..` and `<IDENT> @ ..` can be used to bind
41+
these sub-slices and sub-arrays to an identifier.
42+
43+
```rust
44+
// Matching slices using `ref` and `ref mut` patterns:
45+
let mut v = vec![1, 2, 3];
46+
match v[..] {
47+
[1, ref subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &[i32]
48+
[5, ref subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32]
49+
[ref subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32]
50+
[x, .., y] => assert!(v.len() >= 2),
51+
[..] => {} // Always matches
52+
}
53+
match v[..] {
54+
[1, ref mut subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &mut [i32]
55+
[5, ref mut subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32]
56+
[ref mut subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32]
57+
[x, .., y] => assert!(v.len() >= 2),
58+
[..] => {} // Always matches
59+
}
60+
61+
// Matching slices using default-binding-modes:
62+
let mut v = vec![1, 2, 3];
63+
match &v[..] {
64+
[1, subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &[i32]
65+
[5, subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32]
66+
[subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &[i32]
67+
[x, .., y] => assert!(v.len() >= 2),
68+
[..] => {} // Always matches
69+
}
70+
match &mut v[..] {
71+
[1, subslice @ .., 4] => assert_eq!(subslice.len(), 1), // typeof(subslice) == &mut [i32]
72+
[5, subslice @ ..] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32]
73+
[subslice @ .., 6] => assert_eq!(subslice.len(), 2), // typeof(subslice) == &mut [i32]
74+
[x, .., y] => assert!(v.len() >= 2),
75+
[..] => {} // Always matches
76+
}
77+
78+
// Matching slices by value (error):
79+
let mut v = vec![1, 2, 3];
80+
match v[..] {
81+
[x @ ..] => {} // ERROR cannot move out of type `[i32]`, a non-copy slice
82+
}
83+
84+
// Matching arrays by-value and by reference (explicitly or using default-binding-modes):
85+
let mut v = [1, 2, 3];
86+
match v {
87+
[1, subarray @ .., 3] => assert_eq!(subarray, [2]), // typeof(subarray) == [i32; 1]
88+
[5, subarray @ ..] => has_type::<[i32; 2]>(subarray), // typeof(subarray) == [i32; 2]
89+
    [subarray @ .., 6] => has_type::<[i32; 2]>(subarray), // typeof(subarray) == [i32; 2]
90+
[x, .., y] => has_type::<i32>(x),
91+
[..] => {},
92+
}
93+
match v {
94+
    [1, ref subarray @ .., 3] => assert_eq!(*subarray, [2]), // typeof(subarray) == &[i32; 1]
95+
[5, ref subarray @ ..] => has_type::<&[i32; 2]>(subarray), // typeof(subarray) == &[i32; 2]
96+
    [ref subarray @ .., 6] => has_type::<&[i32; 2]>(subarray), // typeof(subarray) == &[i32; 2]
97+
    [ref x, .., ref y] => has_type::<&i32>(x),
98+
[..] => {},
99+
}
100+
match &mut v {
101+
    [1, subarray @ .., 3] => assert_eq!(*subarray, [2]), // typeof(subarray) == &mut [i32; 1]
102+
[5, subarray @ ..] => has_type::<&mut [i32; 2]>(subarray), // typeof(subarray) == &mut [i32; 2]
103+
    [subarray @ .., 6] => has_type::<&mut [i32; 2]>(subarray), // typeof(subarray) == &mut [i32; 2]
104+
[x, .., y] => has_type::<&mut i32>(x),
105+
[..] => {},
106+
}
107+
```
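
The examples above call a `has_type` helper that the RFC text does not define. A minimal sketch of one possible definition, together with small functions that use subslice patterns in practice, might look as follows. The signature of `has_type` and the functions `sum`, `ends`, and `middle` are assumptions made for illustration, not part of the RFC.

```rust
/// Hypothetical helper: compiles only if `value` has type `T`.
fn has_type<T>(_value: T) {}

/// Sums a slice by peeling off the first element and recursing on the rest.
fn sum(values: &[i32]) -> i32 {
    match values {
        [] => 0,
        // Default binding modes: `first: &i32`, `rest: &[i32]`.
        [first, rest @ ..] => *first + sum(rest),
    }
}

/// Returns the first and last elements if the slice has at least two elements.
fn ends(values: &[i32]) -> Option<(i32, i32)> {
    match values {
        [first, .., last] => Some((*first, *last)),
        _ => None,
    }
}

/// Extracts the middle of a fixed-size array by value.
fn middle(array: [i32; 4]) -> [i32; 2] {
    // Array patterns with a fixed length are irrefutable, so plain `let` works.
    let [_, mid @ .., _] = array;
    has_type::<[i32; 2]>(mid);
    mid
}
```

Calling `sum(&[1, 2, 3, 4])` walks the slice one element at a time, while `middle` shows that a sub-array binding keeps a statically known length.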
108+
109+
# Reference-level explanation
110+
[reference-level-explanation]: #reference-level-explanation
111+
112+
`..` can be used as a pattern fragment for matching sub-slices and sub-arrays.
113+
114+
The fragment's syntax is:
115+
```
116+
SUBSLICE = .. | BINDING @ ..
117+
BINDING = ref? mut? IDENT
118+
```
119+
120+
The subslice fragment incorporates into the full slice pattern syntax in the same way as the `..`
121+
fragment incorporates into the stable tuple pattern syntax (with regard to the allowed number of
122+
subslices, trailing commas, etc).
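
As a rough illustration of those rules, the sketch below places the fragment at the start, middle, and end of a slice pattern and uses trailing commas. The function `shape` and its return strings are hypothetical names chosen for this example.

```rust
/// Classifies a slice by shape; each arm places the subslice fragment differently.
fn shape(values: &[u8]) -> &'static str {
    match values {
        [] => "empty",
        [0, ..] => "starts with 0",       // `..` in last position
        [.., 0] => "ends with 0",         // `..` in first position
        [1, .., 1,] => "wrapped in 1s",   // `..` in the middle, trailing comma allowed
        [_rest @ ..,] => "anything else", // bound form, trailing comma after `..`
        // A second `..` in one slice pattern, e.g. `[.., 0, ..]`, is rejected,
        // just as it is in tuple patterns.
    }
}
```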
123+
124+
`@` can be used to bind the result of `..` to an identifier.
125+
126+
`..` is treated as a "non-reference-pattern" for the purpose of determining default-binding-modes,
127+
and so shifts the binding mode to by-`ref` or by-`ref mut` when used to match a subsection of a
128+
reference or mutable reference to a slice or array.
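
A small sketch of what this binding-mode shift means in practice is given below. The functions `clear_middle` and `tail_len` are hypothetical and exist only to show the resulting types.

```rust
/// Zeroes every element except the first and last.
fn clear_middle(values: &mut [i32]) {
    // Matching through `&mut [i32]` switches bindings to by-`ref mut`,
    // so `middle` has type `&mut [i32]`.
    if let [_, middle @ .., _] = values {
        for value in middle.iter_mut() {
            *value = 0;
        }
    }
}

/// Returns the number of elements after the first one.
fn tail_len(values: &[i32]) -> usize {
    // Matching through `&[i32]` switches bindings to by-`ref`,
    // so `tail` has type `&[i32]`.
    match values {
        [_, tail @ ..] => tail.len(),
        [] => 0,
    }
}
```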
129+
130+
When used to match against a non-reference slice (`[u8]`), `x @ ..` would attempt to bind
131+
by-value, which would fail due to a move from a non-copy type `[u8]`.
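
The two usual ways around that failure can be sketched as follows: an explicit `ref` binding on the unsized place, or matching on a reference so that default binding modes apply. Both function names are hypothetical.

```rust
/// Copies the bytes after the first element into an owned vector.
fn tail_bytes(bytes: &[u8]) -> Vec<u8> {
    // `bytes[..]` has the unsized type `[u8]`, so `[_, tail @ ..]` without
    // `ref` would try to bind by value and fail to compile.
    match bytes[..] {
        [_, ref tail @ ..] => tail.to_vec(), // `tail: &[u8]`
        [] => Vec::new(),
    }
}

/// Same result, relying on default binding modes instead of `ref`.
fn tail_bytes_by_reference(bytes: &[u8]) -> Vec<u8> {
    match bytes {
        [_, tail @ ..] => tail.to_vec(), // `tail: &[u8]`
        [] => Vec::new(),
    }
}
```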
132+
133+
`..` is not a full pattern syntax, but rather a part of slice, tuple and tuple
134+
struct pattern syntaxes. In particular, `..` is not accepted by the `pat` macro matcher.
135+
`BINDING @ ..` is also not a full pattern syntax, but rather a part of slice pattern syntax, so
136+
it is not accepted by the `pat` macro matcher either.
137+
138+
# Drawbacks
139+
[drawbacks]: #drawbacks
140+
141+
None known.
142+
143+
# Rationale and alternatives
144+
[alternatives]: #alternatives
145+
146+
More complex syntaxes derived from `..` are possible; they use additional tokens to avoid the
147+
ambiguity with ranges, for example
148+
[`..PAT..`](https://github.com/rust-lang/rust/issues/23121#issuecomment-301485132), or
149+
[`.. @ PAT`](https://github.com/rust-lang/rust/issues/23121#issuecomment-280920062) or
150+
[`PAT @ ..`](https://github.com/rust-lang/rust/issues/23121#issuecomment-280906823), or other
151+
similar alternatives.
152+
We reject these syntaxes because they only bring benefits in contrived cases using a
153+
feature that doesn't even exist yet, but normally they only add symbolic noise.
154+
155+
More radical syntax changes do not keep consistency with `..`, for example
156+
[`[1, 2, 3, 4] ++ ref v`](https://github.com/rust-lang/rust/issues/23121#issuecomment-289220169).
157+
158+
### `..PAT` or `PAT..`
159+
160+
If `..` is used with the meaning "match the subslice (`>=0` elements) and ignore it", then it's
161+
reasonable to expect that syntax for "match the subslice to a pattern" should be some variation
162+
on `..`.
163+
The two simplest variations are `..PAT` and `PAT..`.
164+
165+
#### Ambiguity
166+
167+
The issue is that these syntaxes are ambiguous with half-bounded ranges `..END` and `BEGIN..`,
168+
and the full range `..`.
169+
To be precise, such ranges are not currently supported in patterns, but they may be supported in
170+
the future.
171+
172+
Syntactic ambiguity is not inherently bad. We see it every day in expressions like
173+
`a + b * c`. What is important is to disambiguate it reasonably by default and have a way to
174+
group operands in the alternative way when default disambiguation turns out to be incorrect.
175+
In case of slice patterns the subslice interpretation seems more likely, so we
176+
can take it as a default.
177+
There was very little demand for implementing half-bounded ranges in patterns so far
178+
(see https://github.com/rust-lang/rfcs/issues/947), but if they
179+
are implemented in the future, they can be used in slice patterns as well, although they
180+
could require explicit grouping with recently implemented
181+
[parentheses in patterns](https://github.com/rust-lang/rust/pull/48500) (`[a, (..end)]`) or an
182+
explicitly written start boundary (`[a, 0 .. end]`).
183+
We can also make *some* disambiguation effort and, for example, interpret `..LITERAL` as a
184+
range because `LITERAL` can never match a subslice. Time will tell whether such an effort is necessary
185+
or not.
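
For contrast, fully bounded range patterns on individual elements are not ambiguous with the bare `..` fragment, and the two can sit side by side. The sketch below shows only that unambiguous case; it does not exercise the half-bounded ranges discussed above, and the function `classify` is a hypothetical name.

```rust
/// Describes a byte string using element-level range patterns next to `..`.
fn classify(bytes: &[u8]) -> &'static str {
    match bytes {
        // `b'0'..=b'9'` is a range pattern for one element;
        // the bare `..` is the subslice fragment.
        [b'0'..=b'9', ..] => "starts with a digit",
        [.., b'a'..=b'z'] => "ends with a lowercase letter",
        [..] => "other",
    }
}
```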
186+
187+
If/when half-bounded ranges are supported in patterns, for better future compatibility we could
188+
decide to reserve `..PAT` as "rest of the list" in tuples and tuple structs as well, and avoid
189+
interpreting it as a range pattern in those positions.
190+
191+
Note that ambiguity with unbounded ranges as they are used in expressions (`..`) already exists in
192+
variant `Variant(..)` and tuple `(a, b, ..)` patterns, but it's unlikely that the `..` syntax
193+
will ever be used in patterns in the range meaning because it duplicates functionality of the
194+
wildcard pattern `_`.
195+
196+
#### `..PAT` vs `PAT..`
197+
198+
Originally Rust used syntax `..PAT` for subslice patterns.
199+
In 2014 the syntax was changed to `PAT..` by [RFC 202](https://github.com/rust-lang/rfcs/pull/202).
200+
That RFC received almost no discussion before it got merged and its motivation is no longer
201+
relevant because arrays now use syntax `[T; N]` instead of `[T, ..N]` used in old Rust.
202+
203+
This RFC originally proposed to switch back to `..PAT`.
204+
Some reasons to switch were:
205+
- Symmetry with expressions.
206+
One of the general ideas behind patterns is that destructuring with
207+
patterns has the same syntax as construction with expressions, if possible.
208+
In expressions we already have something with the meaning "rest of the list" - functional record
209+
update in struct expressions `S { field1, field2, ..remaining_fields }`.
210+
Right now we can use `S { field1, field2, .. }` in a pattern, but can't bind the remaining fields
211+
as a whole (by creating a new struct type on the fly, for example). It's not inconceivable that
212+
in Rust 2030 we have such ability and it's reasonable to expect it using syntax `..remaining_fields`
213+
symmetric to expressions. It would be good for slice patterns to be consistent with it.
214+
Without speculations, even if `..remaining_fields` in struct expressions and `..subslice` in slice
215+
patterns are not entirely the same thing, they are similar enough to keep them symmetric already.
216+
- Simple disambiguation.
217+
When we are parsing a slice pattern and see `..` we immediately know it's
218+
a subslice and can parse following tokens as a pattern (unless they are `,` or `]`, then it's just
219+
`..`, without an attached pattern).
220+
With `PAT..` we need to consume the pattern first, but that pattern may be a... `RANGE_BEGIN..`
221+
range pattern, then it means that we consumed too much and need to reinterpret the parsed tokens
222+
somehow. It's probably possible to make this work, but it's some headache that we would like to
223+
avoid if possible.
224+
225+
This RFC no longer includes the addition of `..PAT` or `PAT..`.
226+
The currently-proposed change is a minimal addition to patterns (`..` for slices) which
227+
already exists in other forms (e.g. tuples) and generalizes well to pattern-matching out sub-tuples,
228+
e.g. `let (a, b @ .., c) = (1, 2, 3, 4);`.
229+
230+
Additionally, `@` is more consistent with the types of patterns that would be allowable for matching
231+
slices (only identifiers), whereas `PAT..`/`..PAT` suggest the ability to write e.g. `..(1, x)` or
232+
`..SomeStruct { x }` sub-patterns, which wouldn't be possible since the resulting bound variables
233+
don't form a slice (since they're spread out in memory).
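
The memory-layout point can be sketched as follows: an identifier bound with `@ ..` is a single contiguous view into the original storage, whereas gathering one field out of every remaining element requires building a new collection. The functions `split_pairs` and `second_fields` are hypothetical.

```rust
/// Splits off the first pair; `rest` is one contiguous view into `pairs`.
fn split_pairs(pairs: &[(i32, i32)]) -> Option<(i32, &[(i32, i32)])> {
    match pairs {
        [(first, _), rest @ ..] => Some((*first, rest)),
        [] => None,
    }
}

/// Collecting the second field of every remaining pair needs a new allocation,
/// because those fields are not adjacent in memory.
fn second_fields(rest: &[(i32, i32)]) -> Vec<i32> {
    rest.iter().map(|&(_, second)| second).collect()
}
```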
234+
235+
# Prior art
236+
[prior-art]: #prior-art
237+
238+
Some other languages like Haskell (`first_elem : rest_of_the_list`),
239+
Scala, or F# (`first_elem :: rest_of_the_list`) have list/array patterns, but their
240+
syntactic choices are quite different from Rust's general style.
241+
242+
"Rest of the list" in patterns was previously discussed in
243+
[RFC 1492](https://github.com/rust-lang/rfcs/pull/1492).
244+
245+
# Unresolved questions
246+
[unresolved]: #unresolved-questions
247+
248+
None known.
249+
250+
# Future possibilities
251+
[future-possibilities]: #future-possibilities
252+
253+
Turn `..` into a full pattern syntactically accepted in any pattern position
254+
(including `pat` matchers in macros), but rejected semantically outside of slice and tuple patterns.
