Commit 16ea7f6

Merge pull request #2229 from samsartor/master
Closures Capture Disjoint Fields
2 parents 5d4b752 + 395ced4 commit 16ea7f6

1 file changed

+312
-0
lines changed

text/2229-capture-disjoint-fields.md

+312
@@ -0,0 +1,312 @@
- Feature Name: `capture_disjoint_fields`
- Start Date: 2017-11-28
- RFC PR: [rust-lang/rfcs#2229](https://github.com/rust-lang/rfcs/pull/2229)
- Rust Issue: [rust-lang/rust#53488](https://github.com/rust-lang/rust/issues/53488)

# Summary
[summary]: #summary

This RFC proposes that closure capturing should be minimal rather than maximal.
Conceptually, existing rules regarding borrowing and moving disjoint fields
should be applied to capturing. If implemented, the following code examples
would become valid:

```rust
let a = &mut foo.a;
|| &mut foo.b; // Error today: cannot borrow `foo`
somefunc(a);
```

```rust
let a = &mut foo.a;
move || foo.b; // Error today: cannot move `foo`
somefunc(a);
```
25+
26+
Note that some discussion of this has already taken place:
27+
- rust-lang/rust#19004
28+
- [Rust internals forum](https://internals.rust-lang.org/t/borrow-the-full-stable-name-in-closures-for-ergonomics/5387)
29+
30+
# Motivation
31+
[motivation]: #motivation
32+
33+
In the rust language today, any variables named within a closure will be fully
34+
captured. This was simple to implement but is inconsistent with the rest of the
35+
language because rust normally allows simultaneous borrowing of disjoint
36+
fields. Remembering this exception adds to the mental burden of the programmer
37+
and makes the rules of borrowing and ownership harder to learn.
38+
39+
The following is allowed; why should closures be treated differently?
40+
41+
```rust
42+
let _a = &mut foo.a;
43+
loop { &mut foo.b; } // ok!
44+
```
45+
46+
This is a particularly annoying problem because closures often need to borrow
47+
data from `self`:
48+
49+
```rust
50+
pub fn update(&mut self) {
51+
// cannot borrow `self` as immutable because `self.list` is also borrowed as mutable
52+
self.list.retain(|i| self.filter.allowed(i));
53+
}
54+
```
55+
56+
# Guide-level explanation
57+
[guide-level-explanation]: #guide-level-explanation
58+
59+
Rust understands structs sufficiently to know that it's possible
60+
to borrow disjoint fields of a struct simultaneously. Structs can also be
61+
destructed and moved piece-by-piece. This functionality should be available
62+
anywhere, including from within closures:
63+
64+
```rust
65+
struct OneOf {
66+
text: String,
67+
of: Vec<String>,
68+
}
69+
70+
impl OneOf {
71+
pub fn matches(self) -> bool {
72+
// Ok! destructure self
73+
self.of.into_iter().any(|s| s == self.text)
74+
}
75+
76+
pub fn filter(&mut self) {
77+
// Ok! mutate and inspect self
78+
self.of.retain(|s| s != &self.text)
79+
}
80+
}
81+
```
82+
83+
Rust will prevent dangerous double usage:
84+
85+
```rust
86+
struct FirstDuplicated(Vec<String>)
87+
88+
impl FirstDuplicated {
89+
pub fn first_count(self) -> usize {
90+
// Error! can't destructure and mutate same data
91+
self.0.into_iter()
92+
.filter(|s| &s == &self.0[0])
93+
.count()
94+
}
95+
96+
pub fn remove_first(&mut self) {
97+
// Error! can't mutate and inspect same data
98+
self.0.retain(|s| s != &self.0[0])
99+
}
100+
}
101+
```
102+
103+
# Reference-level explanation
104+
[reference-level-explanation]: #reference-level-explanation
105+
106+
This RFC does not propose any changes to the borrow checker. Instead, the MIR
107+
generation for closures should be altered to produce the minimal capture.
108+
Additionally, a hidden `repr` for closures might be added, which could reduce
109+
closure size through awareness of the new capture rules *(see unresolved)*.
110+
111+
In a sense, when a closure is lowered to MIR, a list of "capture expressions" is
112+
created, which we will call the "capture set". Each expression is some part of
113+
the closure body which, in order to capture parts of the enclosing scope, must
114+
be pre-evaluated when the closure is created. The output of the expressions,
115+
which we will call "capture data", is stored in the anonymous struct which
116+
implements the `Fn*` traits. If a binding is used within a closure, at least one
117+
capture expression which borrows or moves that binding's value must exist in the
118+
capture set.
119+
120+
Currently, lowering creates exactly one capture expression for each used
121+
binding, which borrows or moves the value in its entirety. This RFC proposes
122+
that lowering should instead create the minimal capture, where each expression
123+
is as precise as possible.
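
To make the capture set concrete, here is a rough hand-written model of the closure `|| &mut foo.b`, assuming a closure behaves like an anonymous struct that stores its capture data. The names `Foo`, `WholeCapture`, `FieldCapture`, and `disjoint_borrows` are hypothetical and exist only for this sketch; the real closure type is generated by the compiler and cannot be named.

```rust
struct Foo {
    a: i32,
    b: i32,
}

// Model of the current rule: the capture set is the single expression
// `&mut foo`, so the closure holds a borrow of the whole struct.
#[allow(dead_code)]
struct WholeCapture<'a> {
    foo: &'a mut Foo,
}

#[allow(dead_code)]
impl<'a> WholeCapture<'a> {
    fn call(&mut self) -> &mut i32 {
        &mut self.foo.b
    }
}

// Model of the proposed rule: the capture set is `&mut foo.b`, so the
// closure holds a borrow of that one field only.
struct FieldCapture<'a> {
    foo_b: &'a mut i32,
}

impl<'a> FieldCapture<'a> {
    fn call(&mut self) -> &mut i32 {
        &mut *self.foo_b
    }
}

fn disjoint_borrows() -> Foo {
    let mut foo = Foo { a: 1, b: 2 };
    let a = &mut foo.a;
    // Building a `WholeCapture { foo: &mut foo }` here instead would be
    // rejected, because `foo.a` is still mutably borrowed through `a`.
    let mut closure = FieldCapture { foo_b: &mut foo.b };
    *closure.call() += 10;
    *a += 100;
    foo
}
```

Since the borrow checker already accepts the `FieldCapture` version, the sketch is consistent with the claim that only the generated capture expressions need to change.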
124+
125+
This minimal set of capture expressions *might* be created through a sort of
126+
iterative refinement. We would start out capturing all of the local variables.
127+
Then, each path would be made more precise by adding additional dereferences and
128+
path components depending on which paths are used and how. References to structs
129+
would be made more precise by reborrowing fields and owned structs would be made
130+
more precise by moving fields.
131+
132+
A capture expression is minimal if it produces a value that is used by the
133+
closure in its entirety (e.g. is a primitive, is passed outside the closure,
134+
etc.) or if making the expression more precise would require one the following.
135+
136+
- a call to an impure function
137+
- an illegal move (for example, out of a `Drop` type)
138+
139+
When generating a capture expression, we must decide if the output should be
140+
owned or if it can be a reference. In a non-`move` closure, a capture expression
141+
will *only* produce owned data if ownership of that data is required by the body
142+
of the closure. A `move` closure will *always* produce owned data unless the
143+
captured binding does not have ownership.
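
The two ownership rules above can be observed with whole-variable captures, which behave the same way before and after this RFC. The following is a minimal sketch; the function names are hypothetical and chosen only for illustration.

```rust
/// A non-`move` closure whose body consumes `text`, so the capture
/// expression has to produce owned data.
fn non_move_closure_takes_ownership() -> usize {
    let text = String::from("hello");
    let consume = || {
        let owned: String = text; // moving out of `text` requires ownership
        owned.len()
    };
    // `text` can no longer be used by this function: it now lives in the closure.
    consume()
}

/// A `move` closure over a binding that does not have ownership: the
/// reference `r` is moved into the closure, while `data` stays where it is.
fn move_closure_over_reference() -> (usize, usize) {
    let mut data = vec![1, 2, 3];
    let r = &mut data;
    let mut push = move || {
        r.push(4);
        r.len()
    };
    let len_seen_by_closure = push();
    // The closure is not used again, so `data` is accessible once more.
    (len_seen_by_closure, data.len())
}
```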
144+
145+
Note that *all* functions are considered impure (including to overloaded deref
146+
implementations). And, for the sake of capturing, all indexing is considered
147+
impure. It is possible that overloaded `Deref::deref` implementations could be
148+
marked as pure by using a new, marker trait (such as `DerefPure`) or attribute
149+
(such as `#[deref_transparent]`). However, such a solution should be proposed in
150+
a separate RFC. In the meantime, `<Box as Deref>::deref` could be a special case
151+
of a pure function *(see unresolved)*.
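
The stopping conditions above can be sketched as a toy model. This is not the compiler's algorithm: `Step` and `minimal_prefix` are hypothetical names, the model handles one path at a time, it ignores the "used in its entirety" condition, and it treats every overloaded deref as impure (so the possible `Box` special case is not represented).

```rust
/// One projection step in a place expression such as `foo.list[0]`.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Step {
    /// Plain field access. `parent_impls_drop` is true when the value the
    /// field is read from has a type that implements `Drop`.
    Field { parent_impls_drop: bool },
    /// Overloaded `Deref::deref`, treated as an impure function call.
    OverloadedDeref,
    /// Indexing, treated as impure for the sake of capturing.
    Index,
}

/// Returns how many leading steps of `path` the capture expression keeps.
/// Zero means the whole variable is captured.
fn minimal_prefix(path: &[Step], by_move: bool) -> usize {
    for (kept, step) in path.iter().enumerate() {
        let must_stop = match *step {
            // Refining further would require calling an impure function.
            Step::OverloadedDeref | Step::Index => true,
            // Refining further would move a field out of a `Drop` type.
            Step::Field { parent_impls_drop } => by_move && parent_impls_drop,
        };
        if must_stop {
            return kept;
        }
    }
    path.len()
}
```

For instance, `move || foo.drop_world.a` corresponds to two field steps where the second reads from a `Drop` type, so the model keeps one step and captures `foo.drop_world`.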
152+
153+
Also note that, because capture expressions are all subsets of the closure body,
154+
this RFC does not change *what* is executed. It does change the order/number of
155+
executions for some operations, but since these must be pure, order/repetition
156+
does not matter. Only changes to lifetimes might be breaking. Specifically, the
157+
drop order of uncaptured data can be altered.
158+
159+
We might solve this by considering a struct to be minimal if it contains unused
160+
fields that implement `Drop`. This would prevent the drop order of those fields
161+
from changing, but feels strange and non-orthogonal *(see unresolved)*.
162+
Encountering this case at all could trigger a warning, so that this extra rule
163+
could exist temporarily but be removed over the next epoc *(see unresolved)*.
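
The drop-order concern can be illustrated without relying on the new capture rules, by emulating each rule by hand. In this sketch, which uses hypothetical names throughout (`Noisy`, `Pair`, `Log`, and the two functions), the first function forces a whole capture and the second stands in for a field capture with an explicit partial move.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records its name in the shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

#[allow(dead_code)]
struct Pair {
    a: Noisy,
    b: Noisy,
}

fn new_pair(log: &Log) -> Pair {
    Pair {
        a: Noisy { name: "a", log: log.clone() },
        b: Noisy { name: "b", log: log.clone() },
    }
}

/// Emulates the current rule: the closure owns all of `pair`.
fn whole_capture_drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let pair = new_pair(&log);
        let closure = move || {
            let whole = &pair; // mentioning `pair` itself captures it entirely
            whole.b.name
        };
        drop(closure); // drops `pair.a` and `pair.b` together
        log.borrow_mut().push("end of scope");
    }
    let order = log.borrow().clone();
    order
}

/// Emulates the proposed rule: only `pair.b` ends up inside the closure.
fn field_capture_drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let pair = new_pair(&log);
        let b = pair.b; // explicit partial move standing in for the capture
        let closure = move || {
            let owned = b; // consume `b` so the closure owns it in full
            owned.name
        };
        drop(closure); // drops only `b`
        log.borrow_mut().push("end of scope");
    } // `pair.a` is dropped here, after the closure is gone
    let order = log.borrow().clone();
    order
}
```

In the first function the uncaptured-in-spirit field `a` is dropped with the closure; in the second it outlives the closure and is dropped at the end of the enclosing scope.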
164+
165+
## Reference Examples
166+
167+
Below are examples of various closures and their capture sets.
168+
169+
```rust
170+
let foo = 10;
171+
|| &mut foo;
172+
```
173+
174+
- `&mut foo` (primitive, ownership not required, used in entirety)
175+
176+
```rust
177+
let a = &mut foo.a;
178+
|| (&mut foo.b, &mut foo.c);
179+
somefunc(a);
180+
```
181+
182+
- `&mut foo.b` (ownership not required, used in entirety)
183+
- `&mut foo.c` (ownership not required, used in entirety)
184+
185+
The borrow checker passes because `foo.a`, `foo.b`, and `foo.c` are disjoint.
186+
187+
```rust
188+
let a = &mut foo.a;
189+
move || foo.b;
190+
somefunc(a);
191+
```
192+
193+
- `foo.b` (ownership available, used in entirety)
194+
195+
The borrow checker passes because `foo.a` and `foo.b` are disjoint.
196+
197+
```rust
198+
let hello = &foo.hello;
199+
move || foo.drop_world.a;
200+
somefunc(hello);
201+
```
202+
203+
- `foo.drop_world` (ownership available, can't be more precise without moving
204+
out of `Drop`)
205+
206+
The borrow checker passes because `foo.hello` and `foo.drop_world` are disjoint.
207+
208+
```rust
209+
|| println!("{}", foo.wrapper_thing.a);
210+
```
211+
212+
- `&foo.wrapper_thing` (ownership not required, can't be more precise because
213+
overloaded `Deref` on `wrapper_thing` is impure)
214+
215+
```rust
216+
|| foo.list[0];
217+
```
218+
219+
- `foo.list` (ownership required, can't be more precise because indexing is
220+
impure)
221+
222+
```rust
223+
let bar = (1, 2); // struct
224+
|| myfunc(bar);
225+
```
226+
227+
- `bar` (ownership required, used in entirety)
228+
229+
```rust
230+
let foo_again = &mut foo;
231+
|| &mut foo.a;
232+
somefunc(foo_again);
233+
```
234+
235+
- `&mut foo.a` (ownership not required, used in entirety)
236+
237+
The borrow checker fails because `foo_again` and `foo.a` intersect.
238+
239+
```rust
240+
let _a = foo.a;
241+
|| foo.a;
242+
```
243+
244+
- `foo.a` (ownership required, used in entirety)
245+
246+
The borrow checker fails because `foo.a` has already been moved.
247+
248+
```rust
249+
let a = &drop_foo.a;
250+
move || drop_foo.b;
251+
somefunc(a);
252+
```
253+
254+
- `drop_foo` (ownership available, can't be more precise without moving out of
255+
`Drop`)
256+
257+
The borrow checker fails because `drop_foo` cannot be moved while borrowed.
258+
259+
```rust
260+
|| &box_foo.a;
261+
```
262+
263+
- `&<Box<_> as Deref>::deref(&box_foo).b` (ownership not required, `Box::deref` is pure)
264+
265+
```rust
266+
move || &box_foo.a;
267+
```
268+
269+
- `box_foo` (ownership available, can't be more precise without moving out of
270+
`Drop`)
271+
272+
```rust
273+
let foo = &mut a;
274+
let other = &mut foo.other;
275+
move || &mut foo.bar;
276+
somefunc(other);
277+
```
278+
279+
- `&mut foo.bar` (ownership *not* available, borrow can be split)
280+
281+
282+
# Drawbacks
283+
[drawbacks]: #drawbacks
284+
285+
This RFC does ruin the intuition that all variables named within a closure are
286+
*completely* captured. I argue that that intuition is not common or necessary
287+
enough to justify the extra glue code.
288+
289+
# Rationale and alternatives
290+
[alternatives]: #alternatives
291+
292+
This proposal is purely ergonomic since there is a complete and common
293+
workaround. The existing rules could remain in place and rust users could
294+
continue to pre-borrow/move fields. However, this workaround results in
295+
significant useless glue code when borrowing many but not all of the fields in
296+
a struct. It also produces a larger closure than necessary which could make the
297+
difference when inlining.
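
For reference, the pre-borrow workaround applied to the `update` example from the motivation might look like the sketch below. The `Filter` and `State` types, the `min` field, and the `updated_list` helper are assumptions made up for this illustration; the RFC does not define them.

```rust
struct Filter {
    min: i32,
}

impl Filter {
    fn allowed(&self, i: &i32) -> bool {
        *i >= self.min
    }
}

struct State {
    list: Vec<i32>,
    filter: Filter,
}

impl State {
    pub fn update(&mut self) {
        // Glue code: borrow the one field the closure needs up front, so the
        // closure captures `filter` rather than all of `self`.
        let filter = &self.filter;
        self.list.retain(|i| filter.allowed(i));
    }
}

fn updated_list() -> Vec<i32> {
    let mut state = State {
        list: vec![1, 5, 2, 8],
        filter: Filter { min: 3 },
    };
    state.update();
    state.list
}
```

With one field the glue is a single line, but it grows with every additional field the closure touches.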
298+
299+
# Unresolved questions
300+
[unresolved]: #unresolved-questions
301+
302+
- How to optimize pointers. Can borrows that all reference parts of the same
303+
object be stored as a single pointer? How should this optimization be
304+
implemented (e.g. a special `repr`, refinement typing)?
305+
306+
- How to signal that a function is pure. Is this even needed/wanted? Any other
307+
places where the language could benefit?
308+
309+
- Should `Box` be special?
310+
311+
- Drop order can change as a result of this RFC, is this a real stability
312+
problem? How should this be resolved?
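
The pointer question in the first bullet can be made concrete by comparing hand-written stand-ins for the capture data of `|| (&mut foo.b, &mut foo.c)`. This is only a sketch: `Foo`, `FieldCaptures`, `WholeCapture`, and `capture_sizes` are hypothetical names, and the actual layout of a closure is up to the compiler.

```rust
use std::mem::size_of;

#[allow(dead_code)]
struct Foo {
    a: u64,
    b: u64,
    c: u64,
}

/// Stand-in for the capture data under the proposed rules: one reference
/// per captured field.
#[allow(dead_code)]
struct FieldCaptures<'a> {
    b: &'a mut u64,
    c: &'a mut u64,
}

/// Stand-in for the capture data under the current rule: one reference to
/// the whole struct.
#[allow(dead_code)]
struct WholeCapture<'a> {
    foo: &'a mut Foo,
}

/// Returns (size of the field-wise capture, size of the whole capture).
fn capture_sizes() -> (usize, usize) {
    (
        size_of::<FieldCaptures<'static>>(),
        size_of::<WholeCapture<'static>>(),
    )
}
```

Without some form of the optimization asked about above, the field-wise capture stores one pointer per field, whereas the whole capture stores a single pointer.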
