Commit b82a84c

RFC: Trait for !Sized thin pointers
1 parent 845d609 commit b82a84c

1 file changed

+289
-0
lines changed

text/0000-impl-dyn-sized.md

@@ -0,0 +1,289 @@
- Feature Name: `impl_dyn_sized`
- Start Date: 2023-11-29
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Enable user code to define dynamically-sized thin pointers. Such types are
`!Sized`, but references to them are pointer-sized (i.e. not "fat pointers").
The implementation of [`core::mem::size_of_val()`][size_of_val] delegates to
a new `core::mem::DynSized` trait at runtime.

[size_of_val]: https://doc.rust-lang.org/core/mem/fn.size_of_val.html

# Motivation
[motivation]: #motivation

Enable ergonomic and efficient references to dynamically-sized values that
are capable of computing their own size.

It should be possible to declare a Rust type that is `!Sized`, but has
references that are pointer-sized and therefore only require a single register
on most architectures.

In particular, this RFC aims to support a common pattern in other low-level
languages, such as C, where a value may consist of a fixed-layout header
followed by dynamically-sized data:

```c
struct __attribute__((aligned(8))) request {
    uint32_t size;
    uint16_t id;
    uint16_t flags;
    /* uint8_t request_data[]; */
};

void handle_request(struct request *req) { /* ... */ }
```

This pattern is used frequently in zero-copy APIs that transmit structured data
across trust boundaries.

# Background
[background]: #background

There are currently two approved RFCs that cover similar functionality:

* [RFC 1861] adds `extern type` for declaring types that are opaque to Rust's
  type system. One of the capabilities available to extern types is that they
  can be embedded into a `struct` as the last field, and that `struct` will
  become an unsized type with thin references.

  Stabilizing `extern type` is currently blocked on questions of how to handle
  Rust layout intrinsics such as [`core::mem::size_of_val()`][size_of_val] and
  [`core::mem::align_of_val()`][align_of_val] for fully opaque types.

* [RFC 2580] adds traits and intrinsics for custom DSTs either with or without
  associated "fat pointer" metadata. A custom DST with thin references can be
  represented as `Pointee<Metadata = ()>`.

  Stabilizing custom DSTs is currently blocked on multiple questions involving
  the content and representation of complex metadata, such as `&dyn` vtables.

In both of these cases the ability to declare custom DSTs with thin references
is a minor footnote to the overall feature, and stabilization is blocked by
issues unrelated to thin-pointer DSTs.

The objective of this RFC is to extract custom thin-pointer DSTs into their own
feature, which would hopefully be free of known issues and could be stabilized
without significant changes to the compiler or ecosystem.

[RFC 1861]: https://rust-lang.github.io/rfcs/1861-extern-types.html
[RFC 2580]: https://rust-lang.github.io/rfcs/2580-ptr-meta.html

[align_of_val]: https://doc.rust-lang.org/core/mem/fn.align_of_val.html

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

The unsafe trait `core::mem::DynSized` may be implemented for a `!Sized` type
to configure how the size of a value is computed from a reference. References
to a type that implements `DynSized` are not required to store the value size
as pointer metadata.

If a type that implements `DynSized` has no other associated pointer metadata
(such as a vtable), then references to that type will have the same size and
layout as a normal pointer.

```rust
#[repr(C, align(8))]
struct Request {
    size: u32,
    id: u16,
    flags: u16,
    data: [u8],
}

unsafe impl core::mem::DynSized for Request {
    fn size_of_val(&self) -> usize {
        usize::try_from(self.size).unwrap_or(usize::MAX)
    }
}

// size_of::<&Request>() == size_of::<*const ()>()
```

107+
The `DynSized` trait has a single required method, `size_of_val()`, which
108+
has the same semantics as `core::mem::size_of_val()`.
109+
110+
```rust
111+
// core::mem
112+
pub unsafe trait DynSized {
113+
// Returns the size of the pointed-to value in bytes.
114+
fn size_of_val(&self) -> usize;
115+
}
116+
```
117+
118+
It is an error to `impl DynSized` for a type that is `Sized`. In other words,
119+
the following code is invalid:
120+
121+
```rust
122+
#[repr(C, align(8))]
123+
struct SizedRequest {
124+
size: u32,
125+
id: u16,
126+
flags: u16,
127+
data: [u8; 1024],
128+
}
129+
130+
// Compiler error: `impl DynSized` on a type that isn't `!Sized`.
131+
unsafe impl core::mem::DynSized for SizedRequest {
132+
fn size_of_val(&self) -> usize {
133+
usize::try_from(self.size).unwrap_or(usize::MAX)
134+
}
135+
}
136+
```
137+
# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The `core::mem::DynSized` trait acts as a signal to the compiler that the
size of a value can be computed dynamically by the user-provided trait
implementation. If references to that type would otherwise be of the layout
`(ptr, usize)` due to being `!Sized`, then they can be reduced to `ptr`.

The `DynSized` trait does not _guarantee_ that a type will have thin pointers;
it merely enables them. This definition is intended to be compatible with RFC
2580, in that types with complex pointer metadata would continue to have fat
pointers. Such types may choose to implement `DynSized` by extracting their
custom pointer metadata from `&self`.

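As a rough illustration of the last point, the sketch below shows how existing fat-pointer types could answer `size_of_val()` purely from the metadata already carried by `&self`. The `DynSized` trait here is a local stand-in declared in the example, since `core::mem::DynSized` does not exist today, and the two impls are assumptions about what such implementations might look like.

```rust
use std::mem::size_of;

/// Local stand-in for the proposed `core::mem::DynSized` trait.
unsafe trait DynSized {
    fn size_of_val(&self) -> usize;
}

// A slice reference already stores its length, so the size follows from the
// pointer metadata alone and the reference stays fat.
unsafe impl<T> DynSized for [T] {
    fn size_of_val(&self) -> usize {
        self.len() * size_of::<T>()
    }
}

// `str::len` reports the length in bytes, which is the value size.
unsafe impl DynSized for str {
    fn size_of_val(&self) -> usize {
        self.len()
    }
}
```

For a slice these results agree with what `std::mem::size_of_val` reports today, which is the compatibility property the paragraph above relies on.
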
Implementing `DynSized` does not affect alignment, so the questions of how to
handle unknown alignments of RFC 1861 `extern type` DSTs do not apply.

In current Rust, a DST used as a `struct` field must be the final field of the
`struct`. This restriction remains unchanged, as the offsets of any fields after
a DST would be impossible to compute statically.
- This also implies that any given `struct` may have at most one field that
  implements `DynSized`.

A `struct` with a field that implements `DynSized` will also implicitly
implement `DynSized`. The implicit implementation of `DynSized` computes the
size of the struct up until the `DynSized` field, and then adds the result of
calling `DynSized::size_of_val()` on the final field.
- This implies it's not permitted to manually `impl DynSized` for a type that
  contains a field that implements `DynSized`.

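The arithmetic of the implicit implementation can be sketched by hand in stable Rust. Everything named here is hypothetical: `DynSized` is a local stand-in trait, `Envelope` and `envelope_size` are invented for illustration, and the zero-length `_data` array substitutes for a real unsized tail so that the types can be constructed. Only the size computation is exercised; no payload bytes are read.

```rust
/// Local stand-in for the proposed `core::mem::DynSized` trait.
unsafe trait DynSized {
    fn size_of_val(&self) -> usize;
}

/// Header whose `size` field counts the header plus its trailing payload.
#[allow(dead_code)]
#[repr(C)]
struct Request {
    size: u32,
    id: u16,
    flags: u16,
    _data: [u8; 0], // placeholder for the dynamically-sized tail
}

unsafe impl DynSized for Request {
    fn size_of_val(&self) -> usize {
        self.size as usize
    }
}

/// Hypothetical outer type whose final field is dynamically sized.
#[allow(dead_code)]
#[repr(C)]
struct Envelope {
    sequence: u64,
    request: Request,
}

// Hand-written equivalent of what the implicit implementation is described
// as computing: bytes before the final field, plus the final field's size.
unsafe impl DynSized for Envelope {
    fn size_of_val(&self) -> usize {
        let base = self as *const Envelope as usize;
        let tail = &self.request as *const Request as usize;
        (tail - base).saturating_add(self.request.size_of_val())
    }
}

/// Hypothetical helper: total size of an envelope wrapping a request that
/// declares `request_size` bytes.
fn envelope_size(request_size: u32) -> usize {
    let request = Request { size: request_size, id: 7, flags: 0, _data: [] };
    let envelope = Envelope { sequence: 1, request };
    envelope.size_of_val()
}
```

With a request that declares 24 bytes, `envelope_size(24)` returns 32: the 8-byte `sequence` prefix plus the 24 bytes reported by the final field.
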
# Drawbacks
[drawbacks]: #drawbacks

## Mutability of value sizes

If the size of a value is stored in the value itself, then that implies it can
change at runtime.

```rust
// Simplified for illustration; per the rule above, a real `DynSized` type
// must be `!Sized`.
struct MutableSize { size: usize }
unsafe impl core::mem::DynSized for MutableSize {
    fn size_of_val(&self) -> usize { self.size }
}

let mut v = MutableSize { size: 8 };
println!("{:?}", core::mem::size_of_val(&v)); // prints "8"
v.size = 16;
println!("{:?}", core::mem::size_of_val(&v)); // prints "16"
```

There may be existing code that assumes `size_of_val()` is constant for a given
value, which is true in today's Rust due to the nature of fat pointers, but
would no longer be true if `size_of_val()` is truly dynamic.

Alternatively, the API contract for `DynSized` implementations could require
that the result of `size_of_val()` not change for the lifetime of the allocated
object. This would likely be true for nearly all interesting use cases, and
would let `DynSized` values be stored in a `Box`.

## Compatibility with existing fat-pointer DSTs

It may be desirable for certain existing stabilized DSTs to implement
`DynSized` -- for example, it is a natural fit for the planned redefinition of
[`&core::ffi::CStr`][cstr] as a thin pointer.

[cstr]: https://doc.rust-lang.org/core/ffi/struct.CStr.html

Such a change to existing types might be backwards-incompatible for code that
embeds those types as a `struct` field, because it would change the reference
layout. For example, the following code compiles in stable Rust v1.73 but would
be a compilation error if `&CStr` does not have the same layout as `&[u8]`.

```rust
struct ContainsCStr {
    cstr: core::ffi::CStr,
}
impl ContainsCStr {
    fn as_bytes(&self) -> &[u8] {
        unsafe { core::mem::transmute(self) }
    }
}
```

The above incompatibility of a redefined `&CStr` exists regardless of this RFC,
but it's worth noting that implementing `DynSized` would be a
backwards-incompatible change for existing DSTs.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

This design is less generic than some of the alternatives (including custom DSTs
and extern types), but has the advantage of being much more tightly scoped and
therefore is expected to have no major blockers. It directly addresses one of
the pain points for use of Rust in a low-level performance-sensitive codebase,
while avoiding large-scale language changes to the extent possible.

Without this change, people will continue to either use thick-pointer DSTs
(reducing performance relative to C), or write Rust types that claim to be
`Sized` but actually aren't (the infamous `_data: [u8; 0]` hack).

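For readers unfamiliar with the hack, the following is a minimal sketch of how it tends to look in practice. The names `RequestHeader`, `payload`, and `demo_payload` are invented for this example, and the convention that `size` counts header plus payload is an assumption. The compiler believes the type is 8 bytes, so every access to the trailing bytes goes through raw pointers and an `unsafe` contract that nothing checks.

```rust
use std::mem::size_of;
use std::slice;

#[allow(dead_code)]
#[repr(C)]
struct RequestHeader {
    size: u32, // assumed to count header plus payload bytes
    id: u16,
    flags: u16,
    _data: [u8; 0], // marks where the payload starts; adds no size
}

/// Returns the payload that follows the header in memory.
///
/// # Safety
/// `ptr` must be aligned for `RequestHeader` and point into a live allocation
/// that holds an initialized header followed by the payload bytes that its
/// `size` field declares.
unsafe fn payload<'a>(ptr: *const RequestHeader) -> &'a [u8] {
    unsafe {
        let total = (*ptr).size as usize;
        let len = total.saturating_sub(size_of::<RequestHeader>());
        let start = (ptr as *const u8).add(size_of::<RequestHeader>());
        slice::from_raw_parts(start, len)
    }
}

/// Hypothetical demo: lays out a 12-byte request in an aligned buffer and
/// reads its 4-byte payload back.
fn demo_payload() -> Vec<u8> {
    let mut words = [0u64; 2]; // u64 storage keeps the buffer aligned
    {
        let bytes = unsafe { slice::from_raw_parts_mut(words.as_mut_ptr() as *mut u8, 16) };
        bytes[0..4].copy_from_slice(&12u32.to_ne_bytes()); // size
        bytes[4..6].copy_from_slice(&1u16.to_ne_bytes()); // id
        bytes[8..12].copy_from_slice(b"ping"); // payload
    }
    let view = unsafe { payload(words.as_ptr() as *const RequestHeader) };
    view.to_vec()
}
```

Note that `size_of::<RequestHeader>()` and `size_of_val` both report 8 here regardless of the payload, which is the mismatch between the declared and actual size that this RFC aims to remove.
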
# Prior art
[prior-art]: #prior-art

The canonical prior art is the C language idiom of a `struct` that's implicitly
followed by a dynamically-sized value. This idiom was standardized in C99 under
the term "flexible array member":

> As a special case, the last element of a structure with more than one named
> member may have an incomplete array type; this is called a flexible array
> member. [...] However, when a `.` (or `->`) operator has a left operand that
> is (a pointer to) a structure with a flexible array member and the right
> operand names that member, it behaves as if that member were replaced with the
> longest array (with the same element type) that would not make the structure
> larger than the object being accessed;

The use of flexible array members (either with C99 syntax or not) is widespread
in C APIs, especially when sending structured data between processes ([IPC]) or
between a process and the kernel. For example, the Linux kernel's [FUSE]
protocol communicates with userspace via length-prefixed dynamically-sized
request/response buffers.

They're also common when implementing low-level network protocols, which have
length-delimited frames comprising a fixed-layout header followed by a variable
amount of payload data.

[IPC]: https://en.wikipedia.org/wiki/Inter-process_communication
[FUSE]: https://www.kernel.org/doc/html/v6.3/filesystems/fuse.html

In the context of Rust, the two RFCs mentioned earlier both cover thin-pointer
DSTs as part of their more general extensions to the Rust type system:
- [[RFC #1861] `extern_types`](https://rust-lang.github.io/rfcs/1861-extern-types.html)
- [[RFC #2580] `ptr_metadata`](https://rust-lang.github.io/rfcs/2580-ptr-meta.html)

Also, there have been non-approved RFC proposals involving thin-pointer DSTs:
- [[rfcs/pull#709] truly unsized types](https://github.com/rust-lang/rfcs/pull/709)
- [[rfcs/pull#1524] Custom Dynamically Sized Types](https://github.com/rust-lang/rfcs/pull/1524)
- [[rfcs/issues#2255] More implicit bounds (?Sized, ?DynSized, ?Move)](https://github.com/rust-lang/rfcs/issues/2255)

# Unresolved questions
[unresolved-questions]: #unresolved-questions

None so far.

# Future possibilities
[future-possibilities]: #future-possibilities

None so far. Further exploration of opaque types and/or custom pointer metadata
already has separate dedicated RFCs. This one is just to get an MVP for types
that should be `!Sized` without fat pointers.
