Commit 5752662

Merge pull request #2043 from oli-obk/is_aligned
Add `align_offset` intrinsic and `[T]::align_to` function
2 parents 9fad211 + 4253d13 commit 5752662

1 file changed

+296
-0
lines changed

text/2043-is-aligned-intrinsic.md

@@ -0,0 +1,296 @@
- Feature Name: align_to_intrinsic
- Start Date: 2017-06-20
- RFC PR: https://github.com/rust-lang/rfcs/pull/2043
- Rust Issue: https://github.com/rust-lang/rust/issues/44488

# Summary
[summary]: #summary

Add an intrinsic (`fn align_offset(ptr: *const (), align: usize) -> usize`)
which returns the number of bytes that need to be skipped in order to correctly align the
pointer `ptr` to `align`.

The intrinsic is reexported as a method on `*const T` and `*mut T`.

Also add an `unsafe fn align_to<U>(&self) -> (&[T], &[U], &[T])` method to `[T]`.
The method simplifies the common use case, returning
the unaligned prefix, the aligned center part and the unaligned trailing elements.
The function is unsafe because it produces a `&U` to the memory location of a `T`,
which might expose padding bytes or violate invariants of `T` or `U`.

# Motivation
[motivation]: #motivation

The standard library (and most likely many crates) uses code like

```rust
let is_aligned = (ptr as usize) & ((1 << (align - 1)) - 1) == 0;
let is_2_word_aligned = ((ptr as usize + index) & (usize_bytes - 1)) == 0;
let is_t_aligned = ((ptr as usize) % std::mem::align_of::<T>()) == 0;
```

to check whether a pointer is aligned in order to perform optimizations like
reading multiple bytes at once. Such code is easy to get wrong and hard to
read (which increases the chance of future breakage), and it also makes it
impossible for `miri` to evaluate such statements. This
means that `miri` cannot do utf8-checking, since that code contains such
optimizations. Without utf8-checking, rustc's future const evaluation would not
be able to convert a `[u8]` into a `str`.

# Detailed design
[design]: #detailed-design

## supporting intrinsic

Add a new intrinsic

```rust
fn align_offset(ptr: *const (), align: usize) -> usize;
```

which takes an arbitrary pointer it never reads from and a desired alignment
and returns the number of bytes that the pointer needs to be offset in order
to make it aligned to the desired alignment. It is perfectly valid for an
implementation to always yield `usize::max_value()` to signal that the pointer
cannot be aligned. The caller needs to check whether the returned offset
is in-bounds of the allocation that the pointer points into. Since
`usize::max_value()` will never be in-bounds of any allocation, the caller
cannot act upon such a returned offset.

It might be expected that the maximum offset returned is `align - 1`, but as
the motivation of the RFC states, `miri` cannot guarantee that a pointer can
be aligned regardless of the operations done on it.

Most implementations will expand this intrinsic to

```rust
fn align_offset(ptr: *const (), align: usize) -> usize {
    let offset = ptr as usize % align;
    if offset == 0 {
        0
    } else {
        align - offset
    }
}
```

The `align` parameter must be a power of two and smaller than `2^32`.
Usually one should pass in the result of an `align_of` call.

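
To make the arithmetic of the expansion concrete, here is a minimal sketch that works on plain addresses instead of pointers. The helper name `align_offset_addr` is hypothetical and not part of this RFC; it assumes `align` is a non-zero power of two, as required above. For the address `0x1003` and an alignment of `8`, the misalignment is `3`, so `5` bytes have to be skipped.

```rust
/// Hypothetical helper (not part of the RFC): the same computation as the
/// expansion above, but on a plain address so it can be tried without pointers.
/// Assumes `align` is a non-zero power of two.
fn align_offset_addr(addr: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    // For a power of two, `addr % align` is the same as `addr & (align - 1)`.
    let misalignment = addr & (align - 1);
    if misalignment == 0 {
        0
    } else {
        align - misalignment
    }
}
```

An already aligned address yields `0`, and the result of this plain computation never exceeds `align - 1`. The intrinsic itself makes no such promise, since it may also return `usize::max_value()`.
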
## standard library functions

Add a new method `align_offset` to `*const T` and `*mut T`, which forwards to the
`align_offset` intrinsic.

Add two new methods `align_to` and `align_to_mut` to the slice type.

```rust
impl<T> [T] {
    /* ... other methods ... */
    unsafe fn align_to<U>(&self) -> (&[T], &[U], &[T]) { /**/ }
    unsafe fn align_to_mut<U>(&mut self) -> (&mut [T], &mut [U], &mut [T]) { /**/ }
}
```

`align_to` can be implemented as

```rust
unsafe fn align_to<U>(&self) -> (&[T], &[U], &[T]) {
    use core::mem::{size_of, align_of};
    assert!(size_of::<U>() != 0 && size_of::<T>() != 0, "don't use `align_to` with zsts");
    if size_of::<U>() % size_of::<T>() == 0 {
        let align = align_of::<U>();
        let size = size_of::<U>();
        let source_size = size_of::<T>();
        // number of bytes that need to be skipped until the pointer is aligned
        let offset = self.as_ptr().align_offset(align);
        // if `align_of::<U>() <= align_of::<T>()`, or if pointer is accidentally aligned, then `offset == 0`
        //
        // due to `size_of::<U>() % size_of::<T>() == 0`,
        // the fact that `size_of::<T>() >= align_of::<T>()`,
        // and the fact that `align_of::<U>() > align_of::<T>()` if `offset != 0` we know
        // that `offset % source_size == 0`
        let head_count = offset / source_size;
        // clamp to the slice length; this also covers an offset of `usize::max_value()`
        let split_position = core::cmp::min(self.len(), head_count);
        let (head, tail) = self.split_at(split_position);
        // might be zero if not enough elements
        let mid_count = tail.len() * source_size / size;
        let mid = core::slice::from_raw_parts::<U>(tail.as_ptr() as *const _, mid_count);
        // skip the `T` elements that are now covered by `mid`
        let tail = &tail[mid_count * size / source_size..];
        (head, mid, tail)
    } else {
        // can't properly fit a U into a sequence of `T`
        // FIXME: use GCD(size_of::<U>(), size_of::<T>()) as minimum `mid` size
        (self, &[], &[])
    }
}
```

on all current platforms. `align_to_mut` is expanded accordingly.

Users of the functions must process all the returned slices and
cannot rely on any behaviour except that the `&[U]`'s elements are correctly
aligned and that all bytes of the original slice are present in the resulting
three slices.

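
The rule that callers must process all three slices can be illustrated with a small sketch. It uses the `align_to` method that is available on slices in today's standard library, which may differ in detail from the sketch in this RFC. The function name `sum_bytes` is made up for this illustration.

```rust
/// Illustrative helper (name is hypothetical): sums all bytes of a slice,
/// reading the aligned middle part as `u32` words.
fn sum_bytes(bytes: &[u8]) -> u64 {
    // SAFETY: every bit pattern is a valid `u32`, and `u8` has no padding
    // or invariants that could be violated by the reinterpretation.
    let (head, mid, tail) = unsafe { bytes.align_to::<u32>() };

    let mut total: u64 = 0;
    // The unaligned prefix is processed byte by byte.
    for &b in head {
        total += u64::from(b);
    }
    // The aligned middle is processed one word at a time.
    for &word in mid {
        total += word.to_ne_bytes().iter().map(|&b| u64::from(b)).sum::<u64>();
    }
    // The trailing elements are processed byte by byte again.
    for &b in tail {
        total += u64::from(b);
    }
    total
}
```

The code does not assume anything about how long `head`, `mid` or `tail` are. An implementation that returns everything in `head` still produces the correct sum.
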
# How We Teach This
[how-we-teach-this]: #how-we-teach-this

## By example

On most platforms alignment is a well known concept independent of Rust.
Currently unsafe Rust code doing alignment checks needs to reproduce the known
patterns from C, which are hard to read and prone to errors when modified later.

Thus, whenever pointers need to be manually aligned, the developer is given a
choice:

1. In the case where processing the initial unaligned part might abort the entire
   process, use `align_offset`
2. If it is likely that all bytes are going to get processed, use `align_to`
    * `align_to` has a slight overhead for creating the slices in case not all
      slices are used

### Example 1 (pointers)

The standard library uses an alignment optimization for quickly
skipping over ascii code while utf8-checking a byte slice. The current code
looks as follows:

```rust
// Ascii case, try to skip forward quickly.
// When the pointer is aligned, read 2 words of data per iteration
// until we find a word containing a non-ascii byte.
let ptr = v.as_ptr();
let align = (ptr as usize + index) & (usize_bytes - 1);
```

With the `align_offset` method the code can be changed to

```rust
let ptr = v.as_ptr();
let align = unsafe {
    // the offset is safe, because `index` is guaranteed inbounds
    ptr.offset(index as isize).align_offset(usize_bytes)
};
```

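
Since the returned offset may be `usize::max_value()`, a caller has to check it against the slice bounds before acting on it. The following sketch shows one way to do that with the `align_offset` method found on raw pointers in today's standard library. The function name `next_aligned_index` is hypothetical, and the use of `wrapping_add` is a choice made here to avoid `unsafe`, not something this RFC prescribes.

```rust
/// Hypothetical helper: index of the first byte at or after `index` whose
/// address is `align`-aligned, or `None` if no such byte lies inside `v`.
fn next_aligned_index(v: &[u8], index: usize, align: usize) -> Option<usize> {
    assert!(align.is_power_of_two() && index <= v.len());
    // The pointer is only inspected, never dereferenced, so `wrapping_add` suffices.
    let ptr = v.as_ptr().wrapping_add(index);
    // For a `u8` pointer the returned element count is a byte count.
    let offset = ptr.align_offset(align);
    // An offset of `usize::MAX` means "cannot be aligned"; `checked_add`
    // turns the resulting overflow into `None` for any `index > 0`.
    let aligned = index.checked_add(offset)?;
    // Only report positions that are in-bounds of the slice.
    if aligned < v.len() {
        Some(aligned)
    } else {
        None
    }
}
```

A `None` result tells the caller to fall back to the byte-by-byte path for the rest of the slice.
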
### Example 2 (slices)

The `memchr` impl in the standard library explicitly uses the three phases of
the `align_to` functions:

```rust
// Split `text` in three parts
// - unaligned initial part, before the first word aligned address in text
// - body, scan by 2 words at a time
// - the last remaining part, < 2 word size
let len = text.len();
let ptr = text.as_ptr();
let usize_bytes = mem::size_of::<usize>();

// search up to an aligned boundary
let align = (ptr as usize) & (usize_bytes - 1);
let mut offset;
if align > 0 {
    offset = cmp::min(usize_bytes - align, len);
    if let Some(index) = text[..offset].iter().position(|elt| *elt == x) {
        return Some(index);
    }
} else {
    offset = 0;
}

// search the body of the text
let repeated_x = repeat_byte(x);

if len >= 2 * usize_bytes {
    while offset <= len - 2 * usize_bytes {
        unsafe {
            let u = *(ptr.offset(offset as isize) as *const usize);
            let v = *(ptr.offset((offset + usize_bytes) as isize) as *const usize);

            // break if there is a matching byte
            let zu = contains_zero_byte(u ^ repeated_x);
            let zv = contains_zero_byte(v ^ repeated_x);
            if zu || zv {
                break;
            }
        }
        offset += usize_bytes * 2;
    }
}

// find the byte after the point the body loop stopped
text[offset..].iter().position(|elt| *elt == x).map(|i| offset + i)
```

With the `align_to` function this could be written as

```rust
// Split `text` in three parts
// - unaligned initial part, before the first word aligned address in text
// - body, scan by 2 words at a time
// - the last remaining part, < 2 word size
let two_word_bytes = mem::size_of::<(usize, usize)>();

// `align_to` is unsafe; reading the bytes of `text` as `usize` pairs is fine,
// since every bit pattern is a valid `usize`
let (head, mid, tail) = unsafe { text.align_to::<(usize, usize)>() };

// search up to an aligned boundary
if let Some(index) = head.iter().position(|elt| *elt == x) {
    return Some(index);
}

// search the body of the text
let repeated_x = repeat_byte(x);

let position = mid.iter().position(|two| {
    // break if there is a matching byte
    let zu = contains_zero_byte(two.0 ^ repeated_x);
    let zv = contains_zero_byte(two.1 ^ repeated_x);
    zu || zv
});

if let Some(index) = position {
    let offset = index * two_word_bytes + head.len();
    return text[offset..].iter().position(|elt| *elt == x).map(|i| offset + i);
}

// find the byte in the trailing unaligned part
// (`mid.len()` counts two-word elements, so convert it to bytes)
tail.iter().position(|elt| *elt == x).map(|i| head.len() + mid.len() * two_word_bytes + i)
```

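
Both versions of the example call `repeat_byte` and `contains_zero_byte` without showing them. The sketch below is one common way to write such helpers, using the well-known "has a zero byte" bit trick. It is an assumption made for illustration and not necessarily the exact definition used in the standard library. The constant names `LO` and `HI` are made up here.

```rust
// Assumed helper constants: `0x01` and `0x80` repeated in every byte of a word.
const LO: usize = usize::MAX / 255;
const HI: usize = LO * 128;

/// Returns a word in which every byte equals `b`.
fn repeat_byte(b: u8) -> usize {
    (b as usize) * LO
}

/// Returns `true` if at least one byte of `x` is zero.
///
/// Subtracting `LO` turns a zero byte into `0xFF` by borrowing. Masking with
/// `!x` keeps only bytes whose high bit was originally clear, and `HI`
/// selects exactly those high bits.
fn contains_zero_byte(x: usize) -> bool {
    x.wrapping_sub(LO) & !x & HI != 0
}
```

XOR-ing a word with `repeat_byte(x)` zeroes exactly the bytes that equal `x`, which is why the examples test `contains_zero_byte(u ^ repeated_x)`.
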
## Documentation

A lint could be added to `clippy` which detects hand-written alignment checks and
suggests using the `align_to` function instead.

The `std::mem::align_of` function's documentation should point to `[T]::align_to`
in order to increase the visibility of the function. The documentation of
`std::mem::align_of` should note that it is unidiomatic to manually align pointers,
since that might not be supported on all platforms and is prone to implementation
errors.

# Drawbacks
[drawbacks]: #drawbacks

None known to the author.

# Alternatives
[alternatives]: #alternatives

## Duplicate functions without optimizations for miri

Miri could intercept calls to functions known to do alignment checks on pointers
and roll its own implementation for them. This doesn't scale well and is prone
to errors due to code duplication.

# Unresolved questions
[unresolved]: #unresolved-questions

* produce a lint in case `size_of::<U>() % size_of::<T>() != 0` and in case the expansion
  is not part of a monomorphisation, since in that case `align_to` is statically
  known to never be effective
