Commit e78c51a

Auto merge of #50398 - llogiq:memchr-nano-opt, r=nagisa
nano-optimization for memchr::repeat_byte

This replaces the multiple shifts and bitwise ORs with a single multiplication. In my benchmarks this performs equally well or better, especially on 64-bit systems (it shaves a stable nanosecond on my Skylake). This may go against conventional wisdom, but the shifts and bitwise ORs cannot be pipelined because of hard data dependencies.

While it may or may not be worthwhile from an optimization standpoint, it also reduces code size, so there's basically no downside.
2 parents 841e0cc + 1cefb5c commit e78c51a

1 file changed: +2 -13 lines changed
src/libcore/slice/memchr.rs

@@ -39,21 +39,10 @@ fn repeat_byte(b: u8) -> usize {
     (b as usize) << 8 | b as usize
 }
 
-#[cfg(target_pointer_width = "32")]
+#[cfg(not(target_pointer_width = "16"))]
 #[inline]
 fn repeat_byte(b: u8) -> usize {
-    let mut rep = (b as usize) << 8 | b as usize;
-    rep = rep << 16 | rep;
-    rep
-}
-
-#[cfg(target_pointer_width = "64")]
-#[inline]
-fn repeat_byte(b: u8) -> usize {
-    let mut rep = (b as usize) << 8 | b as usize;
-    rep = rep << 16 | rep;
-    rep = rep << 32 | rep;
-    rep
+    (b as usize) * (::usize::MAX / 255)
 }
 
 /// Return the first index matching the byte `x` in `text`.
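
The change relies on `usize::MAX / 255` being the constant whose every byte is `0x01`, so multiplying it by `b` copies `b` into each byte. The following is a minimal sketch of that equivalence, not code from the commit: it uses `u64` instead of `usize` so that it behaves the same on any target, and the names `repeat_byte_shifts` and `repeat_byte_mul` are made up for illustration.

```rust
/// Shift-and-or version, modeled on the 64-bit variant removed by the diff.
/// Assumption: u64 stands in for usize so the sketch is target-independent.
fn repeat_byte_shifts(b: u8) -> u64 {
    let mut rep = (b as u64) << 8 | b as u64;
    rep = rep << 16 | rep; // each step depends on the previous result
    rep = rep << 32 | rep;
    rep
}

/// Multiplication version, modeled on the line added by the diff.
fn repeat_byte_mul(b: u8) -> u64 {
    // u64::MAX / 255 == 0x0101_0101_0101_0101, one 0x01 per byte.
    // The product cannot overflow: 255 * (u64::MAX / 255) == u64::MAX.
    (b as u64) * (u64::MAX / 255)
}

fn main() {
    assert_eq!(u64::MAX / 255, 0x0101_0101_0101_0101);

    // Exhaustively check that both versions agree for every byte value.
    for b in 0..=u8::MAX {
        assert_eq!(repeat_byte_shifts(b), repeat_byte_mul(b));
    }
}
```

The same argument holds for a 32-bit `usize`, where `usize::MAX / 255` is `0x0101_0101`, which would explain why one `#[cfg(not(target_pointer_width = "16"))]` function can replace the separate 32-bit and 64-bit ones.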