Add Arm's NEON vectorization #34
Comments
Please understand that the current implementation only supports AVX2 and SSE2, so it is impossible to enable this by default, as there is no NEON implementation. As for defaults in general, the problem is that when I started this library, NEON support in Rust's std was lacking, and I'm not sure whether the gaps have been filled enough to implement it yet |
Most of the features supported by LLVM have been implemented. As far as I understood the thread, the remaining unsupported features have not been implemented in LLVM itself. The documentation also describes many NEON instructions, some of them available since Rust 1.59.0: https://doc.rust-lang.org/core/arch/arm/index.html |
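For illustration, here is a minimal sketch of what using those intrinsics looks like on stable Rust. It is not code from xxhash-rust: `add4` is a hypothetical helper, and the example only assumes the `core::arch::aarch64` intrinsics `vld1q_u32`, `vaddq_u32` and `vst1q_u32`, with a scalar fallback for every other target.

```rust
/// Hypothetical helper: adds two 4-lane u32 vectors with wrapping arithmetic.
/// Uses NEON when it is enabled at compile time on aarch64.
#[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
fn add4(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    use core::arch::aarch64::{vaddq_u32, vld1q_u32, vst1q_u32};
    let mut out = [0u32; 4];
    // SAFETY: NEON is enabled at compile time, and every pointer refers to
    // four valid, initialized u32 values.
    unsafe {
        let va = vld1q_u32(a.as_ptr());
        let vb = vld1q_u32(b.as_ptr());
        vst1q_u32(out.as_mut_ptr(), vaddq_u32(va, vb));
    }
    out
}

/// Scalar fallback with the same behavior for all other targets.
#[cfg(not(all(target_arch = "aarch64", target_feature = "neon")))]
fn add4(a: [u32; 4], b: [u32; 4]) -> [u32; 4] {
    let mut out = [0u32; 4];
    for i in 0..4 {
        out[i] = a[i].wrapping_add(b[i]);
    }
    out
}

fn main() {
    println!("{:?}", add4([1, 2, 3, 4], [10, 20, 30, 40]));
}
```

Both variants share one signature, so callers do not need to know which path was compiled in.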
I have a macOS M2 laptop:
$ rustc --print cfg
debug_assertions
panic="unwind"
target_arch="aarch64"
target_endian="little"
target_env=""
target_family="unix"
target_feature="aes"
target_feature="crc"
target_feature="dit"
target_feature="dotprod"
target_feature="dpb"
target_feature="dpb2"
target_feature="fcma"
target_feature="fhm"
target_feature="flagm"
target_feature="fp16"
target_feature="frintts"
target_feature="jsconv"
target_feature="lor"
target_feature="lse"
target_feature="neon"
target_feature="paca"
target_feature="pacg"
target_feature="pan"
target_feature="pmuv3"
target_feature="ras"
target_feature="rcpc"
target_feature="rcpc2"
target_feature="rdm"
target_feature="sb"
target_feature="sha2"
target_feature="sha3"
target_feature="ssbs"
target_feature="vh"
target_has_atomic="128"
target_has_atomic="16"
target_has_atomic="32"
target_has_atomic="64"
target_has_atomic="8"
target_has_atomic="ptr"
target_os="macos"
target_pointer_width="64"
target_vendor="apple"
unix |
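The `target_feature="neon"` entry in that output is what compile-time `cfg` checks see. As a rough sketch (the function names are made up for this example, not taken from the library), a crate could check for NEON at compile time and at runtime like this, using the standard `is_aarch64_feature_detected!` macro:

```rust
/// True when NEON was enabled for the target at compile time,
/// i.e. `rustc --print cfg` lists target_feature="neon" on aarch64.
fn neon_at_compile_time() -> bool {
    cfg!(all(target_arch = "aarch64", target_feature = "neon"))
}

/// True when the CPU the program is running on reports NEON support.
#[cfg(target_arch = "aarch64")]
fn neon_at_runtime() -> bool {
    std::arch::is_aarch64_feature_detected!("neon")
}

/// NEON does not exist on other architectures, so report false there.
#[cfg(not(target_arch = "aarch64"))]
fn neon_at_runtime() -> bool {
    false
}

fn main() {
    println!("compile-time NEON: {}", neon_at_compile_time());
    println!("runtime NEON:      {}", neon_at_runtime());
}
```

On a target where the feature is on by default, as in the output above, the compile-time check is enough and no runtime dispatch is needed.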
My test:
Cargo.toml:
[package]
name = "public-id"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
base64 = "0.21.7"
uuid = { version = "1.7.0", features = ["v4", "v7", "v8"] }
#xxhash-rust = { version = "0.8.8", features = ["xxh3"] }
xxhash-rust = { git="https://github.com/DoumanAsh/xxhash-rust.git", branch="neon", features = ["xxh3"] }
src/main.rs:
use base64::{engine::general_purpose::URL_SAFE, Engine as _};

fn main() {
    let v: u64 = xxhash_rust::xxh3::xxh3_64(uuid::Uuid::new_v4().as_bytes());
    let b64 = URL_SAFE.encode(v.to_le_bytes());
    println!("Hello, world! {}", b64);
}
Both apps (with and without NEON optimizations) were compiled; hyperfine output:
$ hyperfine --warmup 1000 -N -u microsecond './public-id-neon-optimizations' ./public-id-no-optimizations
Benchmark 1: ./public-id-neon-optimizations
Time (mean ± σ): 728.7 µs ± 16.9 µs [User: 356.3 µs, System: 186.6 µs]
Range (min … max): 697.4 µs … 1069.2 µs 4060 runs
Benchmark 2: ./public-id-no-optimizations
Time (mean ± σ): 724.8 µs ± 15.2 µs [User: 355.4 µs, System: 184.2 µs]
Range (min … max): 692.9 µs … 920.6 µs 4129 runs
Summary
./public-id-no-optimizations ran
1.01 ± 0.03 times faster than ./public-id-neon-optimizations |
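A note on reading these numbers: each run hashes only the 16 bytes of one UUID, so the ~725 µs is most likely dominated by process startup rather than hashing, and as far as I know XXH3 handles inputs that short on a scalar path anyway. A rough sketch of an in-process measurement over a larger buffer is below. `throughput_mib_per_s` and the FNV-1a stand-in are hypothetical helpers written for this example so that it builds without dependencies; in a real test you would pass `xxhash_rust::xxh3::xxh3_64` as the hash function instead.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Stand-in hash (64-bit FNV-1a) so the sketch has no dependencies.
/// Assumption: swap in `xxhash_rust::xxh3::xxh3_64` for a real comparison.
fn fnv1a_64(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hashes `data` `iters` times inside one process and returns MiB/s.
fn throughput_mib_per_s<F: Fn(&[u8]) -> u64>(hash: F, data: &[u8], iters: u32) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from discarding the hashing work.
        black_box(hash(black_box(data)));
    }
    let secs = start.elapsed().as_secs_f64();
    let mib = (data.len() as f64 * f64::from(iters)) / (1024.0 * 1024.0);
    mib / secs
}

fn main() {
    // 8 MiB of deterministic filler bytes; large enough that startup cost
    // no longer dominates the measurement.
    let data: Vec<u8> = (0..8 * 1024 * 1024)
        .map(|i: u32| (i.wrapping_mul(2_654_435_761) >> 24) as u8)
        .collect();
    println!("{:.1} MiB/s", throughput_mib_per_s(fnv1a_64, &data, 4));
}
```

Build with `cargo build --release` for both variants and compare the reported throughput rather than the wall-clock time of the whole process.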
Well, it is good that Mac has NEON enabled by default |
stats for 256Mb of random data:
|
Release 0.8.9 with Neon |
Could you please enable optimizations for MacBooks by default, as you did for x86_64 CPUs?