add simd example #13
base: main
Conversation
Sorry, came across by accident.
Looks like the SIMD mul technically processes one element; usually for matrices it is easier to write a vector dot product and then build the rest on top of it. Additionally, floating-point arithmetic is more interesting, as floating-point operations are more expensive.
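As a rough sketch of "building the rest on top of it" (illustrative helper names, not code from this PR; the right-hand matrix is assumed to be pre-transposed, and a SIMD version of dot is discussed further down in this thread):

fn dot(a: &[u64], b: &[u64]) -> u64 {
    // Plain scalar dot product; the SIMD variant would replace this inner loop.
    a.iter().zip(b).fold(0u64, |s, (x, y)| s.wrapping_add(x.wrapping_mul(*y)))
}

fn mat_mul(a: &[Vec<u64>], b_transposed: &[Vec<u64>]) -> Vec<Vec<u64>> {
    // Each output cell is one dot product of a row of `a` with a column of `b`.
    a.iter()
        .map(|row| b_transposed.iter().map(|col| dot(row, col)).collect())
        .collect()
}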
examples/rust/simd/src/lib.rs
Outdated
fn mul(a: u64, b: u64) -> u64 {
    let va: v128 = u64x2_splat(a);
    let vb: v128 = u64x2_splat(b);
    let c = u64x2_extract_lane::<1>(i64x2_mul(va, vb));
    c
}
I think this is technically a scalar multiplication - it fills all lanes with the same value and then extracts just one value out of the result.
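For contrast, a genuinely vectorized multiply would compute two independent products per instruction instead of splatting one scalar into both lanes. A sketch (assuming the wasm32 target with the simd128 feature; not code from this PR):

use core::arch::wasm32::*;

fn mul_pairwise(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    let va = u64x2(a[0], a[1]); // two different elements per operand,
    let vb = u64x2(b[0], b[1]); // not the same value in every lane
    let vc = i64x2_mul(va, vb); // wrapping 64-bit multiply of both lanes at once
    [u64x2_extract_lane::<0>(vc), u64x2_extract_lane::<1>(vc)]
}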
Thank you so much for your review. I wasn't familiar with SIMD and made a mistake. Would appreciate feedback on the new code.
examples/rust/simd/src/lib.rs
Outdated
fn dot(a: Vec<u64>, b: Vec<u64>) -> u64 {
    assert!(a.len() == b.len());
    let mut sum: u64 = 0;
    for i in 0..a.len() {
        sum += Self::mul(a[i], b[i]);
    }
    sum
}
Dot product is the smallest unit of work in matrix multiplication that can be implemented in SIMD. It usually works by taking N elements from the first array and N from the second array, multiplying them via SIMD, then adding the N results to an intermediate vector sum (N is the number of lanes). The intermediate sum is added up at the end; also, for input sizes not divisible by N, the remainder needs to be calculated manually.
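A minimal sketch of that shape, assuming the wasm32 target with the simd128 feature and N = 2 lanes of u64 (names are illustrative, not the PR's final code):

use core::arch::wasm32::*;

fn u64x2_dot(a: &[u64], b: &[u64]) -> u64 {
    assert_eq!(a.len(), b.len());
    // Intermediate vector sum: one running total per lane.
    let mut acc: v128 = u64x2_splat(0);
    for (ca, cb) in a.chunks_exact(2).zip(b.chunks_exact(2)) {
        let va = u64x2(ca[0], ca[1]);
        let vb = u64x2(cb[0], cb[1]);
        acc = i64x2_add(acc, i64x2_mul(va, vb)); // multiply-add two lanes at a time
    }
    // Add up the intermediate sum at the end.
    let mut sum = u64x2_extract_lane::<0>(acc)
        .wrapping_add(u64x2_extract_lane::<1>(acc));
    // Handle the remainder manually when the length is not divisible by the lane count.
    if a.len() % 2 == 1 {
        let last = a.len() - 1;
        sum = sum.wrapping_add(a[last].wrapping_mul(b[last]));
    }
    sum
}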
Updated. Please let me know if there is anything I could do better. I assume the floating-point implementation should be similar (please let me know if it isn't), so I will update the floating-point examples once this is ok :)
u64x2-scalar-mul: func(a: u64, b: list<u64>) -> list<u64>
u64x2-dot: func(a: list<u64>, b: list<u64>) -> u64
u64x2-inner: func(a: list<u64>, b: list<u64>) -> list<u64>
u64x2-mat-mul: func(a: list<list<u64>>, b: list<list<u64>>) -> list<list<u64>>
Rather than using list you should use SingleStore-compatible packed 64-bit vectors:
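Purely as an assumption about what the packed format means here (e.g. a list<u8> blob where every 8 bytes is one little-endian u64), the pack/unpack helpers might look like:

fn unpack_u64s(blob: &[u8]) -> Vec<u64> {
    // Assumed layout: every 8 bytes is one little-endian u64.
    blob.chunks_exact(8)
        .map(|c| u64::from_le_bytes(c.try_into().unwrap()))
        .collect()
}

fn pack_u64s(vals: &[u64]) -> Vec<u8> {
    vals.iter().flat_map(|v| v.to_le_bytes()).collect()
}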
use core::arch::wasm32::*;

impl simd::Simd for Simd {
    fn u64x2_scalar_mul(a: u64, b: Vec<u64>) -> Vec<u64> {
please add docstrings to each function explaining its purpose
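Covering both points, a documented body for u64x2_scalar_mul might look roughly like this (a sketch under the same wasm32/simd128 assumption, not the PR's actual code):

use core::arch::wasm32::*;

/// Multiplies every element of `b` by the scalar `a`, two u64 lanes per
/// SIMD instruction, with the trailing odd element handled separately.
fn u64x2_scalar_mul(a: u64, b: Vec<u64>) -> Vec<u64> {
    let va = u64x2_splat(a); // splatting is appropriate here: `a` really is one scalar
    let mut out = Vec::with_capacity(b.len());
    for pair in b.chunks_exact(2) {
        let vc = i64x2_mul(va, u64x2(pair[0], pair[1])); // wrapping 64-bit multiply per lane
        out.push(u64x2_extract_lane::<0>(vc));
        out.push(u64x2_extract_lane::<1>(vc));
    }
    if let [last] = b.chunks_exact(2).remainder() {
        out.push(a.wrapping_mul(*last));
    }
    out
}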
Implements #11