Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rodinia/gaussian sycl and ndpx implementation #307

Merged
merged 4 commits into from
Nov 4, 2023

Conversation

roxx30198
Copy link
Contributor

@roxx30198 roxx30198 commented Oct 21, 2023

Added the sycl and numba-dpex implementation for gaussian elimination benchmark for rodinia.
Runtime for 100x100 matrix with fp64 on cpu:
sycl:
max execution times: 12.190916ms (12190916 ns)
min execution times: 6.433963ms (6433963 ns)
median execution times: 7.595746ms (7595746 ns)
repeats: 10
numba_dpex_k:
max execution times: 3.994273296s (3994273296 ns)
min execution times: 2.814416425s (2814416425 ns)
median execution times: 3.585554875s (3585554875 ns)
repeats: 10

Copy link
Contributor

@adarshyoga adarshyoga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks in good shape!

@roxx30198 roxx30198 force-pushed the gaussian branch 2 times, most recently from 640d447 to 77fafac Compare October 25, 2023 20:41
@adarshyoga
Copy link
Contributor

median execution times: 7.595746ms (7595746 ns)
repeats: 10
numba_dpex_k:
max execution times: 3.994273296s (3994273296 ns)
min execution times: 2.814416425s (2814416425 ns)
median execution times: 3.585554875s (3585554875 ns)

These numbers look scary. 500X slowdown in numba_dpex over sycl.
I think we should carefully review the range selection in both implementations. I suspect the parallelism might be different.
I see that sycl impl is using range multiplication. Since range arithmetic is not yet supported in numba-dpex, you are manually computing the range in the numba-dpex impl. We need to make sure that there is not mismatch in the range used in the two impls.

@ZzEeKkAa
Copy link
Contributor

ZzEeKkAa commented Oct 30, 2023

@roxx30198 please split commits so one commit is responsible for infrastructure changes and another one is for the gaussian implementation

And let's set the input the way dpex implementation does not exceed 1s for S input.

@adarshyoga does input sizes for M and L sounds reasonable for you?

@roxx30198 roxx30198 force-pushed the gaussian branch 2 times, most recently from 0074445 to 1820e49 Compare October 30, 2023 21:00
@adarshyoga
Copy link
Contributor

@adarshyoga does input sizes for M and L sounds reasonable for you?

We would need to experiment with this workload to arrive at an appropriate M and L. I think we can merge this PR with the current values of M and L. We can change them as we do deeper analysis into this workload and Rodinia, in general.

@roxx30198 roxx30198 force-pushed the gaussian branch 8 times, most recently from 4179cf5 to 5bec653 Compare November 1, 2023 21:05
.pre-commit-config.yaml Outdated Show resolved Hide resolved
@roxx30198 roxx30198 force-pushed the gaussian branch 7 times, most recently from 6855db3 to 263776f Compare November 3, 2023 16:38
@ZzEeKkAa ZzEeKkAa force-pushed the gaussian branch 2 times, most recently from 0ba2a06 to 8da8d99 Compare November 3, 2023 23:41
Copy link
Contributor

@ZzEeKkAa ZzEeKkAa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

@ZzEeKkAa ZzEeKkAa enabled auto-merge November 3, 2023 23:43
@ZzEeKkAa ZzEeKkAa merged commit 07f59c9 into IntelPython:main Nov 4, 2023
38 checks passed
@roxx30198 roxx30198 deleted the gaussian branch November 4, 2023 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants