Caching look up table (LUT) in work-group local memory #4

Open
wants to merge 3 commits into
base: parallel
from

itzmeanjan
Owner

Harpocrates SYCL kernels can now process input

  • ~18x faster, when running on Intel Iris Xe MAX Graphics 🔥
  • ~10% faster, when running on Nvidia Tesla V100 GPU
  • ~19% faster, when running on Intel UHD Graphics P630

because the frequently accessed (inverse) look-up table (LUT) is first explicitly cached in work-group local memory, which makes it cheaper to access for every work-item in that work-group. The work-group leader is responsible for copying the 256-byte look-up table into the faster work-group local memory; all other work-items wait for the work-group leader until that copy has finished.
