
Use subgroup operations when possible #553

Open
beaufortfrancois opened this issue Aug 20, 2024 · 5 comments

@beaufortfrancois
Contributor

beaufortfrancois commented Aug 20, 2024

Subgroups can substantially enhance performance and adaptability for machine learning tasks on GPUs. Since they're now available as an origin trial, https://webllm.mlc.ai/ could take advantage of them.

I'm not sure yet what's needed to make it work; I assume some work in Apache TVM as well.

I highly recommend you check out the quick-start guide at https://developer.chrome.com/blog/new-in-webgpu-128#experimenting_with_subgroups. For info, only subgroupBallot and subgroupBroadcast are available for now, but more built-in functions such as subgroupAdd, subgroupAll, subgroupElect, and subgroupShuffle will be added in the near future.
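For context, here is a minimal WGSL sketch of the two built-ins that are available today. It assumes the page has requested the `subgroups` device feature as described in the quick-start guide; the buffer bindings and kernel are hypothetical, and the signatures follow the WGSL subgroups proposal (they may differ slightly during the origin trial):

```wgsl
// Hypothetical kernel: count how many invocations in the subgroup hold a
// non-zero input, and let every invocation read the subgroup leader's value.
enable subgroups;

@group(0) @binding(0) var<storage, read> input : array<f32>;
@group(0) @binding(1) var<storage, read_write> output : array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) gid : vec3u,
        @builtin(subgroup_invocation_id) sg_id : u32) {
  let v = input[gid.x];

  // subgroupBallot returns a vec4<u32> bitmask of the active invocations
  // for which the predicate is true.
  let ballot = subgroupBallot(v != 0.0);
  let non_zero_count = countOneBits(ballot.x) + countOneBits(ballot.y) +
                       countOneBits(ballot.z) + countOneBits(ballot.w);

  // subgroupBroadcast copies the value held by the invocation whose
  // subgroup_invocation_id is 0 to every invocation in the subgroup.
  let leader_value = subgroupBroadcast(v, 0);

  // Example use: keep the leader's value only if any invocation saw a non-zero input.
  output[gid.x] = select(0.0, leader_value, non_zero_count > 0u);
}
```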

@beaufortfrancois
Contributor Author

@CharlieFRuan @tqchen What are your thoughts on this?

@tqchen
Contributor

tqchen commented Sep 3, 2024

This is great, subgroup shuffle can be useful for reduction operations. We already have warp shuffle support in the Metal backend, so maybe we can try adding a codegen backend for WebGPU.

@beaufortfrancois
Contributor Author

The following subgroup shuffle functions are available in Chrome 129 (currently beta); a sketch of how they can be combined into a subgroup reduction follows the list:

  • subgroupShuffle(value, id): Returns value from the active invocation whose subgroup_invocation_id matches id.
  • subgroupShuffleXor(value, mask): Returns value from the active invocation whose subgroup_invocation_id matches subgroup_invocation_id ^ mask. mask must be dynamically uniform.
  • subgroupShuffleUp(value, delta): Returns value from the active invocation whose subgroup_invocation_id matches subgroup_invocation_id - delta.
  • subgroupShuffleDown(value, delta): Returns value from the active invocation whose subgroup_invocation_id matches subgroup_invocation_id + delta.
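As a rough illustration, here is a minimal WGSL sketch of the kind of subgroup-wide reduction mentioned above, built on subgroupShuffleXor. It again assumes the `subgroups` feature and enable directive; the bindings and dispatch setup are hypothetical, and it relies on subgroup_size being a power of two:

```wgsl
enable subgroups;

@group(0) @binding(0) var<storage, read> input : array<f32>;
@group(0) @binding(1) var<storage, read_write> output : array<f32>;

@compute @workgroup_size(64)
fn reduce(@builtin(global_invocation_id) gid : vec3u,
          @builtin(subgroup_size) sg_size : u32) {
  var sum = input[gid.x];

  // Butterfly (XOR) reduction: after log2(sg_size) steps, every invocation
  // holds the sum over its whole subgroup. The mask is derived only from
  // sg_size, so it stays dynamically uniform across the subgroup.
  for (var offset = sg_size / 2u; offset > 0u; offset = offset / 2u) {
    sum = sum + subgroupShuffleXor(sum, offset);
  }

  // Every invocation now holds its subgroup's total; write it out.
  // (A real kernel would combine these partial sums across subgroups.)
  output[gid.x] = sum;
}
```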

@beaufortfrancois
Contributor Author

@tqchen @CharlieFRuan Is this being implemented in Apache TVM?

@CharlieFRuan
Contributor

Hi @beaufortfrancois, really appreciate the info and suggestions! We think it's a good idea to have this implemented in the TVM flow. Unfortunately, we're a bit short on bandwidth at the moment. We'll revisit this in the future!
