
WASM32-WASI NN support #2520

Open
profitgrowinginnovator opened this issue Nov 21, 2024 · 6 comments

Comments

@profitgrowinginnovator

Feature description

There are already several requests for WASM support (web workers, ...). Please add a wasm32-wasi target with wasi-nn WIT support.

Feature motivation

Solutions like WasmEdge support LLM inference (https://wasmedge.org/docs/develop/rust/wasinn/llm_inference/), so it would be excellent if any Burn LLM model could be compiled to wasm32-wasi with wasi-nn WIT support and run in serverless, edge, and other constrained environments.

(Optional) Suggest a Solution

A wasi-nn feature flag would allow a cargo build to target wasm32-wasip2 and the NN WIT interface.

@antimora
Collaborator

Yes, I think it'd be useful.

So basically this would mean implementing a new backend that uses the wasi-nn APIs, with the code compiled to wasm32?

Linking wasi-nn: https://github.com/WebAssembly/wasi-nn/tree/main

@profitgrowinginnovator
Author

Yes. You should be able to add the wasm32-wasip1/2 target and get a wasm binary that you can run, for instance, on WasmEdge.
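The build-and-run flow described above might look roughly like the following. This is a sketch under stated assumptions: the `wasi-nn` cargo feature and the `my_model.wasm` output name are hypothetical, and WasmEdge must have a wasi-nn backend plugin installed for inference to work.

```shell
# Install the WASI target (wasip2 shown; wasip1 works similarly).
rustup target add wasm32-wasip2

# Build for the WASI target; the `wasi-nn` feature flag here is
# hypothetical, matching the feature suggested in this issue.
cargo build --release --target wasm32-wasip2 --features wasi-nn

# Run the resulting module on WasmEdge (with its wasi-nn plugin),
# granting the module access to the current directory for model files.
wasmedge --dir .:. target/wasm32-wasip2/release/my_model.wasm
```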

@nathanielsimard
Member

What are the benefits over building the ndarray backend or the wgpu backend for wasm? Just trying to understand the benefits of targeting wasm32-wasip versus wasm32-unknown-unknown.

@profitgrowinginnovator
Author

Great question. wasm32-unknown-unknown is mainly used inside browsers, where access to the file system is very restricted. WASI allows the use of WIT, standardized interfaces: some cover I/O, key-value storage, and cryptography, and one is especially interesting for AI, wasi-nn (neural networks). This interface provides access to ML accelerators, e.g. OpenVINO, CUDA, etc., which can make the code run significantly faster. WASI is made for device and cloud deployments, whereas wasm32-unknown-unknown is mainly for browsers. Hopefully that makes sense now. You might want to support wasip2 directly, although nn support there is still pending.
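For context, a guest program's view of the wasi-nn interface looks roughly like this. This is a minimal sketch assuming the Bytecode Alliance `wasi-nn` crate's high-level bindings (names such as `GraphBuilder` may differ between crate versions); `model.onnx` and the tensor shapes are placeholders. It only compiles for a wasm32-wasi* target and needs a host runtime with a wasi-nn backend (e.g. WasmEdge with the OpenVINO plugin), which is what delegates execution to the accelerator.

```rust
// Sketch only: assumes the `wasi-nn` crate's high-level API
// (GraphBuilder, GraphEncoding, ExecutionTarget); exact names may vary
// by crate version. Must be compiled for wasm32-wasi* and run on a host
// providing a wasi-nn backend.
use wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load a model; the host's wasi-nn backend performs the actual
    // execution, potentially on an accelerator (OpenVINO, CUDA, ...).
    let graph = GraphBuilder::new(GraphEncoding::Onnx, ExecutionTarget::CPU)
        .build_from_files(["model.onnx"])?; // placeholder model file

    let mut ctx = graph.init_execution_context()?;

    // Feed an input tensor (shape and data are placeholders).
    let input = vec![0f32; 3 * 224 * 224];
    ctx.set_input(0, TensorType::F32, &[1, 3, 224, 224], &input)?;

    ctx.compute()?;

    // Read back the output tensor.
    let mut output = vec![0f32; 1000];
    ctx.get_output(0, &mut output)?;
    println!("first output value: {}", output[0]);
    Ok(())
}
```

The point for a Burn backend would be that the guest only describes graphs and tensors; accelerator selection and kernel execution happen on the host side of the interface.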

@nathanielsimard
Member

We're unlikely to create another backend using an external execution engine, as most of our efforts are focused on developing our own optimizations, kernels, and compiler tools. If there is an API to support GPU execution (SPIR-V, CUDA, Metal, etc.), then we could target those in CubeCL.

@antimora
Collaborator

It would be worth coming back to this when wasi-nn matures and becomes generally available on major cloud services. I do see benefits in deploying models easily to edge compute. However, it's possible that the cloud services will just expose GPU execution APIs (SPIR-V, CUDA, Metal, etc.) directly instead of adding a new API layer such as wasi-nn, in which case CubeCL will be more useful.
